Tuesday, 9 February 2016

How Much Of Tor Is Used For Illegal Purposes?

A paper just published by researchers at King's College London attempts to quantify how much of Tor is used for illegal purposes.  Or rather, what proportion of Tor's hidden services are used for illicit purposes.

The scans returned a total of 5,205 live websites within the hidden services network, out of which 2,723 were conducting one of the following activities:

Results from scans conducted by King's College researchers

These results were reported with what was termed a "high degree of confidence".  Sites were categorised as "None" where there was no content (and so counted as neither illegal nor legal in nature), and as "Unknown" where it was not possible to determine the nature of the content.

The definitions used by these researchers for illegal activities are:

My first thought about this data is that it appears to have found only 5,205 .Onion sites, whereas the Tor Project reports many more unique .Onion addresses:

Numbers of unique .Onion sites recorded by the Tor Project

Interestingly, the Tor Project reports approximately 35,000 unique .Onion addresses, but the researchers report having detected approximately 300,000 addresses, representing 205,000 unique pages (individual sites obviously comprise many pages).  This suggests that the researchers' analysis does cover a substantial proportion of the hidden services in operation.

Hence the conclusion, which seems to be statistically valid, is that at least 57% of "active" sites were involved in illegal activities.

Whilst many headlines have reported this as meaning that the majority of Tor is used for illegal purposes, it needs to be remembered that it also means that 43% of these sites are being used for legal purposes.  57% is a relatively small majority.

The Tor Project themselves wrote a report last year attempting to extrapolate the level of Tor usage that was dedicated to visiting hidden services.  It is this method that produced their figures shown above.  They note that at least 1% of Tor nodes need to be reporting such statistics for the extrapolation to be reliable.  The proportion of nodes doing so has been steadily increasing, so the figures reported by the Tor Project appear to be valid:

Fraction of Tor nodes reporting hidden services statistics

Using the data provided by the Tor Project you can compare the total bandwidth within the Tor network with how much bandwidth is being used for hidden services:

Bandwidth used by hidden services within Tor network (Mbits/s)

From this you can see that, out of a total bandwidth in excess of 150,000 Mbits/s, approximately 1,300 Mbits/s is used by hidden services, i.e. less than 1%.
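As a quick check on that arithmetic, here is a minimal sketch in Python.  The function name is my own, purely for illustration, and the two inputs are the approximate values read off the Tor Project graphs rather than exact measurements:

```python
def hidden_service_share(hidden_mbits: float, total_mbits: float) -> float:
    """Return hidden-service bandwidth as a percentage of total bandwidth."""
    if total_mbits <= 0:
        raise ValueError("total bandwidth must be positive")
    return 100.0 * hidden_mbits / total_mbits


# Approximate figures read from the graphs (assumed, not exact values).
share = hidden_service_share(1300, 150_000)
print(f"{share:.2f}%")  # prints 0.87%
```

Because the total is quoted as "in excess of" 150,000 Mbits/s, the true share would if anything be slightly lower than this.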

It might be somewhat unsound to extrapolate the numbers of each type of user from bandwidth usage, but I think it indicates that even if the "majority" of hidden services are used for illegal purposes, Tor users in general are not using the network to access these services.

Of course, as hinted at in the first paragraph above, not all illegal activity on Tor will be limited to hidden services, so there is perhaps further work to be done to analyse the destinations in general of those who use Tor to access the web.