If you use an anonymity network such as Tor on a regular basis, you are probably familiar with various annoyances in your web browsing experience, ranging from pages saying “Access denied” to having to solve CAPTCHAs before continuing. Interestingly, these hurdles disappear if the same website is accessed without Tor. The growing trend of websites extending this kind of “differential treatment” to anonymous users undermines Tor’s overall utility, and adds a new dimension to the traditional threats to Tor (attacks on user privacy, or governments blocking access to Tor). There is plenty of anecdotal evidence about Tor users experiencing difficulties in browsing the web, for example the user-reported catalog of services blocking Tor. However, we don’t have sufficient detail about the problem to answer deeper questions. How prevalent is differential treatment of Tor on the web? Are there any centralized players with Tor-unfriendly policies that have a magnified effect on the browsing experience of Tor users? Can we identify patterns in where these Tor-unfriendly websites are hosted (or located)?
Today we present our paper on this topic: “Do You See What I See? Differential Treatment of Anonymous Users” at the Network and Distributed System Security Symposium (NDSS). Together with researchers from the University of Cambridge, University College London, the University of California, Berkeley and the International Computer Science Institute (Berkeley), we conducted comprehensive network measurements to shed light on websites that block Tor. At the network layer, we scanned the entire IPv4 address space on port 80 from Tor exit nodes. At the application layer, we fetched the homepage of each of the 1,000 most popular websites (according to Alexa) from all Tor exit nodes. We compared these measurements with a baseline of non-Tor control measurements, and uncovered significant evidence of Tor blocking. We estimate that at least 1.3 million IP addresses that would otherwise allow a TCP handshake on port 80 block the handshake if it originates from a Tor exit node. We also show that at least 3.67% of the 1,000 most popular websites block Tor users at the application layer.
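To make the comparison against the control baseline concrete, here is a minimal sketch in Python. It is not the measurement code from the paper: the function name, the input layout, and the example hostnames are all assumptions made for illustration. It only counts a failed fetch through an exit node as a sign of blocking when the same site loaded successfully from the non-Tor control, so that sites which are simply down are not counted as Tor blockers.

```python
def blocked_fraction_per_site(tor_ok, control_ok):
    """Return {site: fraction of exit nodes whose fetch failed}.

    Hypothetical input layout (an assumption, not the paper's format):
      tor_ok:     {site: {exit_node_id: True if the fetch succeeded}}
      control_ok: {site: True if the non-Tor control fetch succeeded}
    Sites that failed from the control are skipped, because a failure
    there cannot be attributed to Tor.
    """
    fractions = {}
    for site, per_exit in tor_ok.items():
        if not control_ok.get(site, False) or not per_exit:
            continue
        failures = sum(1 for ok in per_exit.values() if not ok)
        fractions[site] = failures / len(per_exit)
    return fractions


# Illustrative data only; these are not results from the study.
tor_ok = {
    "a.example": {"exit1": True, "exit2": True},
    "b.example": {"exit1": False, "exit2": False, "exit3": True, "exit4": False},
    "c.example": {"exit1": False},
}
control_ok = {"a.example": True, "b.example": True, "c.example": False}

fractions = blocked_fraction_per_site(tor_ok, control_ok)
for site, fraction in sorted(fractions.items()):
    print(site, fraction)
```

A real measurement would also have to decide what counts as a failed fetch (a block page and a CAPTCHA both return a response, for instance), which this sketch leaves out entirely.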
We find that the websites that block Tor mostly belong to Autonomous Systems (ASes) corresponding to mobile and access ISPs, and hosting services. Some of these ASes perform wholesale blocking of Tor, that is, all the IP addresses in the AS block Tor. We also wrote classifiers to map websites to their web hosting services. Our results bring out CloudFlare and Akamai as dominant Tor blockers, highlighting the amplified blocking effect such centralized web services may create when their Tor-unfriendly policy trickles down to thousands of their client websites. The figure below shows the top 20 websites from the Alexa top-1,000 list, ranked by how many Tor exit nodes they block. Each row in this figure represents a website, and each column represents a Tor exit node (of about 900 total). A blue bar means that the website in that row blocks the exit node in that column. Clearly, these websites (mostly hosted by Akamai and Amazon Web Services) block a large fraction of Tor exit nodes. We think that some of this blocking is caused by blacklists that include Tor exit nodes, while other instances likely arise when abuse originating from Tor exit nodes triggers automated blocking mechanisms on websites.
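The notion of wholesale blocking can be stated as a small sketch, again in Python and again with hypothetical names and inputs rather than anything taken from the paper. Assuming we already have a mapping from each measured IP address to its AS number, and a flag saying whether that address blocked Tor, an AS counts as a wholesale blocker when every measured address in it blocks Tor.

```python
from collections import defaultdict


def wholesale_blocking_ases(ip_to_as, blocks_tor):
    """Return the sorted AS numbers in which every measured IP blocks Tor.

    Hypothetical inputs (assumptions for illustration):
      ip_to_as:   {ip_address: AS number}
      blocks_tor: {ip_address: True if the address blocked Tor}
    An address missing from blocks_tor is treated as not blocking.
    """
    flags_by_as = defaultdict(list)
    for ip, asn in ip_to_as.items():
        flags_by_as[asn].append(blocks_tor.get(ip, False))
    return sorted(asn for asn, flags in flags_by_as.items() if all(flags))


# Illustrative data using documentation address ranges and private-use AS numbers.
ip_to_as = {
    "192.0.2.1": 64512,
    "192.0.2.2": 64512,
    "198.51.100.1": 64513,
    "198.51.100.2": 64513,
}
blocks_tor = {
    "192.0.2.1": True,
    "192.0.2.2": True,
    "198.51.100.1": True,
    "198.51.100.2": False,
}

wholesale = wholesale_blocking_ases(ip_to_as, blocks_tor)
print(wholesale)
```

Here AS 64512 is flagged because both of its addresses block Tor, while AS 64513 is not because one of its addresses still responds.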
Our work provides a first step towards addressing the problems faced by Tor users by characterizing websites that treat traffic from the Tor network differently from other sources. The next steps, as described by Tor developer Roger Dingledine, involve social activism: engaging with major players on the web such as CloudFlare to get their perspective on this problem and discuss possible solutions. There is not much we can do in the case of entities such as ISPs and countries that preemptively block all Tor exit nodes as a matter of policy, beyond some alleviation in the form of awareness campaigns to highlight the problem (such as Tor’s “Don’t Block Me” initiative). With abuse-based blocking, we need solutions that enable more precise filtering than blocking the IP addresses of Tor exit nodes, so that benign Tor users don’t have to suffer from the abusive actions of other Tor users sharing the same exit node.
In a broader context, our work calls attention to a new kind of blocking that is mandated by publishers. In the classical censorship scenario, blocking takes place near the user, for example an intermediate device dropping a user’s request for a blacklisted website. In publisher-side blocking, the user’s request arrives at the publisher, but the publisher (or something working on its behalf) refuses to respond based on some property of the user. Who else on the Internet besides Tor users is subject to publisher-side blocking?
“Do You See What I See? Differential Treatment of Anonymous Users” by Sheharbano Khattak, David Fifield, Sadia Afroz, Mobin Javed, Srikanth Sundaresan, Vern Paxson, Steven J. Murdoch, and Damon McCoy will be presented at the Network and Distributed System Security Symposium, San Diego, US, 21–24 February 2016.
This post also appears on the University of Cambridge Computer Laboratory Security Group blog, Light Blue Touchpaper.