ACE-CSR opening event 2015/16: talks on malware, location privacy and wiretap law

The opening event for the UCL Academic Centre of Excellence for Cyber Security Research in the 2015–2016 academic term featured three speakers: Earl Barr, whose work on approximating program equivalence has won several ACM distinguished paper awards; Mirco Musolesi from the Department of Geography, whose background includes a degree in computer science and an interest in analysing myriad types of data while protecting privacy; and Susan Landau, a professor at Worcester Polytechnic Institute, a visiting professor at UCL, and an expert on cyber security policy whose books include Privacy on the Line: The Politics of Wiretapping and Encryption (with Whitfield Diffie) and Surveillance or Security? The Risks Posed by New Wiretapping Technologies.

Detecting malware and IP theft through program similarity

Earl Barr is a member of the software systems engineering group and the Centre for Research on Evolution, Search, and Testing. His talk outlined his work using program similarity to determine whether two arbitrary programs have the same behaviour in two areas relevant to cyber security: malware and intellectual property theft in binaries (that is, code reused in violation of its licence).

Barr began by outlining his work on detecting malware, comparing the problem to that facing airport security personnel trying to find a terrorist among millions of passengers. The work begins with profiling: collect two zoos – one of known malware samples, the other of benign programs – and then ask whether the program under consideration is more likely to belong to the benign zoo or the malware zoo.

Rather than study the structure of the binary, Barr works by viewing the program as a string of 0s and 1s, which may not coincide with the program’s instructions, and using information theory to create a measure of dissimilarity, the normalised compression distance (NCD). The NCD approximates a distance based on Kolmogorov complexity – a mathematical measure of the length of the shortest description of an object – which, being uncomputable, is estimated in practice using a compression algorithm; compression also ignores the details of the instruction set architecture for which the binary is written.
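For intuition, here is a minimal sketch of how an NCD-style comparison could be computed with an off-the-shelf compressor. The use of zlib, the compression level, and the nearest-neighbour classification rule are assumptions for illustration, not the specifics of Barr's pipeline:

```python
import zlib

def ncd(x: bytes, y: bytes) -> float:
    """Normalised compression distance: near 0 means similar, near 1 dissimilar."""
    c = lambda data: len(zlib.compress(data, 9))
    cx, cy, cxy = c(x), c(y), c(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)

def classify(sample: bytes, malware_zoo: list, benign_zoo: list) -> str:
    """Assign the sample to whichever zoo contains its nearest neighbour."""
    d_mal = min(ncd(sample, m) for m in malware_zoo)
    d_ben = min(ncd(sample, b) for b in benign_zoo)
    return "malware" if d_mal < d_ben else "benign"
```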

Using these techniques to analyse a malware zoo collected from sources such as Virus Watch, Barr was able to achieve a 95.7% accuracy rate. He believes that although this technique isn’t suitable for contemporary desktop anti-virus software, it opens a new front in the malware detection arms race. Still, Barr is aware that malware writers will rapidly develop countermeasures and his group is already investigating counter-countermeasures.

Malware writers have three avenues for blocking detection: injecting new content that looks benign; encryption; and obfuscation. Adding new content threatens the malware’s viability: raising the NCD by 50% requires doubling the size of the malware. Encryption can be used against the malware writer: applying a language model across the program reveals a distinctive saw-toothed pattern of regions with low surprise and low entropy alternating with regions of high surprise and high entropy (that is, regions with ciphertext). Obfuscation is still under study: the group is using three obfuscation engines available for Java and applying them repeatedly to Java malware. Measuring the NCD after each application shows that after 100 iterations the NCD approaches 1 (that is, the two items being compared are dissimilar), but that two of the three engines make errors after 200 applications. Unfortunately for malware writers, this technique also causes the program to grow in size. The cost of obfuscation to malware writers may therefore be greater than that imposed upon white hats.
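The encryption "saw-tooth" can be pictured with a simple sliding-window entropy scan over the binary; the window and step sizes below, and the use of plain byte entropy rather than a trained language model, are simplifying assumptions:

```python
import math
from collections import Counter

def window_entropy(data: bytes, window: int = 256, step: int = 64):
    """Yield (offset, Shannon entropy in bits per byte) over sliding windows.
    Encrypted or compressed regions show up as sustained high-entropy runs,
    alternating with lower-entropy unencrypted code and data."""
    for start in range(0, max(1, len(data) - window + 1), step):
        chunk = data[start:start + window]
        counts = Counter(chunk)
        n = len(chunk)
        yield start, -sum((c / n) * math.log2(c / n) for c in counts.values())
```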

Continue reading ACE-CSR opening event 2015/16: talks on malware, location privacy and wiretap law

How Tor’s privacy was (momentarily) broken, and the questions it raises

Just how secure is Tor, one of the most widely used internet privacy tools? Court documents released from the Silk Road 2.0 trial suggest that a “university-based research institute” provided information that broke Tor’s privacy protections, helping identify the operator of the illicit online marketplace.

Silk Road and its successor Silk Road 2.0 were run as a Tor hidden service, an anonymised website accessible only over the Tor network, which protects the identity of those running the site and those using it. The same technology is used to protect the privacy of visitors to other websites, including those of journalists reporting on mafia activity, search engines and social networks, so the security of Tor is of critical importance to many.

How Tor’s privacy shield works

Almost 97% of Tor traffic is from those using Tor to anonymise their use of standard websites outside the network. To do so a path is created through the Tor network via three computers (nodes) selected at random: a first node entering the network, a middle node (or nodes), and a final node from which the communication exits the Tor network and passes to the destination website. The first node knows the user’s address, the last node knows the site being accessed, but no node knows both.

The remaining 3% of Tor traffic is to hidden services. These websites use “.onion” addresses stored in a hidden service directory. The user first requests information on how to contact the hidden service website, then both the user and the website make the three-hop path through the Tor network to a rendezvous point which joins the two connections and allows both parties to communicate.

In both cases, if a malicious operator simultaneously controls both the first and last nodes on a user’s path through the Tor network, then it is possible to link the incoming and outgoing traffic and potentially identify the user. To prevent this, the Tor network is designed from the outset to have sufficient diversity in terms of who runs nodes and where they are located – and the way that nodes are selected avoids choosing closely related nodes, so as to reduce the likelihood of a user’s privacy being compromised.
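A rough sketch of that "avoid closely related nodes" rule is below. The real Tor path-selection algorithm additionally weights relays by bandwidth and uses long-lived guard nodes, so the relay records and constraints here are simplified assumptions:

```python
import random

def same_16_subnet(a: str, b: str) -> bool:
    """Crude check: do two IPv4 addresses share the same /16 network?"""
    return a.split(".")[:2] == b.split(".")[:2]

def pick_path(relays, length=3, max_tries=1000):
    """Pick entry, middle and exit relays at random, rejecting candidate paths
    in which any two relays share a /16 subnet or a declared node family.
    Each relay is assumed to look like {"ip": "1.2.3.4", "family": {"x"}}."""
    for _ in range(max_tries):
        path = random.sample(relays, length)
        diverse = all(
            not same_16_subnet(a["ip"], b["ip"]) and not (a["family"] & b["family"])
            for i, a in enumerate(path) for b in path[i + 1:]
        )
        if diverse:
            return path
    raise RuntimeError("no sufficiently diverse path found")
```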

How Tor works (source: EFF)

This type of design is known as distributed trust: compromising any single computer should not be enough to break the security the system offers (although compromising a large proportion of the network is still a problem). Distributed trust systems protect not only the users, but also the operators; because the operators cannot break the users’ anonymity – they do not have the “keys” themselves – they are less likely to be targeted by attackers.

Unpeeling the onion skin

With about 2m daily users Tor is by far the most widely used privacy system and is considered one of the most secure, so research that demonstrates the existence of a vulnerability is important. Most research examines how to increase the likelihood of an attacker controlling both the first and last node in a connection, or how to link incoming traffic to outgoing.

When the 2014 programme for the annual BlackHat conference was announced, it included a talk by a team of researchers from CERT, a Carnegie Mellon University research institute, claiming to have found a means to compromise Tor. But the talk was cancelled and, unusually, the researchers did not give advance notice of the vulnerability to the Tor Project in order for them to examine and fix it where necessary.

This decision was particularly strange given that CERT is the worldwide coordinator for ensuring that software vendors are notified of vulnerabilities in their products so they can fix them before criminals can exploit them. However, the CERT researchers gave enough hints that Tor developers were able to investigate what had happened. When they examined the network they found someone was indeed attacking Tor users using a technique that matched CERT’s description.

The multiple node attack

The attack turned on a means to tamper with a user’s traffic as they looked up the .onion address in the hidden service directory, or in the hidden service’s traffic as it uploaded the information to the directory.

Continue reading How Tor’s privacy was (momentarily) broken, and the questions it raises

New EU Innovative Training Network project “Privacy & Us”

Last week, “Privacy & Us” — an Innovative Training Network (ITN) project funded by the EU’s Marie Skłodowska-Curie actions — held its kick-off meeting in Munich. Hosted by Uniscon, one of the project partners, at the modern Wissenschaftszentrum campus, principal investigators from seven different countries set out the plan for the next 48 months.

Privacy & Us really stands for “Privacy and Usability” and aims to conduct privacy research and, over the next 3 years, train thirteen Early Stage Researchers (ESRs) — i.e., PhD students — to reason about, design, and develop innovative solutions to privacy research challenges, not only from a technical point of view but also from the “human side”.

The project involves nine “beneficiaries”: Karlstads Universitet (Sweden), Goethe Universität Frankfurt (Germany), Tel Aviv University (Israel), Unabhängiges Landeszentrum für Datenschutz (Germany), Uniscon (Germany), University College London (UK), USECON (Austria), VASCO Innovation Center (UK), and Wirtschaftsuniversität Wien (Austria), as well as seven partner organizations: the Austrian Data Protection Authority (Austria), Preslmayr Rechtsanwälte OG (Austria), Friedrich-Alexander University Erlangen (Germany), University of Bonn (Germany), the Bavarian Data Protection Authority (Germany), EveryWare Technologies (Italy), and Sentor MSS AB (Sweden).

The people behind the Privacy & Us project at the kick-off meeting in Munich, December 2015

The Innovative Training Networks are interdisciplinary and multidisciplinary in nature and promote, by design, a collaborative approach to research training. Funding is extremely competitive, with an acceptance rate as low as 6%, and quite generous for the ESRs, who often enjoy higher-than-usual salaries (exact numbers depend on the hosting country), plus a 600 EUR/month mobility allowance and a 500 EUR/month family allowance.

The students will start in August 2016 and will be trained to face both current and future challenges in the area of privacy and usability, spending a minimum of six months in secondment to another partner organization, and participating in several training and development activities.

Three studentships will be hosted at UCL, under the supervision of Dr Emiliano De Cristofaro, Prof. Angela Sasse, Prof. Ann Blandford, and Dr Steven Murdoch. Specifically, one project will investigate how to securely and efficiently store genomic data, design and implement privacy-preserving genomic testing, and support user-centred design of secure personal genomic applications. The second project will aim to better understand and support individuals’ decision-making around healthcare data disclosure, weighing up the personal and societal costs and benefits of disclosure, and the third (with the VASCO Innovation Centre) will explore techniques for privacy-preserving authentication, extending them to develop and evaluate innovative solutions for secure and usable authentication that respect user privacy.

Continue reading New EU Innovative Training Network project “Privacy & Us”

Scaling Tor hidden services

Tor hidden services offer several security advantages over normal websites:

  • both the client requesting the webpage and the server returning it can be anonymous;
  • websites’ domain names (.onion addresses) are linked to their public key so are hard to impersonate; and
  • there is mandatory encryption from the client to the server.

However, Tor hidden services as originally implemented did not take full advantage of parallel processing, whether from a single multi-core computer or from load-balancing over multiple computers. Therefore, once a single hidden service has hit the limit of vertical scaling (getting faster CPUs), there is no option of horizontal scaling (adding more CPUs and more computers). There are also bottlenecks in the Tor network, such as the 3–10 introduction points that help to negotiate the connection between the hidden service and the rendezvous point that actually carries the traffic.

For my MSc Information Security project at UCL, supervised by Steven Murdoch with the assistance of Alec Muffett and other Security Infrastructure engineers at Facebook in London, I explored possible techniques for improving the horizontal scalability of Tor hidden services. More precisely, I was looking at load-balancing techniques that could offer better performance and resiliency against hardware/network failures. The research focused on popular non-anonymous hidden services, where the anonymity of the service provider was not required; an example of this is Facebook’s .onion address.

One approach I explored was to simply run multiple hidden service instances using the same private key (and hence the same .onion address). Each hidden service periodically uploads its own descriptor, which describes the available introduction points, to six hidden service directories on a distributed hash table. The hidden service instance chosen by the client depends on which hidden service instance most recently uploaded its descriptor. In theory this approach allows an arbitrary number of hidden service instances, where each periodically uploads its own descriptors, overwriting those of others.
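A toy sketch of that "last writer wins" behaviour follows; the number of directories, the upload timing and the client model are all simplifications (in reality the descriptors live in a distributed hash table and clients cache what they fetch):

```python
import random
from collections import Counter

# The six responsible hidden service directories, keyed by index.
directories = {i: None for i in range(6)}

def upload_descriptor(instance: str) -> None:
    """An instance uploads its descriptor to all six directories,
    silently overwriting whichever instance uploaded before it."""
    for i in directories:
        directories[i] = instance

def client_connect() -> str:
    """A client asks one directory and connects to whichever instance's
    descriptor that directory currently holds."""
    return directories[random.randrange(6)]

# Three instances re-upload in turn; clients arrive between uploads.
load = Counter()
for _ in range(30):
    upload_descriptor(random.choice(["instance-A", "instance-B", "instance-C"]))
    for _ in range(100):
        load[client_connect()] += 1

print(load)  # assignment is an artefact of upload timing, not deliberate balancing
```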

This approach can work for popular hidden services because, with the large number of clients, some will be using the descriptor most recently uploaded, while others will have cached older versions and continue to use them. However my experiments showed that the distribution of the clients over the hidden service instances set up in this way is highly non-uniform.

I therefore ran experiments on a private Tor network using the Shadow network simulator running multiple hidden service instances, and measuring the load distribution over time. The experiments were devised such that the instances uploaded their descriptors simultaneously, which resulted in different hidden service directories receiving different descriptors. As a result, clients connecting to a hidden service would be balanced more uniformly over the available instances.

Continue reading Scaling Tor hidden services

George Danezis – Smart grid privacy, peer-to-peer and social network security

“I work on technical aspects of privacy,” says George Danezis, a reader in security and privacy engineering at UCL and part of the Academic Centre of Excellence in Cyber Security Research (ACE-CSR). There are, of course, many other limitations: regulatory, policy, economic. But, he says, “Technology is the enabler for everything else – though you need everything else for it to be useful.” Danezis believes providing privacy at the technology level is particularly important as it seems clear that both regulation and the “moralising” approach (telling people the things they shouldn’t do) have failed.

https://www.youtube.com/watch?v=wAbKB0kaH6c

There are many reasons why someone gets interested in researching technical solutions to intractable problems. Sometimes the motivation is to eliminate a personal frustration; other times it’s simply a fascination with the technology itself. For Danezis, it began with other people.

“I discovered that a lot of the people around me could not use technology out of the box to do things personally or collectively.” For example, he saw NGOs defending human rights worry about sending an email or chatting online, particularly in countries hostile to their work. A second motivation had to do with timing: when he began work it wasn’t yet clear that the Internet would develop into a medium anyone could use freely to publish stories. That particular fear has abated, but other issues such as the need for anonymous communications and private data sharing are still with us.

“Without anonymity we can’t offer strong privacy,” he says.

Unlike many researchers, Danezis did not really grow up with computers. He spent his childhood in Greece and Belgium, and until he got Internet access at 16, “I had access only to the programming books I could find in an average Belgian bookshop. There wasn’t a BBC Micro in every school and it was difficult to find information. I had one teacher who taught me how to program in Logo, and no way of finding more information easily.” Then he arrived at Cambridge in 1997, and “discovered thousands of people who knew how to do crazy stuff with computers.”

Danezis’ key research question is, “What functionality can we achieve while still attaining a degree of hard privacy?” And the corollary: at what cost in complexity of engineering? “We can’t just say, let’s recreate the whole computer environment,” he said. “We need to evolve efficiently out of today’s situation.”

Continue reading George Danezis – Smart grid privacy, peer-to-peer and social network security

Experimenting with SSL Vulnerabilities in Android Apps

As the number of always-on, always-connected smartphones increases, so does the amount of personal and sensitive information they collect and transmit. Thus, it is crucial to secure the traffic exchanged by these devices, especially considering that mobile users might connect to open Wi-Fi networks or even fake cell towers. The go-to protocol for securing network connections is HTTPS, i.e., HTTP over SSL/TLS.

In the Android ecosystem, applications (apps for short) support HTTPS on sockets by relying on the android.net, android.webkit, java.net, javax.net, java.security, javax.security.cert, and org.apache.http packages of the Android SDK. These packages are used to create HTTP/HTTPS connections, administer and verify certificates and keys, and instantiate TrustManager and HostnameVerifier interfaces, which are in turn used in the SSL certificate validation logic.

A TrustManager manages the certificates of all Certificate Authorities (CAs) used to assess a certificate’s validity. Only root CAs trusted by Android are contained in the default TrustManager. A HostnameVerifier performs hostname verification whenever a URL’s hostname does not match the hostname in the peer’s identification credentials.
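Apps typically get this wrong by supplying a TrustManager that trusts every certificate or a HostnameVerifier that accepts every hostname. The contrast between validating and not validating can be sketched, purely as an analogy and not as Android code, with Python's standard ssl module:

```python
import socket
import ssl

host = "example.com"  # placeholder destination

# Secure default: verify the certificate chain against trusted CAs and
# check that the certificate actually matches the hostname.
secure_ctx = ssl.create_default_context()

# Insecure: the rough equivalent of an accept-all TrustManager plus a
# HostnameVerifier that always returns true.
insecure_ctx = ssl.create_default_context()
insecure_ctx.check_hostname = False
insecure_ctx.verify_mode = ssl.CERT_NONE  # no certificate validation at all

with socket.create_connection((host, 443)) as sock:
    with secure_ctx.wrap_socket(sock, server_hostname=host) as tls:
        print(tls.version(), tls.getpeercert()["subject"])
```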

While browsers provide users with visual feedback that their communication is secured (via the lock symbol) and warn them about certificate validation issues, non-browser apps do so far less extensively and effectively. This shortcoming motivates the need to scrutinize the security of network connections used by apps to transmit users’ sensitive data. We found that some of the most popular Android apps insufficiently secure these connections, putting users’ passwords, credit card details and chat messages at risk.

Continue reading Experimenting with SSL Vulnerabilities in Android Apps

Teaching Privacy Enhancing Technologies at UCL

Last term I had the opportunity and pleasure to prepare and teach the first course on Privacy Enhancing Technologies (PETs) at University College London, as part of the MSc in Information Security.

The course covers principally, and in some detail, engineering aspects of PETs and caters for an audience of CS/engineering students who already understand the basics of information security and cryptography (although these are not hard prerequisites). Students were also given a working understanding of the legal and compliance aspects of data protection regimes by guest lecturer Prof. Eleni Kosta (Tilburg), as well as a world-class introduction to the human aspects of computing and privacy by Prof. Angela Sasse (UCL). This security & cryptographic engineering focus sets this course apart from related courses.

The taught part of the course runs for 20 hours over 10 weeks, split into 10 topics:

Continue reading Teaching Privacy Enhancing Technologies at UCL

Measuring Internet Censorship

Norwegian writer Mette Newth once wrote that “censorship has followed the free expressions of men and women like a shadow throughout history.” Indeed, as we develop innovative and more effective tools to gather and create information, new means to control, erase and censor that information evolve alongside them. But how do we study Internet censorship?

Organisations such as Reporters Without Borders, Freedom House, or the Open Net Initiative periodically report on the extent of censorship worldwide. But as countries that are fond of censorship are not particularly keen to share details, we must resort to probing filtered networks, i.e., generating requests from within them to see what gets blocked and what gets through. We cannot hope to record all the possible censorship-triggering events, so our understanding of what is or isn’t acceptable to the censor will only ever be partial. And of course it’s risky, or even outright illegal, to probe the censor’s limits within countries with strict censorship and surveillance programs.
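At its simplest, a probe is just a request issued from inside the filtered network whose outcome is compared with what an unfiltered vantage point sees. A minimal sketch is below; the URL list, timeout and block heuristics are assumptions, and real measurement platforms are considerably more careful about safety and accuracy:

```python
import requests  # third-party HTTP library

TEST_URLS = ["http://example.com", "http://news.example.org"]  # hypothetical list

def probe(url: str, timeout: float = 10.0) -> str:
    """Crude classification of what happened to a single request."""
    try:
        r = requests.get(url, timeout=timeout, allow_redirects=True)
    except requests.exceptions.ConnectionError:
        return "connection reset/refused (possible blocking)"
    except requests.exceptions.Timeout:
        return "timeout (possible blocking)"
    if r.status_code in (403, 451):
        return f"HTTP {r.status_code} (possible block page)"
    return "reachable"

for url in TEST_URLS:
    print(url, "->", probe(url))
```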

This is why the leak of 600GB of logs from hardware appliances used to filter internet traffic in and out of Syria was a unique opportunity to examine the workings of a real-world internet censorship apparatus.

Leaked by the hacktivist group Telecomix, the logs cover a period of nine days in 2011, drawn from seven Blue Coat SG-9000 internet proxies. The sale of equipment like this to countries such as Syria is banned by the US and EU. California-based manufacturer Blue Coat Systems denied making the sales but confirmed the authenticity of the logs – and Dubai-based firm Computerlinks FZCO later agreed to a US$2.8m fine for unlawful export. In 2013, researchers at the University of Toronto’s Citizen Lab demonstrated how authoritarian regimes in Saudi Arabia, UAE, Qatar, Yemen, Egypt and Kuwait all rely on US-made equipment like that from Blue Coat, or McAfee’s SmartFilter software, to perform filtering.

Continue reading Measuring Internet Censorship

One-out-of-Many Proofs: Or How to Leak a Secret and Spend a Coin

I’m going to EUROCRYPT 2015 to present a new zero-knowledge proof that I’ve developed together with Markulf Kohlweiss from Microsoft Research. Zero-knowledge proofs enable you to demonstrate that a particular statement is true without revealing anything other than the fact that it is true. In our case the statements are one-out-of-many statements – intuitively, that out of a number of items one of them has a special property – and we greatly reduce the size of the proofs compared to previous works in the area. Two applications where one-out-of-many proofs come in handy are ring signatures and Zerocoin.

Ring signatures can be used to sign a message anonymously as a member of a group of people, i.e., all a ring signature says is that somebody from the group signed the message, but not who it was. Consider for instance a whistleblower who wants to leak that her company is dumping dangerous chemicals in the ocean, yet wants to remain anonymous due to the risk of being fired. By using a ring signature she can demonstrate that she works for the company, which makes the claim more convincing, without revealing which employee she is. Our one-out-of-many proofs can be used to construct very efficient ring signatures by giving a one-out-of-many proof that the signer holds a secret key corresponding to a public key for one of the people in the ring.

Zerocoin is a new virtual currency proposal where coins gain value once they’ve been accepted on a public bulletin board. Each coin contains a commitment to a secret random serial number that only the owner knows. To anonymously spend a coin the owner publishes the serial number and gives a one-out-of-many proof that the serial number corresponds to one of the public coins. The serial number prevents double spending of a coin; nobody will accept a transaction with a previously used serial number. The zero-knowledge property of the one-out-of-many proof provides anonymity; it is not disclosed which coin the serial number corresponds to. Zerocoin has been suggested as a privacy enhancing add-on to Bitcoin.
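The double-spend check itself is simple bookkeeping; the hard part, and the contribution of the one-out-of-many proof, is showing membership without revealing which coin is being spent. Below is a toy sketch of the bulletin-board side, with a hash-based stand-in for the commitment scheme and a membership check that deliberately reveals the index that the real zero-knowledge proof would hide:

```python
import hashlib, os

coins = []             # public bulletin board of coin commitments
spent_serials = set()  # serial numbers that have already been redeemed

def mint(serial: bytes) -> bytes:
    """Commit to a secret serial number and post the commitment publicly."""
    r = os.urandom(32)                                 # commitment randomness
    coins.append(hashlib.sha256(serial + r).digest())  # toy commitment, not Pedersen
    return r

def verify_membership(serial: bytes, index: int, r: bytes) -> bool:
    """Toy check that `serial` opens coin number `index`. The one-out-of-many
    proof convinces a verifier of this WITHOUT revealing index or r; this
    stand-in reveals both, so it provides none of the anonymity."""
    return hashlib.sha256(serial + r).digest() == coins[index]

def spend(serial: bytes, index: int, r: bytes) -> bool:
    """Accept a spend if the serial is fresh and opens some coin on the board."""
    if serial in spent_serials:
        return False  # double spend: this serial number has been used before
    if not verify_membership(serial, index, r):
        return False
    spent_serials.add(serial)
    return True
```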

The full research paper is available on the Cryptology ePrint Archive.

A Digital Magna Carta?

I attended two privacy events over the past couple of weeks. The first was at the Royal Society, chaired by Prof Jon Crowcroft.

All panelists talked about why privacy is necessary in a free, democratic society, but also noted that individuals are ill-equipped to achieve it given the increasing number of technologies collecting data about us, and the commercial and government interests in using that data.

During the question & answer session, one audience member asked if we needed a Digital Charter to protect rights to privacy. I agreed, but pointed out that citizens and consumers would need to express this desire more clearly, and be prepared to take collective action to stop the gradual encroachment.

The second panel – In the Digital Era – Do We Still Have Privacy? – was organised in London by Lancaster University this week as part of its 50th Anniversary celebrations and was chaired by Sir Edmund Burton.

One of the panelists – Dr Mike Short from Telefonica O2 – stated that it does not make commercial sense for a company to use data in a way that goes against its customers’ privacy preferences.

But there are service providers that force users to allow data collection – you cannot have the service unless you agree to your data being collected (which goes against the OECD principles for informed consent) – or that make the terms & conditions so long that users don’t want to read them; and even if they were prepared to read them, they would not understand them without a legal interpreter.

We have found in our research at UCL (e.g. Would You Sell Your Mother’s Data, Fairly Truthful) that consumers have a keen sense of ‘fairness’ about how their data is used – and they definitely do not think it ‘fair’ for their data to be used against their express preferences and life choices.

In the Q & A after the panel the question of what can be done to ensure fair treatment for consumers, and the idea of a Digital Charter, was raised again. The evening’s venue was a CD’s throw away from the British Library, where the Magna Carta is exhibited to celebrate its 800th anniversary. The panelists reminded us that last year, Sir Tim Berners-Lee called for a ‘Digital Magna Carta’ – I think this is the perfect time for citizens and consumers to back him up, and unite behind his idea.