
Cookieless Browser Tracking

We all know about tracking cookies and the privacy problems they raise. However, according to the EFF, cookies aren't necessary to do a fair job of tracking your browser activity. Their research suggests that browsers give away 10.5 bits of identifying information in the userAgent string, which is sent to the web server with every request. This is around a third of the information required to uniquely identify you (roughly 33 bits are enough to single out one person among the world's population).
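To make the "around a third" figure concrete, here is a rough sketch of the arithmetic. The 10.5 bits is the EFF figure quoted above; the world population value is my own round-number assumption, and the function names are just for illustration.

```python
import math

def surprisal_bits(fraction_sharing: float) -> float:
    """Bits of identifying information revealed by a value that
    this fraction of all browsers share."""
    return -math.log2(fraction_sharing)

def bits_to_identify(population: int) -> float:
    """Bits needed to single out one member of a population."""
    return math.log2(population)

# Assumption: a round figure for world population, not an EFF number.
WORLD_POPULATION = 7_000_000_000
UA_BITS = 10.5  # EFF figure quoted in the post

needed = bits_to_identify(WORLD_POPULATION)
print(f"Bits needed to identify one person: {needed:.1f}")     # 32.7
print(f"Share supplied by the userAgent:    {UA_BITS / needed:.0%}")  # 32%
# 10.5 bits means your string is shared by about 1 in 2**10.5 browsers.
print(f"Roughly 1 browser in {2 ** UA_BITS:,.0f} shares your string")  # 1,448
```

In other words, 10.5 bits corresponds to a userAgent string that only about one browser in 1,450 shares.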

They have set up a website to gather more data and give you a 'uniqueness' indicator for your browser, which you can find here. The data set is growing quite rapidly, and the site tells you how many of the userAgent strings they have received match yours. I managed to find a machine whose string was unique amongst the 195,000 machines they have tested. This means that someone could potentially track that machine even with cookies disabled. Even if your userAgent string is the same as other people's, you can still be narrowed down using the geolocation of your IP address, browser plugins, installed fonts, screen resolution, etc. This isn't a new idea and others have tried it, like browserrecon. Of course, if you have a static IP address then you are fairly easy to track anyway.
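As a rough illustration of how those extra attributes can be combined into a single identifier without any cookie, here is a minimal sketch. It is not the EFF's or browserrecon's method; the attribute names and values are hypothetical, and a real tracker would collect them from request headers and scripts.

```python
import hashlib
import math

def fingerprint(attributes: dict[str, str]) -> str:
    """Hash a set of browser attributes into a short, stable identifier.
    Keys are sorted so the result does not depend on collection order."""
    canonical = "\n".join(f"{key}={attributes[key]}" for key in sorted(attributes))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]

def min_bits_if_unique(sample_size: int) -> float:
    """A browser that is unique in a sample of this size is giving away
    at least log2(sample_size) bits."""
    return math.log2(sample_size)

# Hypothetical values, for illustration only.
visitor = {
    "user_agent": "Mozilla/5.0 (X11; U; Linux i686; en-GB; rv:1.9.1) Gecko Firefox/3.5",
    "screen": "1280x1024x24",
    "fonts": "Arial,DejaVu Sans,Verdana",
    "plugins": "Flash 10.0;Java 1.6",
}

print(fingerprint(visitor))
print(f"{min_bits_if_unique(195_000):.1f} bits")  # 17.6 bits
```

So the machine that was unique amongst 195,000 was giving away at least 17.6 bits, already more than half of what is needed.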

Various suggestions are made to help you protect yourself, such as not allowing scripts to run on untrusted websites, which is fairly obvious. However, although this may reduce the amount of data given out from highs of 15.5 bits on a Blackberry or 15.3 bits on Debian, it won't solve the whole problem. The worst devices for giving out identifying information seem to be Blackberry and Android phones, with minimum figures of over 12 bits. The best combination would seem to be Firefox running on Windows, which can be brought down to only 4.6 bits (although highs are around double this), but this could just be because it's the most common combination.

What can you do? Don't visit untrusted sites. You could also change your userAgent string. It is just a text string stating the capabilities of your machine so that the web server can customise content to suit you, so there is no real harm in tweaking it to fall in line with more common strings and make yourself harder to track. You have to be careful here, because simply removing most of the information will probably make your userAgent string unique. Alternatively, you could change the string regularly. Perhaps browsers should change the string with every connection? Plugins such as User Agent Switcher could do this, which would allow you to use different strings across different sites. Temporarily switching the userAgent string might also be a useful way of hiding certain activities.
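The idea of using different strings across different sites can be sketched as follows. This is only an illustration of the approach, not how User Agent Switcher works; the pool of strings and the function names are my own assumptions, and you would want to fill the pool with strings that are genuinely common.

```python
import hashlib
import urllib.parse
import urllib.request

# Assumption: illustrative strings only; use genuinely common ones in practice.
COMMON_USER_AGENTS = [
    "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.9.2) Gecko/20100115 Firefox/3.6",
    "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0)",
    "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0)",
]

def user_agent_for(host: str, pool=COMMON_USER_AGENTS) -> str:
    """Pick a string from the pool based on the host name, so one site
    always sees the same string while other sites may see another."""
    digest = hashlib.sha256(host.encode("utf-8")).digest()
    return pool[digest[0] % len(pool)]

def build_request(url: str) -> urllib.request.Request:
    """Build (but do not send) a request carrying the chosen userAgent."""
    host = urllib.parse.urlsplit(url).hostname or ""
    return urllib.request.Request(url, headers={"User-Agent": user_agent_for(host)})

request = build_request("http://example.com/page")
# urllib stores header names capitalised as "User-agent".
print(request.get_header("User-agent"))
```

Choosing by host keeps each site's view of you consistent; swapping the hash for `random.choice` would give the change-with-every-connection behaviour instead.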

Firefox and Opera are both quite easy to configure: type about:config or opera:config in the address bar respectively and navigate to the userAgent options (in Firefox, the general.useragent.override preference). Internet Explorer is slightly trickier, in that you have to make a registry change to alter the userAgent string. Navigate to [HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\Internet Settings\5.0\User Agent] in regedit. Here you can create string values for 'Compatible', 'Version' and 'Platform' to control what is sent. Under the 'Post Platform' key are a whole bunch of additional parameters that get appended to the string, so you can change or remove these.
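To show what those registry values end up doing, here is a simplified model of how the pieces are joined into the string Internet Explorer sends. It does not read the registry, the default values are just examples, and the function name is my own; treat it as a sketch of the string layout rather than a description of IE's internals.

```python
def build_ie_user_agent(
    compatible: str = "compatible",
    version: str = "MSIE 8.0",
    platform: str = "Windows NT 6.1",
    post_platform: tuple[str, ...] = (),
    application: str = "Mozilla/4.0",
) -> str:
    """Join the 'Compatible', 'Version' and 'Platform' values, followed by
    any 'Post Platform' entries, into a userAgent string."""
    tokens = [compatible, version, platform, *post_platform]
    return f"{application} ({'; '.join(tokens)})"

# With no Post Platform entries the string is short and common-looking.
print(build_ie_user_agent())
# Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1)

# Every extra Post Platform entry makes the string longer and rarer.
print(build_ie_user_agent(post_platform=("Trident/4.0", ".NET CLR 2.0.50727")))
# Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; Trident/4.0; .NET CLR 2.0.50727)
```

This is why trimming the 'Post Platform' entries is worth doing: each installed add-on that registers itself there adds another distinguishing token.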

Comments

  1. Hello,

    I would like to refer to an old project of mine. browserrecon is an implementation which uses application fingerprint techniques to identify web clients:

    http://www.computec.ch/projekte/browserrecon/

    Bye, Marc

  2. Hi Marc,

    I did put a link in the main text to your project. Happy to have you add it again though.

    Luke

