Twitter Steganography

I have recently been thinking about Steganography again, along with various carriers and applications. For those of you that don't know what Steganography is, it simply means 'hidden writing', from the Greek. Some examples of steganography are: tattooing the scalps of messengers and then waiting for their hair to grow back; writing a message on the wood of a wax tablet before pouring the wax in; 'invisible inks'; pin pricks above characters in a cover letter; etc. Basically, we have a 'cover', which could be an image, passage of text, etc., that we are happy for anyone to see, and a message that we want to hide within it so that it is undetectable. It turns out that this last part is quite hard.

Anyway, I thought I'd look at techniques to embed data within Twitter as it is popular now and people are starting to monitor it. Hiding within a crowd, however, is a good technique as it takes quite a lot of resources to monitor all activity on a service like Twitter. The techniques described here would work equally well on other social networks, such as LinkedIn, Facebook, etc. How do we embed data within a medium that allows only 140 plaintext characters though? Well, there are several methods, a few of which I'll talk about here. I'm only going to discuss methods that would be quite simple to detect if you knew what you were looking at, but that will go undetected by the majority of people.

The first method is to use a special grammar within your Tweet. If the person you are communicating with knows the grammar then you can alter a message to pass data back and forth. A simple example of this technique would be to choose 2, 4 or 8 words that mean the same thing, with each one representing a value. For example, you could use fast, speedy, quick and rapid to represent 0, 1, 2 and 3 respectively, effectively giving you 2 bits of embedded data. If we had 8 words then we would have 3 bits, and so on. This can be extended to word order in the sentence and even the number of words per sentence. However, it can be difficult to construct messages this way that are still readable, and this is not a high data rate. We could probably get only one or two bytes' worth of data in an update message.
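To make the synonym idea concrete, here is a minimal sketch in Python. The codebook is just the four example words from above, and the function names (`embed`, `extract`) and the cover sentence are made up for illustration; a real grammar would need many more word groups to carry more than 2 bits.

```python
# Hypothetical codebook agreed in advance by both parties.
# The position of a word in the list is the 2-bit value it represents.
SYNONYMS = ["fast", "speedy", "quick", "rapid"]


def embed(template: str, value: int) -> str:
    """Fill the {} slot in the cover sentence with the synonym for value (0-3)."""
    return template.format(SYNONYMS[value])


def extract(tweet: str) -> int:
    """Return the value of the first codebook word found in the tweet."""
    for word in tweet.lower().split():
        word = word.strip(".,!?;:")  # ignore trailing punctuation
        if word in SYNONYMS:
            return SYNONYMS.index(word)
    raise ValueError("no codebook word found")


tweet = embed("That was a {} train journey.", 2)
print(tweet)           # That was a quick train journey.
print(extract(tweet))  # 2
```

The tweet reads normally whichever word is chosen, which is the whole point, but each word group only buys you a couple of bits.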

Another method is suggested by Adrian Crenshaw. He used Unicode characters, giving access to two versions of the character set. So the lower range of characters represented 0's and the upper range represented 1's. This is a good scheme, as you then transfer as many bits as there are characters in your message. This gives a maximum of 140 bits. The issue with his scheme is that on some devices and Twitter clients the two character sets look quite different and it is definitely detectable. However, a good idea nonetheless.
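A rough sketch of this kind of scheme is below. I am assuming here that the 'upper range' is the Unicode full-width forms block (U+FF01 to U+FF5E), which mirrors printable ASCII at a fixed offset; the original scheme may well have used different characters. The function names are hypothetical, and in this version spaces carry no data because they have no counterpart in that block.

```python
OFFSET = 0xFEE0  # distance from ASCII '!'..'~' to the full-width forms block


def is_carrier(ch: str) -> bool:
    """Printable, non-space ASCII characters can carry one bit each."""
    return "!" <= ch <= "~"


def embed(cover: str, bits: str) -> str:
    """Swap a character for its full-width twin wherever the next bit is 1."""
    if len(bits) > sum(is_carrier(ch) for ch in cover):
        raise ValueError("cover text too short for payload")
    remaining = iter(bits)
    out = []
    for ch in cover:
        if is_carrier(ch) and next(remaining, "0") == "1":
            out.append(chr(ord(ch) + OFFSET))
        else:
            out.append(ch)
    return "".join(out)


def extract(stego: str) -> str:
    """Read one bit per carrier character: ASCII is 0, full-width is 1."""
    bits = []
    for ch in stego:
        if is_carrier(ch):
            bits.append("0")
        elif 0xFF01 <= ord(ch) <= 0xFF5E:
            bits.append("1")
    return "".join(bits)


stego = embed("Hello world", "1011")
print(extract(stego)[:4])  # 1011
```

The receiver needs to know the payload length (or a terminator convention), since every unused carrier character reads back as a 0.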

Following on from this, we can encode bits within the message, so that they aren't seen by the user, by appending whitespace to the end of the message. Whitespace characters are things like a space or a tab, i.e. a place where a letter isn't. A simple method to embed your data is to represent a 0 by a space and a 1 by a tab. The good thing is that web browsers will display multiple whitespace characters as only a single space, so this will be invisible within a browser. Other clients will print them out, but there's nothing to see. Now, Twitter, and most social media clients, will strip whitespace from the end of your message as they assume that you added it by accident. This will destroy your data. However, if you add the &nbsp; HTML code (a non-breaking space) to the end of your message then it will keep all the whitespace (indeed, you could put any character at the end, but you may see multiple spaces in some clients). The advantage of using &nbsp; is that it is a whitespace character and won't be displayed in your message. Now, you will need to write a short message and add the non-breaking space at the end, so you won't have that much space, but if you keep your message short you should certainly be able to get over 100 bits in this way, and up to nearly 16 ASCII characters.
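The space/tab encoding can be sketched in a few lines of Python. This assumes the visible message does not itself end in a space or tab, and that the guard is typed as the literal six-character entity `&nbsp;`; the function names are mine, and I haven't tried to model what any particular client does to the text.

```python
GUARD = "&nbsp;"  # trailing non-breaking space entity that protects the whitespace


def embed(message: str, payload: bytes) -> str:
    """Append the payload as spaces (0) and tabs (1), followed by the guard."""
    bits = "".join(f"{byte:08b}" for byte in payload)
    trailer = bits.replace("0", " ").replace("1", "\t")
    return message + trailer + GUARD


def extract(tweet: str) -> bytes:
    """Recover the payload from the run of spaces and tabs before the guard."""
    if tweet.endswith(GUARD):
        tweet = tweet[: -len(GUARD)]
    visible = tweet.rstrip(" \t")
    trailer = tweet[len(visible):]
    bits = trailer.replace(" ", "0").replace("\t", "1")
    usable = len(bits) - len(bits) % 8  # ignore any incomplete final byte
    return bytes(int(bits[i:i + 8], 2) for i in range(0, usable, 8))


tweet = embed("Lunch time", b"Hi")
print(len(tweet))      # 32
print(extract(tweet))  # b'Hi'
```

With a 10-character message and the 6-character guard, 124 of the 140 characters are left for whitespace, which is 15 whole bytes.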

We can also be quite blatant with our data. We can rely on the fact that people won't know we're transferring data and won't look very hard. A simple URL shortening service can be exploited in two ways to embed data. The simplest method is to make up a URL. Twitter users rely on http://bit.ly and http://twitpic.com extensively. If we base-64 encode our text or data, then we can add 6 bytes of data (8 base-64 characters) to a URL. For example, I could tweet: "Just read this http://bit.ly/UkxSIFVL and saw the photo http://twitpic.com/IEx0ZC4=". Now, these URLs are fake and don't lead anywhere. However, the base-64 encoded text of the two URLs decodes to "RLR UK Ltd.", and how many people will follow your link anyway? Even if they do, the two sites here will just put up a helpful message that there was an error with the URL. You can now apologise and provide two real URLs. Meanwhile the message has got across. Obviously more URLs mean more data - up to 36 bytes if you just send 6 URLs.
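As a quick sketch of the fake-URL trick, the following Python splits the data into 6-byte chunks and turns each into the path of a made-up short URL. The function names are hypothetical, and I've used the URL-safe base-64 alphabet (my choice, so that a `/` never appears in the path); for the example text it gives exactly the same output as standard base-64.

```python
import base64

CHUNK = 6  # 6 bytes of data become 8 base-64 characters


def embed(data: bytes, hosts: list) -> list:
    """Turn each 6-byte chunk of data into a fake short URL on the next host."""
    if len(data) > CHUNK * len(hosts):
        raise ValueError("not enough URLs for this much data")
    urls = []
    for i, host in enumerate(hosts):
        chunk = data[i * CHUNK:(i + 1) * CHUNK]
        if not chunk:
            break
        code = base64.urlsafe_b64encode(chunk).decode("ascii")
        urls.append(f"http://{host}/{code}")
    return urls


def extract(urls: list) -> bytes:
    """Decode the last path segment of each URL and join the pieces."""
    return b"".join(base64.urlsafe_b64decode(u.rsplit("/", 1)[1]) for u in urls)


urls = embed(b"RLR UK Ltd.", ["bit.ly", "twitpic.com"])
print(urls)           # ['http://bit.ly/UkxSIFVL', 'http://twitpic.com/IEx0ZC4=']
print(extract(urls))  # b'RLR UK Ltd.'
```

Note that the trailing `=` padding on a short final chunk is a bit of a giveaway, since real short URLs don't usually end that way; padding the data to a multiple of 3 bytes would avoid it.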

The second method of using a URL shortening service is to write your own. Now you can provide real URLs but flag particular IP addresses or require the addition of an extra parameter to the URL to make it show a different page to the person you are trying to communicate with, e.g. a password. This isn't really Steganography as such, but could be used to transfer URLs that can be checked by someone else and don't reveal the true target.

The final method I'm going to discuss here is the use of a Stego Profile Image. All social media networks allow you to upload and display a small image on your page. Why not use traditional Steganographic techniques to embed data within this image? If you change your image regularly then it won't look suspicious when you change it to transfer data to someone. There are tools on the Internet to do this for you by replacing the Least Significant Bit (LSB) of every pixel with one bit of your data. This is a simple scheme and easy to detect. There are other, much better schemes that are not only harder to detect, but that will also give you more 'space' within the image to store your data. To give you some idea, a 4-colour, 73x73 pixel GIF like Twitter's default images can store nearly 4KB of data with no visual impact. However, that's for another blog post...
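For reference, the simple LSB replacement that those tools perform looks roughly like this in Python. This is only a sketch of the basic (easily detected) scheme, not of the better ones mentioned above: it works on a raw sequence of 8-bit pixel values, reading and writing the actual image file is left out, and the function names are made up.

```python
def embed_lsb(pixels: bytes, payload: bytes) -> bytes:
    """Overwrite the least significant bit of each pixel value with one payload bit."""
    bits = [(byte >> (7 - i)) & 1 for byte in payload for i in range(8)]
    if len(bits) > len(pixels):
        raise ValueError("payload too large for this cover image")
    out = bytearray(pixels)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & 0xFE) | bit  # clear the LSB, then set it to the data bit
    return bytes(out)


def extract_lsb(pixels: bytes, n_bytes: int) -> bytes:
    """Rebuild n_bytes of payload from the LSBs of the first n_bytes * 8 pixels."""
    out = bytearray()
    for b in range(n_bytes):
        value = 0
        for i in range(8):
            value = (value << 1) | (pixels[b * 8 + i] & 1)
        out.append(value)
    return bytes(out)


cover = bytes(range(256))          # stand-in for real pixel data
stego = embed_lsb(cover, b"hi")
print(extract_lsb(stego, 2))       # b'hi'
```

Each pixel value changes by at most 1, which is why the result looks identical to the eye. At one bit per pixel a 73x73 image has room for 5329 bits, i.e. 666 whole bytes, so larger capacities need one of the denser schemes.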

Comments

  1. If you are interested in a "very large" data-hiding capacity steganography, why not visit the following Web site.
    http://datahide.org/BPCSe/
    From its link you can download a fantastic steganography program for Windows without any charge.

