Monday, January 31, 2011

Breaking a (very weak) steganography software: Camouflage



Steganography strength (is it easy to see there is hidden data?): Low
Cryptography strength (is it easy to recover the hidden data?): Low


Or: don't believe what you see on TV. They don't know what they are talking about.



     1. Background


Steganography is the technique of hiding data inside other data: for example, a secret message inside a picture, or a secret picture inside a music file. There are several techniques for doing that, and many programs available. Some use complex algorithms and do their job well (it is difficult to establish that hidden data is present at all, and even more difficult to retrieve it); others use very simple algorithms and are easy to detect and break. You can find reliable, scientific information about steganography, digital watermarking (which is basically the same thing) and how to detect them on several web pages, such as the Neil Johnson site, the Fabien Petitcolas site, the Outguess page (where you can find a tool to detect steganography in images), and several others.
A few days ago, on September 11th, 2002 to be exact, the first anniversary of the attacks in the United States, a short report about the use of steganography by terrorists was aired on the French private TV network "Canal Plus", in the show "Le Journal des Bonnes Nouvelles". Not only did the tabloid-like report itself push my bullshit detector into the red (it's an old rumour, never proven, but the journalists turned it into fact: they said several times that terrorists actually used steganography), but the commentary was also full of technical errors. Sloppy and cheap journalism at its best, using the latest hype or rumours to scare the audience.
They did a "demonstration" of a "famous" and "unbreakable, even by the NSA" steganography software, which hides data in a "totally undetectable way", and is "illegal". Here are some screenshots of the show:





[Images © Canal Plus]

When I pointed out in the French cryptography newsgroup that steganography is often detectable, the "computer specialist" interviewed in the show went totally mad (his insults, in public and in private, are not worth translating) and offered me a challenge. He set up a website with two JPG photos and challenged people to find which one contained a hidden Word document, and what text this document contained. It took me a few minutes with a hexadecimal editor to detect which image had been modified, and to post a first message. Less than one hour later, I posted a second message to show that I had easily recovered the hidden data.
Let's see how I did.




     2. Presence of hidden data is evident


I first saw that one of the pictures had data appended at the end. Because almost all file formats have a fixed structure, and JPG is no exception, you can very easily see where the actual image ends and where the "hidden" data starts. So much for the "undetectable" steganography software. The amount of data was compatible with a short Word file, so I guessed I was on the right track.
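The check described above can be automated. The following is only a minimal sketch, not the method used at the time (that was a hexadecimal editor): it walks the JPEG marker segments up to the end-of-image marker (FF D9) and returns whatever follows. The function names `jpeg_end_offset` and `trailing_bytes` are hypothetical, and the parser assumes a single well-formed baseline or progressive JPEG stream. Simply searching for the last FF D9 would not be reliable, because the appended data may itself contain those two bytes.

```python
def jpeg_end_offset(data: bytes) -> int:
    """Return the offset just past the EOI marker (FF D9) of a JPEG stream."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker (FF D8)")
    pos = 2
    while pos + 2 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos:#x}")
        marker = data[pos + 1]
        if marker == 0xFF:                      # fill byte before a marker
            pos += 1
            continue
        if marker == 0xD9:                      # EOI: the image ends here
            return pos + 2
        if marker == 0x01 or 0xD0 <= marker <= 0xD7:
            pos += 2                            # standalone markers carry no length
            continue
        if pos + 4 > len(data):
            break
        seg_len = int.from_bytes(data[pos + 2:pos + 4], "big")
        pos += 2 + seg_len                      # skip the whole marker segment
        if marker == 0xDA:                      # SOS: entropy-coded scan data follows
            while pos + 1 < len(data):
                nxt = data[pos + 1]
                if data[pos] == 0xFF and nxt not in (0x00, 0xFF) and not 0xD0 <= nxt <= 0xD7:
                    break                       # a real marker ends the scan data
                pos += 1
    raise ValueError("no EOI marker (FF D9) found")


def trailing_bytes(data: bytes) -> bytes:
    """Return any bytes stored after the end of the JPEG image."""
    return data[jpeg_end_offset(data):]


# Tiny synthetic stream (not a viewable picture), followed by appended data.
jpeg = (b"\xff\xd8" + b"\xff\xe0\x00\x04\x00\x00" + b"\xff\xda\x00\x02"
        + b"\x12\xff\x00\x34\xff\xd0\x56" + b"\xff\xd9")
appended = b"\x20\x00" + b"hidden" + b"\xff\xd9"
print(len(trailing_bytes(jpeg + appended)), "bytes after the end of the image")
```

On a real file you would read the bytes with `open(path, "rb").read()` and treat any non-empty result as a strong hint that something was appended.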
Very few steganography programs hide data at the end of files, because it's an extremely weak and detectable scheme. I found out that the software they used was "Camouflage" (its homepage no longer seems to be available). Compare its interface below with the show screenshots above.




Because the data at the end of the file, although easy to detect, seemed to be encrypted or scrambled in some way, I downloaded the software to run a few tests, and I was ready to reverse engineer it and trace its routines. It turned out that the data is so weakly encrypted that I didn't even need to. A few tests with chosen passwords were enough to break the software.




     3. Breaking Camouflage


Let me put a few test images here, so you can follow the procedure on your own computer. All you need is a hexadecimal editor, and of course the "Camouflage" software if you want to run your own tests. For the curious, the photo is a lovely piece from my art hologram collection, "Lucy in a tin hat" by English artist Patrick Boyd. The hidden message is a simple ASCII text file called "secret_message.txt", containing the text "This is the secret message.". You can get it here.



The original JPG picture, without anything hidden in it, is 5,139 bytes.
The original picture with the secret message added, without password, size is 6,021 bytes.
The original picture, with the secret message added, password is "aaaa", size is 6,021 bytes.
The original picture, with the secret message added, password is "a" repeated 255 times, size is 6,021 bytes. 255 bytes is the maximum size for the password (you will understand why later); a longer one produces an error (this could probably be used for a classical buffer overflow exploit to force Camouflage to execute arbitrary code).



Please note that the following hexadecimal dump was not made from the exact images above, but the structure of the data is exactly the same.
The first thing you notice when comparing the original image with any of the others is the big block of very recognizable data at the end of the file, just after the FF D9 "end of JPG file" marker.
It starts with "20 00", followed by a variable header (probably data such as the sizes of the hidden and original files), then the encrypted hidden file data, then a long run of "20" bytes (32 in decimal, the ASCII code for space: these buffers are probably meant for storing ASCII strings) with two small islands of encrypted data, and finally a fixed signature.
If you don't want to go through it yourself, here is, for example, the data added at the end of the image with the hidden text file but no password:


1190     D2 46 97 A7 36 4B 11 FE E5 88 F5 5C DF F2 5F FF     End of the JPG file
11A0     D9 20 00 10 5E C2 01 B0 6C B1 38 10 5E C2 01 50     Start of Camouflage data
11B0     B6 E7 88 10 5E C2 01 00 34 6A 25 1B 00 00 00 56
11C0     FD 13 51 2C CF 67 C1 95 A7 DA 45 53 0A FD C1 FC
11D0     11 6A 3E 9E 85 06 35 CA 46 E3 FF FF FF FF 20 00     Some data
11E0     10 5E C2 01 20 12 83 53 10 5E C2 01 80 3D E9 88     + encrypted hidden file
11F0     10 5E C2 01 90 22 3F 72 71 F0 19 50 69 D2 4B 8C
1200     84 BC CC 04 47 0A B0 C7 E1 11 20 20 20 20 20 20
1210     20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
1220     20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
1230     20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
1240     20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
1250     20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20     Empty buffer
1260     20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
1270     20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
1280     20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
1290     20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
12A0     20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
12B0     20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
12C0     20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
12D0     20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
12E0     20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
12F0     20 20 20 20 20 20 20 67 F8 0A 56 75 88 7E 91 86     Some data
1300     20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
1310     20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
1320     20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
1330     20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
1340     20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
1350     20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
1360     20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20     Empty buffer
1370     20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
1380     20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
1390     20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
13A0     20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
13B0     20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
13C0     20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
13D0     20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
13E0     20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
13F0     20 20 20 20 20 20 1B 00 00 00 A1 11 00 00 02 00     Some data
1400     20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
1410     20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
1420     20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
1430     20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
1440     20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
1450     20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
1460     20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20     Empty buffer
1470     20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
1480     20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
1490     20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
14A0     20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
14B0     20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
14C0     20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
14D0     20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
14E0     20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
14F0     20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 74
1500     A4 54 10 22 97 20 20 20 20 20 20 20 20 20 20 20     Some data
1510     20 20 20



It would be easy to work out exactly what all of these fields mean (for example, the size of the hidden text, which is 27 in decimal or 1B in hexadecimal, appears twice, as the bytes 1B 00 00 00 at offsets 11BBh and 13F6h), but it's not needed. Let's go for the only interesting one: the password.
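As a side illustration of how such fields can be read, the ten bytes starting at offset 13F6h in the dump above (1B 00 00 00 A1 11 00 00 02 00) can be unpacked as little-endian integers. Only the first value is confirmed by the text (27, the size of the hidden message). Reading the rest as a 32-bit and a 16-bit integer is an assumption, as are the variable names. The second value, 11A1h, happens to equal the offset of the first Camouflage byte in the dump, which would fit the size of the original image, but that is a guess.

```python
import struct

# Ten bytes copied from offset 13F6h of the dump above.
tail = bytes.fromhex("1B 00 00 00 A1 11 00 00 02 00")

# "<IIH": little-endian, two unsigned 32-bit integers, then one unsigned 16-bit.
# This field layout is an assumption, not a documented format.
hidden_size, second_field, third_field = struct.unpack("<IIH", tail)

print(hidden_size)        # size of the hidden text file, in bytes
print(hex(second_field))  # possibly the size of the original image (unconfirmed)
print(third_field)        # meaning unknown
```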
The second thing, and a really surprising one, is that when the password changes, the first block of data, containing the "encrypted" and "hidden" message, does not change! Really weird: the encryption does not depend on the password! Only a few bytes are modified in the last big "20" island.
Let's now compare the contents of this last big "20" island (starting at offset 1400h) in files encrypted with different passwords.
First, the image with the text file hidden without a password (the same as above). It's all empty:


1400     20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
1410     20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
1420     20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
1430     20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
1440     20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
1450     20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
1460     20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
1470     20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
1480     20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
1490     20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
14A0     20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
14B0     20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
14C0     20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
14D0     20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
14E0     20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
14F0     20 20 20 20 20 20 20 20 20 20 20 20 20 20 20  

Second, the image with the text file hidden with the password "aaaa". The modified bytes are the first four, at offsets 1400h to 1403h. You can see that a 4-byte password results in a 4-byte modification. A clear sign that the encryption is weak:



1400     63 F4 1B 43 20 20 20 20 20 20 20 20 20 20 20 20
1410     20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
1420     20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
1430     20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
1440     20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
1450     20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
1460     20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
1470     20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
1480     20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
1490     20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
14A0     20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
14B0     20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
14C0     20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
14D0     20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
14E0     20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
14F0     20 20 20 20 20 20 20 20 20 20 20 20 20 20 20  
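The comparison between the two buffers above can be sketched in a few lines. This is only an illustration of the chosen-password test: the helper name `changed_offsets` is hypothetical, and the two buffers are rebuilt here from the dumps rather than read from the image files.

```python
def changed_offsets(a: bytes, b: bytes) -> list[int]:
    """Return the positions at which two equal-length buffers differ."""
    if len(a) != len(b):
        raise ValueError("buffers must have the same length")
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


BUF_LEN = 255  # size of the buffer starting at offset 1400h

# Buffer contents as shown in the two dumps above.
no_password = b"\x20" * BUF_LEN
with_aaaa = bytes.fromhex("63F41B43") + b"\x20" * (BUF_LEN - 4)

print(changed_offsets(no_password, with_aaaa))  # prints [0, 1, 2, 3]
```

Exactly as many bytes change as there are characters in the password, and nothing else in the buffer moves.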

Finally, the image with the text file hidden with a password made of the character "a" repeated 255 times. All the bytes in the buffer are now modified. Notice that the first 4 are the same as above, another clear sign of weakness:



1400     63 F4 1B 43 6D C7 75 80 80 AE DE 04 41 0E FF D2
1410     F8 04 2B 32 9A 97 14 35 CC 42 AC 1F FD 48 86 9D
1420     83 98 2C B3 23 2F 67 A1 99 FB 7D 03 59 15 45 61
1430     34 BE 20 AA 60 C3 D6 92 EE EB BC CD 52 E2 01 48
1440     92 19 45 5F 1B 8A B2 85 FC FC 22 F5 2B A6 24 0C
1450     44 15 8A 6A F9 A8 1D 9D A9 DB 53 0A 61 B2 A4 A3
1460     F5 55 CE D1 84 F4 1C 4B E5 C5 3E 84 0F 46 4B BA
1470     F7 1F 5F 29 58 27 AE 0E 10 CB 5D 50 FB C8 FF EE
1480     E8 12 D2 58 AB 53 B4 91 50 38 1D 63 4F E7 56 98
1490     4A 1F 30 93 20 E0 6D B5 04 74 96 11 B5 78 F9 41
14A0     DE 41 D9 34 06 AD E0 79 ED 72 5D 02 5D F3 70 85
14B0     3A 7A 69 43 01 2D 2B A4 EB D2 A4 14 A2 F1 1B 93
14C0     D3 D7 A9 B1 59 EB A3 E7 91 CD 88 AB 3D 2F 5F 68
14D0     48 19 48 F8 3B E5 B4 DB 3F B4 F3 1B 59 9B B1 01
14E0     8D 94 46 DB 8F D6 BF FE FA BF 04 B5 17 58 17 FD
14F0     BB 09 EC C9 C1 C7 7F B8 BA 6E 2C CA F3 AC 10   

The conclusion is that the password is stored at this position, probably masked by XORing it with a key made of a fixed string of bytes. This string is now easy to obtain. Because XOR is reversible, you just have to XOR the above data with the password, which is "aaaa...", or "61616161..." in hexadecimal:



63F41B436DC7758080AEDE04410EFFD2 F8042B329A971435CC42AC1FFD48869D
83982CB3232F67A199FB7D0359154561 34BE20AA60C3D692EEEBBCCD52E20148
9219455F1B8AB285FCFC22F52BA6240C 44158A6AF9A81D9DA9DB530A61B2A4A3
F555CED184F41C4BE5C53E840F464BBA F71F5F295827AE0E10CB5D50FBC8FFEE
E812D258AB53B49150381D634FE75698 4A1F309320E06DB504749611B578F941
DE41D93406ADE079ED725D025DF37085 3A7A6943012D2BA4EBD2A414A2F11B93
D3D7A9B159EBA3E791CD88AB3D2F5F68 481948F83BE5B4DB3FB4F31B599BB101
8D9446DB8FD6BFFEFABF04B5175817FD BB09ECC9C1C77FB8BA6E2CCAF3AC10 


XOR

61616161616161616161616161616161 61616161616161616161616161616161
61616161616161616161616161616161 61616161616161616161616161616161
61616161616161616161616161616161 61616161616161616161616161616161
61616161616161616161616161616161 61616161616161616161616161616161
61616161616161616161616161616161 61616161616161616161616161616161
61616161616161616161616161616161 61616161616161616161616161616161
61616161616161616161616161616161 61616161616161616161616161616161
61616161616161616161616161616161 616161616161616161616161616161 


=

02957A220CA614E1E1CFBF65206F9EB3 99654A53FBF67554AD23CD7E9C29E7FC
E2F94DD2424E06C0F89A1C6238742400 55DF41CB01A2B7F38F8ADDAC33836029
F378243E7AEBD3E49D9D43944AC7456D 2574EB0B98C97CFCC8BA326B00D3C5C2
9434AFB0E5957D2A84A45FE56E272ADB 967E3E483946CF6F71AA3C319AA99E8F
8973B339CA32D5F031597C022E8637F9 2B7E51F241810CD46515F770D4199820
BF20B85567CC81188C133C633C9211E4 5B1B0822604C4AC58AB3C575C3907AF2
B2B6C8D0388AC286F0ACE9CA5C4E3E09 297829995A84D5BA5ED5927A38FAD060
ECF527BAEEB7DE9F9BDE65D47639769C DA688DA8A0A61ED9DB0F4DAB92CD71 
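The same computation can be checked in a few lines. This is a minimal sketch limited to the first 16 bytes of the buffer, copied from the dump above; the helper name `xor_bytes` is hypothetical.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings position by position."""
    return bytes(x ^ y for x, y in zip(a, b))


# First 16 bytes of the password buffer for the password "a" repeated 255 times.
masked = bytes.fromhex("63F41B436DC7758080AEDE04410EFFD2")
known_password = b"a" * len(masked)

# masked = password XOR key, therefore key = masked XOR password.
key_prefix = xor_bytes(masked, known_password)
print(key_prefix.hex().upper())  # prints 02957A220CA614E1E1CFBF65206F9EB3
```

The output matches the first 16 bytes of the result above.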



     4. Back to the challenge


So now we know where the password is (at a fixed location relative to the end of the file, offset -275 in decimal), and how to decipher it. Let's go back to the challenge (the page does not exist anymore, but it's not really important: you can apply this analysis to any Camouflaged file).
We can find in the second image that the password buffer contains:



71 FA 0F 51 61 C3 66 85 84
We just need to XOR it with the 9 first bytes of the key, which are:



02 95 7A 22 0C A6 14 E1 E1
The result is:



73 6F 75 73 6D 65 72 64 65
s  o  u  s  m  e  r  d  e



Which, translated back from ASCII, is "sousmerde" (an insult). We can try it with Camouflage, and it works nicely: we can now extract the "hidden" Word file (which contains more insults). You could test it by downloading the images from the challenge page (once again, the page does not exist anymore, but it's not really important, as you can apply this analysis to any Camouflaged file).
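Putting the pieces together, password recovery can be sketched as follows. This is not the source of the utility mentioned below, only an illustration under stated assumptions: the function name `recover_password` is hypothetical, only the first 16 key bytes are included (so longer passwords are rejected), and the unused part of the buffer is assumed to be the trailing run of 20h bytes, which would truncate a password whose last masked byte happens to be 20h.

```python
KEY_PREFIX = bytes.fromhex("02957A220CA614E1E1CFBF65206F9EB3")  # first 16 key bytes
PASSWORD_OFFSET_FROM_END = 275   # the buffer starts 275 bytes before the end of the file
PASSWORD_BUFFER_LEN = 255        # maximum password length


def recover_password(file_bytes: bytes, key: bytes = KEY_PREFIX) -> bytes:
    """Unmask the password stored near the end of a Camouflaged file."""
    start = len(file_bytes) - PASSWORD_OFFSET_FROM_END
    if start < 0:
        raise ValueError("file too short to contain a password buffer")
    buffer = file_bytes[start:start + PASSWORD_BUFFER_LEN]
    masked = buffer.rstrip(b"\x20")          # assumption: trailing spaces are unused
    if len(masked) > len(key):
        raise ValueError("password is longer than the known key bytes")
    return bytes(m ^ k for m, k in zip(masked, key))


# Synthetic stand-in for the challenge file: only the password buffer is realistic.
challenge_buffer = bytes.fromhex("71FA0F5161C3668584").ljust(PASSWORD_BUFFER_LEN, b"\x20")
fake_file = b"\x00" * 100 + challenge_buffer + b"\x00" * 20
print(recover_password(fake_file))  # prints b'sousmerde'
```

An empty result means the buffer was all spaces, that is, the file was Camouflaged without a password.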
[As a funny side note: a few hours after I posted my results on the cryptography newsgroup (first and second message), the images suddenly changed, and the "computer specialist", looking ridiculous because his "unbreakable" challenge had been broken so easily, claimed that the original images never existed and that I had invented it all. Fortunately, several people had downloaded the files and independently verified my results before he changed them].
Well, my hobby being reverse engineering and not psychiatry, I let the dog bark. I thought this small analysis could nevertheless be interesting for some people, so here it is. I've quickly written a small utility to automate the recovery of Camouflage passwords [Now version 0.2]. Source included, as always. It works with Camouflage 1.1.1 and 1.2.1.




     5. Conclusions


Don't trust what is said on TV. Journalists don't know what they are talking about, and instead of doing a little research by asking competent people (there are plenty in academia and in the corporate world), they fall for the hype and listen to people who are incompetent or just want to have their faces on a TV screen.
Most of the steganography software around is easy to detect and to break.
If the algorithm used in an encryption or steganography program is not documented precisely, it is probably very weak. Never use such software for serious security purposes.
Don't trust what you see on the internet, and that includes this page. Be especially wary of people with a big mouth who use big words ("unbreakable", "undetectable", etc.). Test everything yourself, or ask different people who may know more. There are plenty of forums on Usenet with specialists on almost any subject you can imagine.
[Note written much later: I've since discovered some other tools to unprotect Camouflage files:
- CKFP (Camouflage / Kamaleon File Patcher) by Vikt0ry.
- CamouflageCrack by Kasky.
- CamoDetect Perl script by Andrew Christensen, found on PacketStorm]

Have a nice day!

4 comments:

  1. Nice copy/paste of my article, without mention of the source :
    http://www.guillermito2.net/stegano/camouflage/index.html

    Replies
    1. Thats a sorry state. I read it all on guillermito2.net long back and mentioned it in my book as well in the year 2008

  2. Also, after you mention the source, could you please host the images yourself ? They are still on my server. Thanks.

  3. I agree to Guillermito. All the hardwork was done by him and it was copied as it is. Even i have mentioned this all in my book but at least cared to mention the original author.
    Its absolutely shameful

    Boonlia Prince Komal
