TamuCTF 2017 - Steganography - Musical Bits

The TamuCTF is a Jeopardy-style CTF. This walkthrough explains the “Musical Bits” challenge, which is a steganography challenge. I worked with Iptior on this one and it took several hours of pain before we succeeded! Let’s go!

Discovery

The challenge gave us a short text saying that a hacker had a strange song. There is nothing to learn from this, so we download the given file and start by running the file command.

$ file magical_bits
magical_bits: RIFF (little-endian) data, WAVE audio, Microsoft PCM, 16 bit, mono 44100 Hz
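The same information can be read programmatically. Here is a minimal sketch using Python's standard `wave` module; the helper names `describe_wav` and `make_silent_wav` are made up for this example, and the silent in-memory WAV only stands in for the challenge file.

```python
import io
import wave


def describe_wav(source):
    """Return (channels, bits per sample, sample rate, duration in seconds)."""
    with wave.open(source, "rb") as wav:
        channels = wav.getnchannels()
        bits = wav.getsampwidth() * 8
        rate = wav.getframerate()
        duration = wav.getnframes() / rate
    return channels, bits, rate, duration


def make_silent_wav(seconds=1, rate=44100):
    """Build a small mono 16-bit WAV in memory, only to demonstrate describe_wav."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16 bit
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * rate * seconds)
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    # With the real file, a path such as "magical_bits" could be passed instead.
    print(describe_wav(make_silent_wav()))
```

`describe_wav` accepts either a path or a file-like object, so it should work on the downloaded file as well.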

So we have an audio file. We open it in an audio player and hear many short sounds over 20 minutes. These sounds are very fast “beeps” of 2 types. Given the title of the challenge, we can guess that these sounds are a binary representation.

We then want to see what the sound is made of, so we open the file in Audacity. After zooming in a lot and skipping the “introduction” part, we see some patterns. Here is a sample of them.

musical_bits_1

We see packets of 8 signals, which suggests bits grouped into bytes. Looking closer, there are 3 types of signal: two are almost identical except that one is bigger than the other, and the third type appears at the end of each packet.

So the meaning of each signal may be the following:

  • Big signal is a "1" bit
  • Small signal is a "0" bit
  • Long signal is here to separate each byte

Following this logic, we can try to convert the first bytes manually. If we take the first 4 bytes, we obtain the following binary string:

11111111 11011000 11111111 11100000

Converted to hexadecimal, this gives: FF D8 FF E0

Hmm hmm.. After some research, we found that this is the signature of a JPEG file! We are now sure that we have the right method, and that we have to extract all these bits to form a JPEG picture.
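The manual conversion can be double-checked with a few lines of Python. This is only a sketch: `bits_to_bytes` and `looks_like_jpeg` are hypothetical helper names, and the check relies on the fact that JPEG files begin with the bytes FF D8 followed by another FF marker byte.

```python
def bits_to_bytes(bit_groups):
    """Convert space-separated groups of 8 bits into a bytes object."""
    return bytes(int(group, 2) for group in bit_groups.split())


def looks_like_jpeg(data):
    """Check for the JPEG start-of-image marker FF D8 followed by an FF marker byte."""
    return data[:3] == b"\xff\xd8\xff"


first_bytes = bits_to_bytes("11111111 11011000 11111111 11100000")
print(" ".join(f"{b:02X}" for b in first_bytes))  # FF D8 FF E0
print(looks_like_jpeg(first_bytes))               # True
```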

Extraction - First attempt

First, I looked for a tool to extract this, or a way to do it in Python or anything else. But I ran out of ideas, and finally I decided to try exporting the whole song as raw data.

After removing the introduction part, we have a nice raw data file to work with. Using hexedit, we can display the content of the file.

musical_bits_2

If we scroll down, we notice that the data often looks like the shape in the picture above. Furthermore, we can distinguish 2 different shapes.

musical_bits_3

That could be our “0” and “1”. Once again, we scroll through the shapes manually and check whether they match. And.. Yes! We have our data. So now, we have a way to extract our bits.

We just have to find a pattern in each shape that tells us whether the data we read is a “0” or a “1”.

We tried many candidate patterns and extracted many bits, but there was always a problem, such as several wrong bits. Since this was not the right solution, we won’t show our scripts for this part.

Extraction - Second attempt

And that’s the moment when Iptior joins the challenge.. :)

I ran out of ideas and sent my Python script and the file to the team, and he came back with a solution. The idea of looking for a pattern was good, but the way we went about it wasn’t.

So we did something we had skipped at the beginning and that could have saved us a lot of time: we ran the strings command on the file. Here is a sample of the output:

$ strings magical_bits
RIFF
WAVEfmt
data

...

':*p,..v/L0
,}++*
'a&L%Y$
!u!>!
%2%v$
#=$r$b$
"2!? H
< B \ \ b - ':*p,..v/L0 ,}++* 'a&L%Y$ !u!>!
%2%v$
#=$r$b$
"2!? H
< B \ \ b - ':*p,..v/L0 ,}++* 'a&L%Y$ !u!>!
%2%v$
#=$r$b$
"2!? H
< B \ \ b - ':*p,..v/L0 ,}++* 'a&L%Y$ !u!>!
%2%v$
#=$r$b$
"2!? H
< B \ \ b - ':*p,..v/L0 ,}++* 'a&L%Y$ !u!>!
%2%v$
#=$r$b$
"2!? H
< B \ \ b - ':*p,..v/L0 ,}++* 'a&L%Y$ !u!>!
%2%v$
#=$r$b$
"2!? H
< B \ \ b - ':*p,..v/L0 ,}++* 'a&L%Y$ !u!>!
%2%v$
#=$r$b$
"2!? H
< B \ \ b - ':*p,..v/L0 ,}++* 'a&L%Y$ !u!>!
%2%v$
#=$r$b$
"2!? H
< B \ \ b - 1 A A c ## ` . ^ . W ( O s D b q c , 5 H C ... ':*p,..v/L0 ,}++* 'a&L%Y$ !u!>!
%2%v$
#=$r$b$
"2!? H
< B
\ \
b -
* V b J
x t
s 0
!F s
6
Y !
* V b J
x t
s 0
!F s
6
Y !
* V b J
x t
s 0
!F s
6
Y !
* V b J
x t
s 0
!F s
6
Y !
1 A
A c

...

Hmm hmm.. Here again, some characters come up very often. Maybe we can pull a pattern out of this output?

After some research, we found a pattern!

  • "1" can be extracted when we find the "<" and "B" characters on the same line
  • "0" can be extracted when we find the "!", "F" and "s" characters on the same line

From this, we can write a small Python script to extract all these bits and get a binary string.

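#!/usr/bin/env python

# essai.txt is read line by line (presumably the saved output of strings)
with open("essai.txt", "r") as fichier:
    musi = fichier.read()

musi = musi.split('\n')
outBin = ""
for textM in musi:
    # a line containing "!", "F" and "s" gives a "0" bit
    if "!" in textM and "s" in textM and "F" in textM:
        outBin += "0"
    # a line containing "<" and "B" gives a "1" bit
    elif "B" in textM and "<" in textM:
        outBin += "1"

# print everything except the first extracted bit
print(outBin[1:])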
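To see the rule at work on a small input, here is a sketch of the same matching logic wrapped in a function. The name `lines_to_bits` is hypothetical, and the sample lines are copied from the strings output shown earlier.

```python
def lines_to_bits(lines):
    """Apply the pattern rule to each line and return the resulting bit string."""
    bits = []
    for line in lines:
        if all(ch in line for ch in "!Fs"):
            bits.append("0")  # "!", "F" and "s" on the same line
        elif "<" in line and "B" in line:
            bits.append("1")  # "<" and "B" on the same line
    return "".join(bits)


sample = ["< B", "\\ \\", "b -", "* V b J", "x t", "s 0", "!F s", "6", "Y !"]
print(lines_to_bits(sample))  # 10
```

Lines that match neither rule, such as "s 0" or "Y !", are simply ignored.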

Conversion

We now have a binary string, and we know from the signature that it should form a picture. So let’s convert this string into an image!

To do this, we will use another script that Iptior wrote.

#!/usr/bin/env python

with open("binary.txt", "r") as fichier:
    binaire = fichier.read()
outFile = ""

i = 1  # start at index 1: the character at index 0 is skipped
outHex = ""

while i <= (len(binaire) - 8):
    # parse the next 8 characters as one byte
    number = int(binaire[i:i+8], 2)
    outFile += chr(number)
    i = i + 8
    # zero-pad to two hex digits so bytes below 0x10 keep their leading zero
    outHex += format(number, "02x")

print(outHex)
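The script above prints a hex string. One possible way to go straight from the bit string to a file that an image viewer can open is to write the bytes in binary mode. This is a sketch, not necessarily what we did during the CTF: `bitstring_to_bytes`, `write_image` and the output file name are hypothetical.

```python
import os
import tempfile


def bitstring_to_bytes(bits):
    """Convert a string of '0'/'1' characters to bytes, ignoring a trailing partial byte."""
    usable = len(bits) - len(bits) % 8
    return bytes(int(bits[i:i + 8], 2) for i in range(0, usable, 8))


def write_image(bits, path):
    """Write the decoded bytes in binary mode and return how many bytes were written."""
    data = bitstring_to_bytes(bits)
    with open(path, "wb") as out:
        out.write(data)
    return len(data)


if __name__ == "__main__":
    # Demo input: the JPEG header bytes FF D8 FF E0 plus one stray bit.
    demo_bits = "11111111110110001111111111100000" + "1"
    demo_path = os.path.join(tempfile.gettempdir(), "flag_demo.jpg")
    print(write_image(demo_bits, demo_path), "bytes written to", demo_path)
```

Binary mode matters here: writing the result of `chr()` in text mode would re-encode every byte above 0x7F and corrupt the image.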

Getting the flag

We then try to open the generated image.

musical_bits_4

Here comes the flag !! :D
