Thursday, August 27, 2015

Flare-On 2015 #2 Write-Up, Part 1

I've been away from CTFs for a while, but at the urging of some colleagues, I took a brief look at this year's Flare-On challenge. I won't post a complete write-up today because I wouldn't want to give away the flag itself before the Flare-On challenge ends on September 8th. But maybe this article will give you ideas if you are still working on any of the challenges.

Because I knew I wouldn't be able to stay with it, I focused on using my time to learn how to do new things. The first challenge was easy and educational because FLARE produced a relevant write-up in 2014 that includes an IDA script to decode XOR-encoded data, a technique which has also been documented in Practical Malware Analysis. The second challenge, however, featured some unusual instructions in its decode-and-check loop:



The sahf, adc, and popf mnemonics were unfamiliar to me, but this did not look like a case of disassembly desynchronization. I started by reading the Intel manuals...


...but my short attention span got the better of me. So, I began using WinDbg to see why the sahf and adc instructions were used, what effect they would have on the program's control flow, and whether I was wrong about the disassembly desync. After a single iteration of this, I decided that my laziness and short attention span would require me to create a repeatable substrate for an iterative process of observation and experimentation: I needed to automate the instrumentation.
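For reference, here is a minimal Python model of what the Intel manuals document for two of those instructions. It is only a sketch: the names sahf and adc8 are made up for illustration, only the low byte of EFLAGS is modeled, and adc8 reports the carry-out but none of the other flags that a real adc updates.

```python
# Status-flag bit positions in AH, which match the low byte of EFLAGS.
CF, PF, AF, ZF, SF = 0x01, 0x04, 0x10, 0x40, 0x80
SAHF_MASK = CF | PF | AF | ZF | SF  # 0xD5: the five bits sahf copies


def sahf(ah, flags):
    """Return the low flags byte after sahf: SF, ZF, AF, PF, CF come from AH."""
    return (flags & ~SAHF_MASK & 0xFF) | (ah & SAHF_MASK)


def adc8(dest, src, flags):
    """8-bit add-with-carry: returns (result, carry_out)."""
    total = dest + src + (flags & CF)
    return total & 0xFF, int(total > 0xFF)


# Load a set carry bit from AH, then let adc fold it into the sum.
flags = sahf(0x01, 0x00)
result, carry = adc8(0xFF, 0x00, flags)
```

The point of the pairing is that whatever sits in AH when sahf runs decides the carry that the following adc adds in.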

There are several options for this. The painstaking way would be to pick up the code I wrote based on Microsoft's debugger documentation and a CodeProject article from 2011. Other alternatives included experimenting with DynamoRIO or Intel's PIN (which I will get into another time). And then there's Visi's vtrace. This latter tool is Python-based, so it was pretty appealing. I decided to give it a shot.

First, I downloaded the vtrace/vdb distribution (I might have grabbed it from here). Then I took a look at Visi's article to get started. I also found an interesting presentation and a PlaidCTF write-up that provided me with some additional ideas.

Reversing in IDA gave me some offsets to start with:

# Offsets for d88dafdaefe27e7083ef16d241187d31 *very_success.exe:
# cs:0x4010ae <-- sahf instruction: single iteration of password character scan
# cs:0x401046 <-- call kernel32!ReadFile, len at ebp-4
# cs:0x40104c <-- return from kernel32!ReadFile, len at ebp-4
# ds:0x402159 <-- gPasswordFromCONIN$[]

With this, I wrote a vtrace script that used the MyBreakpoint class in the presentation above to manipulate program execution and mark how many times the sahf instruction was hit. Specifically, I wrote a function to skip the ReadFile() Win32 API call and instead feed a password of my choosing to the program:

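import struct

# pw is a module-level list of password characters, set before the run.
def skipread(trace):
    print "Broken at eip = " + hex(trace.getProgramCounter()) + " (skipread)"
    # Skip the ReadFile call by resuming at its return address.
    trace.setProgramCounter(0x400000 + 0x104c)
    tmp = ''.join(pw)
    print "Trying " + tmp
    # Plant the chosen password in the buffer ReadFile would have filled.
    trace.writeMemory(0x400000 + 0x2159, tmp)
    # Note: the expressions read rbp and mask to 32 bits; the labels say ebp.
    print "ebp: " + hex(trace.parseExpression("rbp"))
    print "ebp-4: " + hex(trace.parseExpression("rbp-4"))
    print "[ebp-4]: " + hex(trace.parseExpression("poi(rbp-4) & 0xffffffff"))
    # Fake the byte count that ReadFile would have stored at ebp-4.
    trace.writeMemory(trace.parseExpression("rbp-4 & 0xffffffff"), struct.pack("@i", 37))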
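The last line of skipread fakes the byte count that ReadFile would have stored. As a small illustration, this is what a packed 32-bit value of 37 looks like. The sketch uses an explicit little-endian format ("<i") instead of the native "@i" from the script, which is an assumption that matches x86.

```python
import struct

PASSWORD_LENGTH = 37  # the length value the script stores at ebp-4

# "<i" packs a signed 32-bit integer in little-endian byte order.
length_bytes = struct.pack("<i", PASSWORD_LENGTH)

# Unpacking the same four bytes recovers the original value.
(round_trip,) = struct.unpack("<i", length_bytes)
```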

I executed this with a few different passwords. Of particular interest were the results with all "a" characters and all "b" characters:



I noticed something useful here: the breakpoint I had set on the sahf instruction to mark the number of loop iterations was hit once more when processing the "a" password than when processing the "b" password. Looking back on my notes, I see that I was enthusiastic about this:

Holy **** yes, saw password processing breakpoint hit twice for 'a'x37 and only once for 'b'*37

The reason for my excitement was that this probably meant the password could be brute forced. So, that is exactly what I did. Sure, maybe I skipped the whole part where you analyze and understand the code. But there's more to life than intimately understanding every Intel instruction I've ever encountered. In my next article, I'll post the scripts and talk about the results.
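The idea can be sketched in Python without a debugger. This is not the actual vtrace script: SECRET, run_target, and brute_force are hypothetical names, SECRET is a made-up string rather than the challenge password, and run_target merely simulates a check loop that stops at the first wrong character, standing in for counting breakpoint hits.

```python
import string

SECRET = "s3cr3t_demo"  # made-up stand-in, not the challenge password
ALPHABET = string.ascii_letters + string.digits + string.punctuation


def run_target(guess):
    """Simulated run: return (loop_hits, accepted) for one guess."""
    hits = 0
    for g, s in zip(guess, SECRET):
        hits += 1  # one hit per pass through the check loop
        if g != s:
            return hits, False
    return hits, len(guess) == len(SECRET)


def brute_force(length, alphabet, pad="\x00"):
    """Recover the secret one character at a time from the hit count."""
    known = ""
    for pos in range(length):
        for ch in alphabet:
            guess = known + ch + pad * (length - pos - 1)
            hits, accepted = run_target(guess)
            # A correct character lets the loop run at least once more.
            if hits > pos + 1 or accepted:
                known += ch
                break
        else:
            raise ValueError("no candidate found at position %d" % pos)
    return known


recovered = brute_force(len(SECRET), ALPHABET)
```

The hit count cannot distinguish the final character, because the loop runs the same number of times whether or not it matches, so the sketch falls back on the accepted result for that one position. The cost is at most len(ALPHABET) runs per character instead of one run per possible password.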