While browsing that classiest of "news" sites, Slashdot, this evening, I came across an interesting job ad from DSD. In the bottom third of the ad was a seemingly Base64-encoded string:
At first glance, after running the decoded string through a hexdump, I assumed it was some sort of encrypted message, with the 00 bytes acting as delimiters: judging by eye, the middle segment was roughly 64 bytes, and the final 4 bytes were perhaps some sort of checksum. The byte distribution didn't quite seem right for an encrypted message, though. In fact, some of the pairs of bytes seemed a bit familiar. Then I twigged. x86 assembly!
00000000 E8 00 00 00 00 call 0x00000005
00000005 5B pop ebx ; mov ebx, ip
00000006 8B CB mov ecx, ebx
00000008 83 C3 1E add ebx, 0x1E
0000000B 33 C0 xor eax, eax ; clear eax, edx
0000000D 33 D2 xor edx, edx
0000000F 8A 03 mov al, [ebx]
00000011 8A 11 mov dl, [ecx]
00000013 32 C2 xor al, dl
00000015 88 03 mov [ebx], al ; [ecx + 0x1E] ^= [ecx]
00000017 3C 00 cmp al, 0x00 ; [ecx] == [ecx + 0x1E]?
00000019 74 2B jz 0x00000046 ; yes? then breakpoint
0000001B 83 C1 01 add ecx, 0x01 ; no? ++ecx, try again
0000001E 83 C3 01 add ebx, 0x01
00000021 EB EC jmp 0x0000000F
00000023 33 FF xor edi, edi ; This is all crap from here on - just an encoded URL
00000025 BF F3 F9 31 1C mov edi, 0x1C31F9F3
0000002A B7 44 mov bh, 0x44
0000002C A5 movs es:[edi], ds:[esi]
0000002D A4 movs es:[edi], ds:[esi]
0000002E 67 F9 stc
00000030 75 1C jnz 0x0000004E
00000032 A5 movs es:[edi], ds:[esi]
00000033 E7 75 out 0x75, al
00000035 12 61 01 adc ah, [ecx+0x1]
00000038 04 E7 add al, 0xE7
0000003A A4 movs es:[edi], ds:[esi]
0000003B 62 EC bound ebp, esp
0000003D A7 cmps ds:[esi], es:[edi]
0000003E 64 8F C2 pop edx
00000041 00 00 add [eax], al
00000043 19 1C 3A sbb [edx+edi], ebx
00000046 CC int3
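The two branch targets in the listing can be sanity-checked by hand: a short branch adds its signed 8-bit displacement to the address of the following instruction. A minimal sketch of that arithmetic, with `rel8_target` being a helper name made up for illustration:

```python
def rel8_target(addr, insn_len, rel8):
    """Target of a short branch at `addr` whose last byte is the displacement."""
    # Sign-extend the 8-bit displacement (0x80..0xFF are negative).
    disp = rel8 - 0x100 if rel8 >= 0x80 else rel8
    # The displacement is relative to the *next* instruction.
    return addr + insn_len + disp

print(hex(rel8_target(0x19, 2, 0x2B)))  # 74 2B: the jz at offset 0x19
print(hex(rel8_target(0x21, 2, 0xEC)))  # EB EC: the jmp at offset 0x21
```

These come out as 0x46 (the int3) and 0xf (the top of the loop), which agrees with what the disassembler printed.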
The code starts off with "call +0" ('E8 00 00 00 00') to make itself position-independent: calling the very next instruction pushes the return address, i.e. the address of the 'pop ebx' that follows, onto the stack, and the 'pop ebx' then copies that address into ebx. This means that all locations in ebx/ecx will be relative to that first 'pop ebx' instruction, wherever the code happens to be loaded.
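To see why this makes the load address irrelevant, here is a rough Python model of the first four instructions. The function name and the load addresses are invented for the example, and 32-bit wraparound is ignored:

```python
def getpc_offsets(load_base):
    """Model call +0 / pop ebx / mov ecx, ebx / add ebx, 0x1E.

    Returns (ecx, ebx) as offsets from the start of the blob.
    """
    stack = []
    return_addr = load_base + 5   # 'E8 00 00 00 00' is 5 bytes long
    stack.append(return_addr)     # call pushes the address of the next instruction
    ebx = stack.pop()             # pop ebx: absolute address of offset 0x05
    ecx = ebx                     # mov ecx, ebx
    ebx += 0x1E                   # add ebx, 0x1E
    return ecx - load_base, ebx - load_base

# Hypothetical load addresses, chosen arbitrarily.
for base in (0x0, 0x00400000, 0x7FFE1000):
    print(hex(base), [hex(v) for v in getpc_offsets(base)])
```

Whatever the base, ecx ends up pointing at offset 0x05 and ebx at offset 0x23.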
The next part of the code is pretty straightforward: starting at offset (5 + 0x1E), i.e. 0x23, it decodes the data in place, using the bytes of the code itself from offset 5 onwards as a simple xor key, and continues until the bytes at n and n+0x1E match (so that the xor comes out as zero). Because the key pointer reaches offset 0x23 after 0x1E bytes, the tail of the key is the already-decoded start of the data, so you can't just offset the raw bytes by 0x1E and look for matches; you must compare against the already-xor'ed data in memory. The loop ends up terminating at offset 0x45, with value 0x3A, and then jumps to the int 0x03 at offset 0x46 to generate a breakpoint and, presumably, let you see the decoded memory in whatever debugger they expect you to be using.
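For anyone without a debugger handy, the loop is easy to replay in Python. This is only a sketch: `BLOB` is my own transcription of the hex column in the listing above, and `decode` is a made-up name that mirrors the loop instruction by instruction rather than anything DSD supplied:

```python
# Bytes transcribed from the hex column of the listing (offsets 0x00-0x46).
BLOB = bytes.fromhex(
    "E8 00 00 00 00 5B 8B CB 83 C3 1E 33 C0 33 D2 8A "  # 0x00
    "03 8A 11 32 C2 88 03 3C 00 74 2B 83 C1 01 83 C3 "  # 0x10
    "01 EB EC 33 FF BF F3 F9 31 1C B7 44 A5 A4 67 F9 "  # 0x20
    "75 1C A5 E7 75 12 61 01 04 E7 A4 62 EC A7 64 8F "  # 0x30
    "C2 00 00 19 1C 3A CC"                              # 0x40
)

def decode(blob, key_start=0x05, data_start=0x23):
    """Replay the in-place xor loop; return (stop offset, decoded bytes)."""
    mem = bytearray(blob)             # the loop modifies memory as it goes
    ecx, ebx = key_start, data_start
    while ebx < len(mem):
        mem[ebx] ^= mem[ecx]          # xor al, dl / mov [ebx], al
        if mem[ebx] == 0:             # cmp al, 0 / jz -> int3
            return ebx, bytes(mem[data_start:ebx])
        ecx += 1                      # key pointer walks into decoded data
        ebx += 1
    raise ValueError("ran off the end without hitting a zero byte")

stop, plaintext = decode(BLOB)
print(hex(stop), plaintext.decode("ascii"))
```

Working on a mutable copy matters: from offset 0x41 onwards the key bytes are ones the loop has already decoded, so xor-ing two slices of the original bytes would give the wrong tail. Assuming the transcription is right, the loop stops at 0x45, as described above, and what comes out is a plain ASCII URL.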
It's a cute strategy, but ultimately ineffective - who wants to work in Canberra for below-average pay? Anyone who knows their stuff in IT security is going to want a hell of a lot more than 91k a year. Even if they do get to write a bit of assembly for job ads.