An In-Depth Tutorial on Auto Assembler: Extended
INDEX:
i) Intro
1) Taking a closer look at code-caves
2) LoadBinary()
3) A deeper look at flags
4) A deeper look at registers
5) Abusing the Call instruction to modify EIP
6) Define Byte/Word/DWORD
7) Using negative values in assembly
8) The Stack
9) Using Cheat Engine to debug assembly
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Section i: Intro
Well, my purpose for writing this is that there are many tutorials on basic assembler/hacking/whatever, but there’s nothing that explains the more advanced stuff. A lot of what is covered in this tutorial is stuff that I’ve had to research, and it wasn’t easy to figure out. A lot of people don’t quite know how to research this material (or don’t know that it exists, or never really thought about it), so I’m compiling it.
Note: This is a more advanced/extended version of my previous tutorial. If you haven’t read that one, then I highly suggest you read it before continuing. Yeah, some of this will be review, but get over it.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Section 1: Taking a closer look at code-caves
First, let’s start off with a definition–what is a codecave?
A codecave is a segment of code placed in a region of unused memory, executed from a statement within the used memory. That’s a bit of a mouthful, so let’s use an example.
Say we have this game to hack (isn’t that so rare? =P), and an instruction that, let’s say, modifies our health. Let’s say that instruction is
Code: |
sub [esp+50],ecx |
What if we want to modify that code to do a couple of other weird things? Maybe we’ve got a variable stored somewhere in the program that we want to set the health to? We use a codecave.
Let’s start off by deciding what we want our codecave to do. Let’s say that we want it to take the value stored at address 0xDEADBEEF and put that into the health.
Code: |
push eax //save the value of eax, so we don’t screw anything up
mov eax,[DEADBEEF] //move the value stored at DEADBEEF into eax
mov [esp+54],eax //move eax into our health value (+54 rather than +50, because the push moved esp down by 4)
pop eax //restore eax, so nothing gets screwed over |
Now we set that up. Let’s say that the address of the subtraction instruction was at address 00123456. Our codecave script might look something like this:
Code: |
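alloc(our_codecave,1024) //one kilobyte is enough space, it’s overdoing it, but oh well
label(return)
00123456: //the address of our subtracting instruction, as I mentioned earlier
jmp our_codecave //replace the original instruction with a jump to the codecave
return: //execution comes back here
our_codecave: //let’s define it
//the four codecave instructions from above go here
jmp return //go back to the original code |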
(Just note, that this would not be right, because the sub instruction is only 4 bytes long, but a jmp is 5–check out my other tutorial for information on that.)
Now, I stated that I would explain the return stuff. The label helps Cheat Engine know where to bring your code back. It would be a huge pain to calculate exactly where everything would be, and Cheat Engine gives you this handy little feature so that you don’t have to. We declare the label with label(return) and place it at “return:”, so Cheat Engine records that specific address, and when we say “jmp return”, it substitutes the address for “return”. An example?
Where we label return, it would be 0012345B (00123456 + 5), so Cheat Engine takes this, and goes to the “jmp return”, and instead makes it “jmp 0012345b”.
And there you have it, a (mostly) working script demonstrating a codecave.
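If you want to check the address arithmetic by hand, here is a minimal Python sketch (not part of Cheat Engine) of how a 5-byte near jmp is encoded and where “return” ends up. The codecave address 00980000 is only an assumed example of what alloc might hand back, and the helper name encode_jmp is made up for illustration.

```python
import struct

JMP_LEN = 5  # E9 opcode byte + 4-byte relative offset


def encode_jmp(src: int, dst: int) -> bytes:
    """Encode a near `jmp dst` placed at address `src` (32-bit x86)."""
    # The offset is measured from the instruction that follows the jmp.
    rel = (dst - (src + JMP_LEN)) & 0xFFFFFFFF
    return b"\xE9" + struct.pack("<I", rel)


hook = 0x00123456            # address of the sub instruction (from the text)
cave = 0x00980000            # assumed address returned by alloc()
return_addr = hook + JMP_LEN

print(hex(return_addr))               # 0x12345b
print(encode_jmp(hook, cave).hex())   # e9a5cb8500
```

The same function gives the bytes for the “jmp return” at the end of the codecave if you pass the address of that jmp as src and return_addr as dst.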
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Section 2: LoadBinary()
What is LoadBinary? To understand that, one would have to look at the code of Cheat Engine:
Code: |
begin
  binaryfile:=tmemorystream.Create;
  try
    binaryfile.LoadFromFile(loadbinary[i].filename);
    ok2:=writeprocessmemory(processhandle,pointer(testdword),binaryfile.Memory,binaryfile.Size,bw);
  finally
    binaryfile.free;
  end;
end; |
(taken from here)
So basically, this is just taking a file with a bunch of bytes (assembly instructions) and sticking them into the process, at the specified address.
Where is this used?
LoadBinary can be used to trick the game into thinking that its memory is not being modified. For example, the MapleStory CRC script from a while back loaded the original bytes of MapleStory into a separate section of memory (LoadBinary); then, when the game checked its memory, a codecave (see previous section) pointed the check at that clean copy, making the game think that nothing was wrong.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Section 3: A deeper look at flags
First, let’s run through a quick use of all of the general flags.
The Carry Flag:
This flag is set (value = 1) when an unsigned overflow occurs. What is that? An unsigned overflow would be something along the lines of 255+5, or 200 + 60 (They both go over the value 255, which is the maximum value of a byte). The flag is cleared (value = 0) when there is no overflow.
The Zero Flag:
This flag is set when the result of an operation is zero (I’ll talk about the use of this in compare operations (cmp) later). It is cleared when the result is non-zero.
The Sign Flag:
This flag is set when the result of an operation is negative (for example, 1-2), that is, when the highest bit of the result is 1. The flag is cleared when the result is zero or positive.
The Overflow Flag:
This flag is similar to the carry flag, except that it is set when a signed overflow occurs (for a byte, the result falls outside the range -128 to 127).
The Parity Flag:
This one is a little confusing (or at least, it was for me). What you have to remember, though, is binary. This flag is set when the number of 1 bits in the lowest byte of the result is even. For example, if you had the result “4”, its binary equivalent would be 100. How many 1 bits does it have? One, which is odd, so the Parity Flag is cleared. The same goes for 50 (110010): three 1 bits, so the Parity Flag is cleared. However, if you had 51 as your result (110011), that is four 1 bits, so the Parity Flag would be set.
The Adjust Flag:
This flag is set when a carry or borrow occurs out of the bottom 4 bits of a result (it is used for things such as BCD arithmetic).
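To try the definitions above on numbers of your own, here is a rough Python model of what an 8-bit add does to the six status flags. It is only a sketch for checking your understanding (the function name add8_flags is made up), not how the processor or Cheat Engine actually computes them.

```python
def add8_flags(a: int, b: int) -> dict:
    """Add two 8-bit values; return the 8-bit result and the status flags."""
    a &= 0xFF
    b &= 0xFF
    full = a + b
    res = full & 0xFF
    return {
        "result": res,
        "CF": int(full > 0xFF),                          # unsigned overflow
        "ZF": int(res == 0),                             # result is zero
        "SF": res >> 7,                                  # top bit of the result
        "OF": int(((a ^ res) & (b ^ res) & 0x80) != 0),  # signed overflow
        "PF": int(bin(res).count("1") % 2 == 0),         # even number of 1 bits
        "AF": int((a & 0xF) + (b & 0xF) > 0xF),          # carry out of the low 4 bits
    }


print(add8_flags(255, 5))    # CF and AF set, result wraps around to 4
print(add8_flags(100, 100))  # SF and OF set: 200 does not fit in a signed byte
```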
~~~~~
Now, I’m sure you all are familiar with something along the lines of “TICK ZF!!!”
Ok, so you can use CE to set the zero flag, but what if you want to do it in a script?
Here’s a little ghetto way of doing it, by messing with the way ZF works:
Code: |
push eax
mov eax,3
cmp eax,3
pop eax |
Yeah, it’s four lines long, but do you have a better way?
Now, what about the other flags, you ask?
You should be able to figure out how to set others, just remember to reset the registers when you’re done (push/pop). 😉
~~~~~~~~~~~~
I mentioned that I would explain how Zero Flag affects the compare instructions, and I suppose I’ll do that now.
Now, first of all, you should know how the compare instruction works. This instruction acts like a SUB instruction. For example, if you were comparing 2 and 3
Code: |
cmp 2,3 |
(note that you can’t actually do this, but in theory…)
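CF, PF, AF, and SF would be set. Let’s look at that in depth (going back to our definitions here)
[CF] 2 is smaller than 3 as an unsigned value, so the subtraction needs a borrow.
[PF] This doesn’t seem like it matters much, but sure, whatever. The low byte of the result is FF, which has an even number of 1 bits (eight).
[AF] A borrow out of the bottom 4 bits, eh?
[SF] Ahh, so the result is negative? Wait, 2-3 = -1, which happens to be a negative value, right?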
(I’m using Cheat Engine to debug this, I discuss this later on).
Let’s check out another one.
Code: |
cmp 4,3 |
(once again, this cannot actually happen, we’re discussing theory)
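[PF] One 1 bit, which is odd, so this one is cleared.
Let’s think about that, though: 4-3 = 1, right? 1 in binary is simply “1”, right? The result isn’t zero, isn’t negative, and needed no borrow, so none of the status flags end up set.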
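If you don’t have a debugger handy, the two comparisons can be checked with a small Python model of cmp (a subtraction that only keeps the flags). This is a sketch under the usual x86 flag rules; the names cmp_flags and set_flags are made up for the example.

```python
def cmp_flags(a: int, b: int, bits: int = 32) -> dict:
    """Model `cmp a,b`: compute a-b, throw the result away, keep the flags."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    res = (a - b) & mask
    return {
        "CF": int(a < b),                                 # unsigned borrow
        "ZF": int(res == 0),
        "SF": int(bool(res & sign)),
        "OF": int(bool((a ^ b) & (a ^ res) & sign)),      # signed overflow
        "PF": int(bin(res & 0xFF).count("1") % 2 == 0),   # low byte only
        "AF": int((a & 0xF) < (b & 0xF)),                 # borrow out of the low 4 bits
    }


def set_flags(a: int, b: int) -> list:
    """Names of the flags that end up set after `cmp a,b`."""
    return sorted(name for name, value in cmp_flags(a, b).items() if value)


print(set_flags(2, 3))  # ['AF', 'CF', 'PF', 'SF']
print(set_flags(4, 3))  # []
print(set_flags(0, 0))  # ['PF', 'ZF']
```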
~~~~~~~~~~~~
If you think about this, you can really screw up some games. Let’s say that the game has a compare to check if you’re dead:
Code: |
cmp [playeraddress],0 |
If you take that and make a codecave:
Code: |
cmp [playeraddress],0
push eax
mov eax,2
cmp eax,1
pop eax
jmp whatever |
You’ve now cleared the zero flag. Why does that matter?
Let’s say that your health is actually 0:
Code: |
cmp 0,0 |
0 - 0 = 0, right? That would set the Zero Flag, telling the game that you’re dead. However, if you clear the Zero Flag before the game’s conditional jump (je/jne) looks at it, the game sees “not equal” and never finds out that you’re dead.
~~~~~~~
If you want to learn more about which instructions affect which flags, look at Appendix A, of Volume 1 of the Intel Reference Manuals in my credits.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Section 4: A deeper look at registers
First, let’s go ahead and define the 32 bit registers (the ones most commonly used):
Volume 1 of Intel Reference Manuals wrote: |
EAX - Accumulator for operands and results data.
EBX - Pointer to data in the DS segment.
ECX - Counter for string and loop operations.
EDX - I/O pointer.
ESI - Pointer to data in the segment pointed to by the DS register; source pointer for string operations.
EDI - Pointer to data (or destination) in the segment pointed to by the ES register; destination pointer for string operations.
ESP - Stack pointer (in the SS segment).
EBP - Pointer to data on the stack (in the SS segment). |
That should explain that. Why isn’t EIP included? I’m going to explain that later.
Now, all of the 8 registers explained above have registers which contain the low 16 bits. To get this, all you would do is remove the “E” from the name. For example, to access the low 16 bits of the EBP register, you would just use “BP”; for the low bits of EDX, you would simply use “DX”.
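In terms of plain numbers, the 16-bit name is just a window onto the bottom half of the 32-bit register. A tiny Python sketch of that relationship (the helper names low16 and write_low16 are made up for illustration):

```python
def low16(reg32: int) -> int:
    """What you read through the 16-bit name (e.g. DX for EDX)."""
    return reg32 & 0xFFFF


def write_low16(reg32: int, value16: int) -> int:
    """Writing the 16-bit name replaces the low half and keeps the high half."""
    return (reg32 & 0xFFFF0000) | (value16 & 0xFFFF)


edx = 0x12345678
print(hex(low16(edx)))                # 0x5678
print(hex(write_low16(edx, 0xBEEF)))  # 0x1234beef
```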
~~~~~~~~~~~
SPECIAL REGISTERS:
There are two special registers, and neither can be modified directly.
What are these two registers? The EIP register (as I mentioned above), and the EFLAGS register.
Let’s start off with a definition of each from the manual:
Volume 1 wrote: |
EFLAGS Register
The 32-bit EFLAGS register contains a group of status flags, a control flag, and a group of system flags…
…
INSTRUCTION POINTER
The instruction pointer (EIP) register contains the offset in the current code segment for the next instruction to be executed. |
Let’s elaborate:
First, the EFLAGS register:
This register (as all 32-bit registers) contains 32 bits. What does this mean exactly?
01101010101001010101010100101010
That is 32 bits, or binary digits. Remember that a binary digit can be either one (1) or zero (0).
Most of these bits on the EFLAGS register represent a certain flag. Here’s a table representing them.
Code: |
--------------------------------------------------
| BIT     | DESCRIPTION               | TYPE |
| NUMBER  |                           |      |
|---------|---------------------------|------|
| 0       | CARRY FLAG                | ST   |
| 1       | RESERVED                  | RE   |
| 2       | PARITY FLAG               | ST   |
| 3       | RESERVED                  | RE   |
| 4       | AUXILIARY FLAG            | ST   |
| 5       | RESERVED                  | RE   |
| 6       | ZERO FLAG                 | ST   |
| 7       | SIGN FLAG                 | ST   |
| 8       | TRAP FLAG                 | SY   |
| 9       | INTERRUPT FLAG            | SY   |
| 10      | DIRECTION FLAG            | CO   |
| 11      | OVERFLOW FLAG             | ST   |
| 12-13   | I/O PRIVILEGE LEVEL       | SY   |
| 14      | NESTED TASK               | SY   |
| 15      | RESERVED                  | RE   |
| 16      | RESUME FLAG               | SY   |
| 17      | VIRTUAL-8086 MODE         | SY   |
| 18      | ALIGNMENT CHECK           | SY   |
| 19      | VIRTUAL INTERRUPT FLAG    | SY   |
| 20      | VIRTUAL INTERRUPT PENDING | SY   |
| 21      | ID FLAG                   | SY   |
| 22-31   | RESERVED                  | RE   |
--------------------------------------------------
KEY:
ST - Status Flag
CO - Control Flag
SY - System Flag
RE - Reserved |
You can use this information when, for example, you want to set or flip a flag in a script.
If you wanted to set the Zero Flag (bit 6), you could create a codecave with the following code:
Code: |
codecave:
//original instruction here, so you don’t mess things up
push eax //save eax, so we don’t screw it up
pushf //push EFLAGS onto the stack
pop eax //pop EFLAGS off the stack, into eax
//eax now contains the EFLAGS register
or eax,40 //set bit 6 of eax (the Zero Flag; use xor instead of or to flip it)
push eax
nop //just to help check, for debugging purposes (I’ll discuss this later)
popf //pop the modified value back into EFLAGS
pop eax //restore eax
jmp return //go back to wherever you left |
Yeah, you could do it the little easy way of screwing with the cmp instruction and what-not, but it’s always handy to know more than one method. =P
Plus, this makes you seem smarter. 😉
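The only fiddly part of the pushf/popf trick is working out the right mask for the bit you want. A short Python sketch of that arithmetic, using the bit numbers of the status flags from the table; the starting EFLAGS value 0x202 is just an assumed example, and the helper names are made up.

```python
FLAG_BITS = {"CF": 0, "PF": 2, "AF": 4, "ZF": 6, "SF": 7, "OF": 11}


def mask(flag: str) -> int:
    """Bit mask for a flag, e.g. bit 6 -> 0x40."""
    return 1 << FLAG_BITS[flag]


def set_flag(eflags: int, flag: str) -> int:    # what `or` does
    return eflags | mask(flag)


def clear_flag(eflags: int, flag: str) -> int:  # what `and` with the inverted mask does
    return eflags & ~mask(flag) & 0xFFFFFFFF


def flip_flag(eflags: int, flag: str) -> int:   # what `xor` does
    return eflags ^ mask(flag)


eflags = 0x00000202  # assumed starting value for the example
print(hex(mask("ZF")))              # 0x40
print(hex(set_flag(eflags, "ZF")))  # 0x242
```

So “or eax,40” sets the Zero Flag, “and eax,ffffffbf” would clear it, and “xor eax,40” would flip it.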
~~~~~~~~~~~~~~
Now, let’s elaborate on the EIP register. First off, THE EIP REGISTER CANNOT BE MODIFIED DIRECTLY. (Read the next section to find out how to mess with it). Let’s go back to that definition again:
Volume 1 wrote: |
INSTRUCTION POINTER
The instruction pointer (EIP) register contains the offset in the current code segment for the next instruction to be executed. |
Little confusing, right?
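Basically, memory is addressed through segments, and each segment has a base address. The EIP register holds the offset of the next instruction to be executed, measured from the base address of the current code segment. This lets the processor know where to go next. (In a normal 32-bit Windows program the code segment’s base is 0, so EIP is simply the address of the next instruction.)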
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Section 5: Abusing the Call instruction to modify EIP
First of all, you have to remember that EIP can’t be modified directly.
However, a call instruction pushes the address of the instruction after it (the value EIP will have on return) onto the stack before jumping. Now, if you think about this, you could actually do something along the lines of
Code: |
label(lolJump)
codecave:
call lolJump //pushes the address of lolJump (the next instruction) onto the stack
lolJump:
pop eax //eax = &“pop eax” (& = C++ for “address of”)
mov [eax-5],90909090 //the call is 5 bytes long, so eax-5 is the start of the call
jmp return |
And that would write 90 90 90 90 (four nops) over the first four bytes of the call statement at codecave. With this, you can actually write self-modifying code! (Note: This was partially explained in Volume 1, but someone actually wrote a tutorial on it, and I can’t seem to find it, or who it’s by. If someone could notify me of who it was, so I could give appropriate credits, I would greatly appreciate it.)
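The whole trick rests on one fact: call pushes the address of the instruction that follows it, and pop hands that address to you. A tiny Python model of just that step, assuming a 5-byte near call (E8 plus a 4-byte offset) and a made-up codecave address:

```python
CALL_LEN = 5  # E8 opcode byte + 4-byte relative offset


def call_then_pop(call_addr: int) -> int:
    """Model `call next` immediately followed by `pop eax` at `next`."""
    stack = []
    stack.append(call_addr + CALL_LEN)  # call pushes the address of the next instruction
    eax = stack.pop()                   # pop eax takes it back off the stack
    return eax


codecave = 0x00980000  # assumed address, for illustration only
eax = call_then_pop(codecave)
print(hex(eax))             # 0x980005, the address of the pop itself
print(hex(eax - CALL_LEN))  # 0x980000, the start of the call
```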
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Section 6: Define Byte/Word/Dword
First off, just so we can clear things up, here’s a small table giving you the opcode mnemonics of the three things:
Code: |
define byte   db
define word   dw
define dword  dd |
These are really simple to understand; all they really do is write raw bytes at the current address. For example, db 90 turns the byte at that address into a “nop” instruction, because “90” is the byte for a nop.
You could also do
Code: |
db 83 7d 0c 00 |
which would be translated into
Code: |
cmp dword ptr [ebp+0c],00 |
Now, note how db has a space between each 2 digits (each byte). dw works on words (2 bytes each), but x86 stores the low byte of a value first, so the two bytes of each word have to be written swapped:
Code: |
dw 7d83 000c |
Same bytes, but a different directive; note how dw has a space between each 4 digits (each word). And finally:
Code: |
dd 000c7d83 |
Same thing, once again, with all four bytes reversed.
You can use these to help shorten your scripts (though it’s recommended to include comments in your script to help people understand what you’re doing, because
Code: |
cmp dword ptr [ebp+0c],00 |
is much more understandable than
Code: |
dd 000c7d83 |
)
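Byte order is easy to get wrong with dw and dd, so here is a quick Python check, assuming that dw and dd store their values little-endian (low byte first), the way x86 stores every multi-byte value. The helper functions db, dw, and dd below only mimic the directives; they are not Cheat Engine code.

```python
import struct

target = bytes.fromhex("837d0c00")  # cmp dword ptr [ebp+0c],00


def db(*values: int) -> bytes:
    return bytes(values)


def dw(*values: int) -> bytes:
    return b"".join(struct.pack("<H", v) for v in values)  # 16-bit, little-endian


def dd(*values: int) -> bytes:
    return b"".join(struct.pack("<I", v) for v in values)  # 32-bit, little-endian


print(db(0x83, 0x7D, 0x0C, 0x00) == target)  # True
print(dw(0x7D83, 0x000C) == target)          # True
print(dd(0x000C7D83) == target)              # True
print(dw(0x837D, 0x0C00).hex())              # 7d83000c, not the bytes we wanted
```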
Here’s an example…
Say you want to change the opcode at 0xDEADBEEF to
Code: |
xor eax,eax |
rather than
Code: |
xor eax,ebx |
You might make a script:
Code: |
[ENABLE]
DEADBEEF:
xor eax,eax
[DISABLE]
DEADBEEF:
xor eax,ebx |
or… you could do the following (which also might help for AOB searching, I suppose)
Code: |
[ENABLE]
DEADBEEF:
db 31 c0 //xor eax,eax (one valid encoding)
[DISABLE]
DEADBEEF:
db 31 d8 //xor eax,ebx (one valid encoding)|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Section 7: Using negative values in Assembly
To create a negative value in Assembly, all you do is flip the bits and add 1.
Let’s go through a couple examples:
What is -1 in assembly?
First, we get the bits of 1:
Code: |
00000001 |
Next, flip them:
Code: |
11111110 |
add one:
Code: |
11111111 |
and change that back to hex, you get
Code: |
FF |
So the answer to the question “What is -1 in assembly?” is 0xFF (as a single byte; the same steps on 32 bits give 0xFFFFFFFF).
~~
What about -5?
Take the bits of 5:
Code: |
00000101 |
Flip them
Code: |
11111010 |
add one
Code: |
11111011 |
convert to hex
Code: |
FB |
And there you go, the answer is 0xFB.
~~
Let’s do one last example:
If you wanted to move the value of -10 into al, what would your instruction be?
First we find the value of -10d (-10 in decimal) in assembly…
(Try to do this one yourself before reading on.)
The answer is
F6
So your instruction would be
mov al,f6
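The flip-and-add-one recipe is short enough to check in a couple of lines of Python. This is just a sketch of the arithmetic; the function name neg is made up, and the bits parameter picks the operand size.

```python
def neg(value: int, bits: int = 8) -> int:
    """Two's complement negation: flip every bit, add 1, keep `bits` bits."""
    mask = (1 << bits) - 1
    return ((value ^ mask) + 1) & mask


for n in (1, 5, 10):
    print(n, format(neg(n), "02X"))   # 1 FF, 5 FB, 10 F6

print(format(neg(1, bits=32), "08X"))  # FFFFFFFF
```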
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Section 8: The Stack
The stack is a very simple concept. Basically, it’s like a pile of values that can be accessed by the esp register. First off, I’ll discuss pushing/popping, then I’ll discuss using the esp register to access the stack. Pushing/Popping help you first understand the stack, then using esp is a bit more advanced.
If you aren’t using esp directly, then you can do one of two things to mess with the stack: you can push something onto it, or pop something off of it. Push takes a value and puts it on top of the stack; pop takes the value on top of the stack and puts it into a register or memory location.
For example, let’s say that this is our stack (top of the stack first):
Code: |
1 2 3 4 5 6 7 8 9 |
If we did the following command
Code: |
push 0 |
then our stack would look like this:
Code: |
0 1 2 3 4 5 6 7 8 9 |
(Going from our original stack)
If we did the following command
Code: |
pop eax |
our stack would look like this:
Code: |
2 3 4 5 6 7 8 9 |
and eax would contain the value “1”
~~~
Also, a note on this, if you did
Code: |
push eax |
the top of the stack would not be “eax” itself, it would be a copy of whatever value was in eax at that moment (changing eax afterwards does not change what is on the stack).
~~~
Now, for using esp to access the stack:
ESP is the register that points to the top of the stack, so if we had the stack…
Code: |
1 2 3 4 5 |
Then we would have the following equations:
Code: |
[esp+ 0] = 1
[esp+ 4] = 2
[esp+ 8] = 3
[esp+12] = 4
[esp+16] = 5 |
It’s not too complicated, but it can be quite tricky when trying to access the parts of the stack. Just FYI, we increment by 4 each time, because each value on the stack is 4 bytes long. (The offsets above are in decimal; Cheat Engine reads numbers in a script as hex, so there you would write [esp+c] and [esp+10] for the last two.)
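To see how push, pop, and the esp offsets fit together, here is a small Python model of a 32-bit stack. It is only a sketch: the starting value of esp is made up, and memory is modelled as a dictionary of 4-byte slots.

```python
class Stack:
    """Toy model of a 32-bit x86 stack; esp points at the top value."""

    def __init__(self, values, esp=0x0012FF00):  # assumed starting esp
        self.esp = esp
        self.mem = {esp + 4 * i: v for i, v in enumerate(values)}

    def push(self, value):
        self.esp -= 4                 # the stack grows towards lower addresses
        self.mem[self.esp] = value

    def pop(self):
        value = self.mem[self.esp]
        self.esp += 4
        return value

    def at(self, offset):
        """Read [esp+offset]."""
        return self.mem[self.esp + offset]


s = Stack([1, 2, 3, 4, 5])
print(s.at(0), s.at(4), s.at(16))  # 1 2 5
s.push(0)
print(s.at(0), s.at(4))            # 0 1
```

Notice that after the push, the value that used to be at [esp+0] is now at [esp+4]. Every push moves everything 4 bytes further away from esp, which is worth remembering whenever a codecave saves a register before touching a stack variable.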
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Section 9: Using Cheat Engine to debug assembly
First, you need to find an application to debug, and of course have Cheat Engine installed.
I’m going to be using Minesweeper (just for lack of a simpler program that isn’t too simple =P).
Now, before we get into any actual debugging, I’m going to explain to you the theory of it. So you’ve got this debugger (Cheat Engine) that has the ability to watch all of the registers, flags, and segments, and to step through code one instruction at a time. The way this works is by setting a breakpoint on a certain address; when the program reaches that address, control is handed over to the debugger. The debugger can then show you all of the registers and what-not, and step through the code.
What is stepping through code? It pretty much explains itself; all you’re doing is going through the addresses one at a time, so you can completely understand what the code is doing to the program.
Now, so you can watch it in action, take the following code and inject it into Minesweeper:
Code: |
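alloc(codecave,1024)
label(return)
01002ff5:
jmp codecave //redirect execution into the codecave
return:
codecave:
push eax //save eax, so the pop below has something to restore
mov eax,4
cmp eax,3
cmp eax,4
pop eax
jmp return |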
Cheat Engine will pop up a message box saying
Code: |
The code injection was successfull
codecave=x |
Where x is a number (mine is usually 00980000, if it’s the first thing injected).
Click ok, and close out of the window. Then browse to the address x (yes, you had to remember it), and you should see your assembly.
What I have there are just some simple instructions that you should know the results of, just so you can simply watch.
Select the top address (x), and click on Debug -> Toggle Breakpoint. Cheat Engine will probably ask you about attaching itself to the process as a debugger, blah blah, just say yes. Address x should turn green, and now go back into Minesweeper and click a tile (to activate that address).
Going back to our theory, let’s look at how Minesweeper was doing:
Code: |
user clicks
blah blah, time increasing starts
increase time address (01002ff5) says jmp x
goto x
Cheat Engine has breakpoint at x
STOP |
and now that’s exactly what you have. Minesweeper is frozen (if you try to close it, Windows will pop up telling you that it’s Not Responding and some bull shit).
Notice back in Cheat Engine, that some labels have turned red. Those are the things that have changed since the last thing that you’ve seen (nothing).
Next, click Debug -> Step (F7), and just keep doing that. Watch how the labels change to fit what’s going on in the code.
Can you truthfully tell me that that is not amazing? 😉
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Credits:
Intel – Creating the instruction set reference pdfs:
Dark Byte – Creating Cheat Engine, along with Auto Assembly, and helping me out with LoadBinary (PM, though there is the manual, that I forgot about)
x0r – For helping me understand the CRC bypass for MapleStory (forum post here).
This website for helping me out with flags.
Wikipedia – for clearing up something about the parity flag here