The most important thing about ASM for most people, or about any programming language for that matter, is not to let the syntax get to you. It's cryptic, disorderly, ugly, and often very hard to read, but if you understood most of the ideas in the previous section of this little tutorial pretty well, the underlying principles behind the code are simple.
The Real Deal
Believe it or not, the two pseudo assembly instructions I introduced at the end of the last section are enough to make some pretty fancy "spells" for Starcraft unit actions to carry out. For example, just with the ability to "move" variables from memory to registers and back and the ability to "add" things to our variables, we can already make a spell that, say, increases the amount of hitpoints/shield points/energy that a unit has. We can even change the unit's color, kill count, and an array of other things. And it makes sense, since all the variables -- hit points, shields, kill count, even color -- are just numbers somewhere in memory. So, let's get started by translating our pseudo-code into real 80x86 ASM code. Here's "Move value to destination":
MOV location,value
MOV is an instruction that takes two "arguments," a location and a value (e.g., a number). It moves the value into the destination, just like our pseudo "Move" instruction. Here are some more concrete specifications:
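As a sketch of the five forms walked through next, written in NASM syntax (the particular registers, constants, and offsets are only illustrative assumptions):

```nasm
MOV EAX,00001000h        ; 1) constant into a register
MOV EAX,[00001000h]      ; 2) dword at memory offset 0x1000 into a register
MOV EAX,EBX              ; 3) register into a register
MOV [0x6800],dword 10h   ; 4) constant into memory (size stated explicitly)
MOV [0x6800],EAX         ; 5) register into memory
```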
The first valid use of the MOV instruction moves a constant value (e.g., 0010h, 1ABh, or 200h) into a register (e.g., AX, BX, EAX, or CL). The second valid use is to move a value from a certain place in memory (an offset) into a register. Remember that the [...] brackets mean we are referencing the number inside as an offset and not using the value itself. So MOV eax,00001000h moves the literal value 1000h into EAX (as a dword, since EAX holds dword values) and MOV eax,[00001000h] moves the dword value that starts at offset 0x1000 in memory into EAX.
The third example just moves a value from one register to another, like this: MOV eax,ebx (moving the value in ebx into eax). The fourth and fifth examples move a constant into a memory location, and move the value of a register into a memory location, respectively. Note that when we MOV constants into memory we have to specify what size our constant is: E.G., MOV [0x6800],dword 10h moves 00000010h to the 4-byte value starting at 0x6800 while MOV [0x6800],byte 10h moves 10h into the 1-byte value at 0x6800.
Think of it like this: memory is a long array of boxes, each of which holds exactly one byte. An offset is a reference to one particular box. When we move a dword into the boxes starting at a certain offset, we need to use the box at that offset and the 3 boxes after it, since a dword always takes up 4 bytes (no matter the value; if the number is shorter than 8 hex digits, zeroes are just tacked onto the front), and each box can only hold 1 of those bytes.
When do we have to explicitly state how large our constants are? A good rule of thumb is that if you're moving a constant into memory, you have to state how big it is, since memory doesn't know if 10h means the byte 10h or the word 0010h. If you want to always state the size of your constants, you can do that too. Here are some more concrete examples:
MOV [0x0010],byte 10h
MOV [0x6800],dword 010Ah
MOV [0x006837010],word 1h
Now, we can also use some fancy syntax that we developed in our pseudo-code in the last section. For example:
or, in more concrete examples:
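A sketch of the concrete forms that the next paragraph describes, assuming EBX already holds a memory offset:

```nasm
MOV EAX,[EBX]        ; dword at the offset held in EBX -> EAX
MOV EAX,[EBX+10h]    ; dword at the offset EBX+10h -> EAX
MOV [EBX+10h],EAX    ; dword in EAX -> memory at offset EBX+10h
```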
Here we are assuming that EBX holds the value of a memory offset we want to go to. Like our pseudo example, the first of these takes the value inside EBX, uses it as an actual reference to a memory offset, goes to that memory offset, grabs the dword there, and places it in EAX. The second does the same, but adds 10h to the value inside EBX before going to that particular offset (so it ends up grabbing a value 16t bytes past the one the first call grabs). The last example just moves the value inside EAX (a dword, remember) into the memory location EBX+10h. You can even do:
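In NASM syntax the combined form looks like this (a sketch; EBX and ECX follow the description, while EAX as the destination is an assumption):

```nasm
MOV EAX,[EBX+ECX]    ; offset = EBX + ECX; the dword found there goes into EAX
```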
This adds the values inside EBX and ECX and uses the sum as the offset to grab the value from. Nothing new, just different syntax.
A final note: You can NOT move values from memory into memory. E.G., this would be illegal: MOV [0x0010],[0x0020]. You must first move the value into a register and then move it from the register to the second memory offset. Most other calls, if they resemble one above, are legal.
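A sketch of that two-step workaround, using EAX as the go-between (any free register would do; picking EAX is an assumption):

```nasm
; Illegal: MOV [0x0010],[0x0020]
MOV EAX,[0x0020]     ; step 1: memory -> register (a dword, since EAX is a dword register)
MOV [0x0010],EAX     ; step 2: register -> memory
```

Because the copy goes through EAX, it moves a dword; going through a word or byte register such as AX or CL would move a word or a byte instead.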
The other instruction we used in our pseudo-code was "Add" which added two values together and left the result in the second location we specified. E.G., "Add 10h to EAX" would first add 10h and the value inside EAX together, and then leave the resulting sum inside EAX. The actual ASM code is similar:
ADD location,value
ADD takes the value inside the location, the first argument, and the value of the second argument, adds them together, and leaves the result in the location (first argument). For example:
ADD EAX,10h
ADD EAX,EBX
ADD EAX,[0x1000]
ADD [0x1000],EAX
ADD [EBX+10h],EAX
ADD [0x1000], dword 10h
The first example adds 10h to the value inside EAX and leaves the sum in EAX. The second adds the values of EAX and EBX together and leaves the sum in EAX. The third adds the dword value located at memory offset 0x1000 to EAX and leaves the sum in EAX. The fourth adds the dword value of EAX to the dword at memory offset 0x1000 and leaves the result at offset 0x1000. (These last two add a dword value because EAX is a dword register, and thus, implicitly, the other argument is also read as a dword, since you must have equally sized arguments to add them together.) The fifth example is just like the fourth, but this time we've used the offset that is constructed by adding 10h to the value inside EBX, just like in the MOV offset reference combinations. In general, you can always reference offsets like this. The last example adds the dword value 00000010h to the dword value at memory offset 0x1000 and leaves the result at that offset. Again, in this last example we had to explicitly declare that 10h was a dword value because it could just as easily have been a byte or a word. (And the value starting at 0x1000 is read as a dword by implication.)
Also, just like the MOV instruction, it is illegal to add values if they are both from memory. You must move the second value to a register first and then add it to the first one.
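A sketch of that workaround; the offset 0x2000 for the second value and the use of EAX are both made up for illustration:

```nasm
; Illegal: ADD [0x1000],[0x2000]
MOV EAX,[0x2000]     ; fetch the second value into a register
ADD [0x1000],EAX     ; add it to the dword at offset 0x1000; the sum stays at 0x1000
```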
The First "Program"
Now that was pretty in-depth coverage of those two instructions, but they really are very simple, and so is most of ASM. But before we move on, let's make a real "program" you can actually assemble and put into SG to get a working action. :) Here it is:
PUSH EBX
MOV EBX,[0x68d310]
ADD [EBX+8fh],byte 10h
POP EBX
RET
O.K. There are a few instructions in here that I haven't touched on yet (PUSH, POP, and RET), but we'll just ignore those for now. Let's look at what we do know: MOV and ADD. First I should tell you that 0x68d310 holds the offset (a dword) to all the data of the currently selected unit (remember, that means this offset just holds the value of another offset, not the actual data). So we move the value (which is an offset) from 0x68d310 into EBX. Got it? Now, if we went to [EBX] we would find all the unit data of the currently selected unit. But there's a lot of data there, and it isn't all that well organized (by our standards). If we go 8fh bytes past this offset (which is [EBX+8fh] for those of you keeping score) we will find a byte that holds the value of the current unit's kill score. Thus, we are referencing the byte value at this offset, [EBX+8fh], adding 10h to it, and then leaving the resulting sum at that offset. So what we've actually done is add 16t to the currently selected unit's kill score. And if you assembled that code above into binary, copied it into a new SG Action, linked a button to it, and then played the game and pressed the button, that is exactly what would happen.
Now let me briefly explain the three other instructions and then we'll actually assemble this with an assembler. PUSH register is an instruction that tells your CPU to take the current value inside that register and place it on the top of a place called the stack. What the stack is isn't that important (actually, it's just a reserved place in memory), but the reason we use it is. First I should explain POP. POP register is just the opposite of PUSH. It takes the value on top of the stack and puts it into the register. (And that value gets removed from the stack.) So, what we essentially did was move the value of EBX onto the stack when we started, and then move that same value back into EBX when we ended.
Why do we do this? Well, remember what Starcraft the program is doing right now. Starcraft is running along, executing code normally, and then you suddenly press a button. It has to stop executing the code it was in the middle of and go to the code you just wrote -- the "button code." But after it finishes executing the button code, it will return (hint: that's what RET is for :) to what it was doing before the button was pressed.
But wait a minute. Before we interrupted SC with our button press, it was executing code just as it executed our code. Just like in our code, it was moving stuff around registers and memory, adding stuff, etc. When we interrupted it, who's to say there wasn't already a value in EBX? In fact, there usually are values inside the registers when we interrupt the game by pressing a button. But what have we done in our code? We've moved a new value into EBX, thus destroying the value that was in EBX before our code started. When the game resumes normally, after our code completes, Starcraft expects that it will find its original value of EBX in EBX, but (if we didn't POP the stack) instead it finds our offset we moved in there. And thus, likely, it will error and crash. Thus, we PUSH the value of EBX onto the stack to save it, and POP it back out into EBX just before we return control to the regular game, so that all the registers are in the same state we found them in. Kind of like cleaning up after ourselves. :)
Here are some other sample calls to PUSH and POP:
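A few sample calls as a sketch; the registers are assumptions picked to show a dword and a word, and the POPs run in the reverse order of the PUSHes:

```nasm
PUSH EAX    ; put the dword in EAX on top of the stack
PUSH BX     ; put the word in BX on top of that
POP BX      ; the word comes off first, back into BX
POP EAX     ; then the dword, back into EAX
```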
All of these are legal, but each moves a different sized value onto the stack. You can also PUSH multiple values onto the stack (and thus POP multiple values out). Generally, you want to PUSH all the registers you plan to use before starting your code and then POP all of them when you finish. Note that the order you want to POP them is opposite the order in which you originally PUSHed them (like above). This is because the last value you PUSH will be the value on top of the stack, so when you POP the stack, that's the first value you're going to get out. Here's an analogy: You have 3 blocks in 3 containers, a red one, a blue one, and a green one. Each block matches the color of its container. Now, let's first PUSH the blue block onto our "stack", then PUSH the red one onto the stack (on top of the blue one now), and lastly PUSH the green one. Now if we POP our stack, we're going to get the block closest to the top, the green one. So the first block we POP goes in the container of the last block we PUSHed. The next block we POP from the stack will be the red one, and, similarly, the red one is the second to last one we PUSHed. And so on.
Now the last instruction: RET. As you've probably already guessed, this instruction "returns" the game back to the usual game code that it was running before it began executing your button code. For now, that's all you really need to know about it. Just remember that this should always be your last instruction to indicate that your action is finished.
Your First real SG Action
Now, let's assemble that program above and try it in Stargraft! First of all, you'll need an assembler to turn the text code above into bits and bytes that your computer can understand. I haven't done much research into this area, but the freeware one I used is NASM, which you can download from here: http://www.web-sites.co.uk/nasm/. (See the "Where is it?" link on that page.)
NASM is a command line program, so you need to run it in an MS-DOS Prompt (or use the "Run" option in your Windows Start menu, but that's not very convenient). Here's a quick tutorial on how to run it:
1) First install it somewhere. :) [All you really need from the ZIP is nasmw.exe, or it may be called just nasm.exe]
2) Write out the assembly code above into a plain text file with a program like Notepad. Here's the code again:
PUSH EBX
MOV EBX,[0x68d310]
ADD [EBX+8fh],byte 10h
POP EBX
RET
3) Above all this code, put the words: BITS 32
This tells the assembler to produce 32-bit code instead of 16-bit code (Windows programs like Starcraft vs. DOS programs).
4) Save your text file somewhere. For ease, I suggest placing it into the directory where you installed nasmw.exe. Be sure to give it an extension. Something like *.asm would be nice to denote that it's an assembly file, though you can use *.txt or whatever you want.
5) To run nasmw.exe in DOS, go to the directory where you installed it. Then type in:
C:....> nasmw <filename>
where C:....> is just the prompt (DON'T TYPE IT), and nasmw <filename> is what you type. Replace <filename> with the name of the assembly file you just made. E.G., if you named it mySgAction.asm, then you'd type:
C:....> nasmw mySgAction.asm
6) If all was good and well, you should just get another prompt back. If NASM gave you errors, then check your file's syntax and try again. Once you've got it to work, you should notice a new file in the directory where you installed NASM that has the same name as your assembly file, but without a file extension. E.G., my file would be called mySgAction without the *.asm extension. Open that file up with a hex editor (like Hex Workshop) and you will see your assembled code (it's the hex).
7) Copy all that hex you see in your assembled file. Open up Stargraft, make a new patch/open an old one, make a new Action, and then paste your hex in there (press Control-Shift-V). Now, just link a button to your new action, save, patch up Starcraft, and you're set to go! Try it! :)
When you actually use this action in the game, you may have to press the button and then deselect the unit and then reselect it before you see any results, because the status bar may not auto refresh. But in general, it should work. If Starcraft crashed, first check that everything you did in Stargraft was legal and A.O.K., and then check the syntax of your assembly file again. (Note that now when Windows crashes a program and gives you that illegal page fault thing-a-ma-gig, you can click on the "> details" button and actually make some sense of the gibberish it gives you -- those are the registers on your CPU and the values inside them at the time!) This shouldn't give you any trouble... if you really can't get it to work, post on our forum and I'll try to fix it for you. Oh, and make sure you're running 1.07 Broodwar for Windows. The memory offset we used to get the unit data is specific to that version.
Hm. That took a little longer than I expected. :) Guess I'll have to add one more chapter to this little tutorial, but now that you know all the fundamentals, it's just a matter of teaching you a couple more instructions so you can manipulate the values more. I'll also expose all the other values in memory that are known so you can mess with those as well (like hitpoints, shields, etc.). For now, if you want to take a quick break, you can try messing around with what you know already -- e.g., try adding 100t kills to a unit's kill count. Here's a little whiff of the next section: in addition to ADD, there is a very similar instruction called SUB that subtracts the second argument from the first and stores the difference in the location of the first argument (SUB location,value = location - value and put the result in location). Have fun!
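One possible take on that exercise, as a sketch that reuses the 1.07 offsets given earlier; 100t is 64h, and the commented-out SUB line is only a guess at how you might try the new instruction on the same byte:

```nasm
BITS 32

PUSH EBX                  ; save whatever Starcraft had in EBX
MOV EBX,[0x68d310]        ; EBX = offset of the selected unit's data
ADD [EBX+8fh],byte 64h    ; 64h = 100t, added to the kill-count byte
; SUB [EBX+8fh],byte 64h  ; use this instead of the ADD to take 100t kills away
POP EBX                   ; restore EBX
RET                       ; hand control back to the game
```

Keep in mind that the kill count here is a single byte, so the largest value it can hold is FFh (255t).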