Rendition Script Mechanics
When I first designed Rendition Script, I chose C# over any other language because of its garbage collection and MSIL opcode generation features. After programming most of the features, I ran into a roadblock: I could not free a generated program from memory once it was no longer needed. The result was a memory leak of about 1MB every time a script was compiled.
While writing the C# version of Rendition Script, I was becoming more familiar with C++ through my college courses and personal studies. As I gained a strong knowledge of pointers, I realized that I could generate the opcodes as raw bytes and store them behind a pointer to an array of 'chars'. I could then use C++'s inline assembly feature to point the CPU's instruction pointer register at my script's address, thereby executing my instructions. This was much faster and more efficient than the MSIL method C# uses, and it gave me complete control at the lowest level of programming. Another advantage of the C++ version is that it was much more portable.
Below is an outline of the compilation process. To demonstrate what each stage does, I will take a single instruction and show how its data structure changes from stage to stage. The instruction begins as basic text: “copy input, output”.
Stage 1: Text Compilation
The first stage of the Rendition Script compilation process is text compilation. At this point, the script exists only as basic text, which is inefficient to work with and of no direct use to the later stages. This stage parses the text and builds new data structures from it. When this stage is complete, each instruction is contained in a data structure defining the name of the instruction, the number of parameters, and the parameters themselves. This stage throws errors if there is an unknown instruction, an incorrect format, or any other error that can be determined by examining the basic text.
Instruction after this stage: Data structure – Name: “copy”
Number of parameters: 2
Parameters: “input, output”
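As a rough illustration of this stage, here is a minimal C++ sketch that parses one line of the form "name param1, param2" into a structure. The names ParsedInstruction, trim, and parseLine are hypothetical and are not taken from Rendition Script itself; the check for unknown instruction names is left out to keep the sketch short.

```cpp
#include <cctype>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical Stage 1 output: one instruction, with parameters still as text.
struct ParsedInstruction {
    std::string name;                     // e.g. "copy"
    std::vector<std::string> parameters;  // e.g. {"input", "output"}
};

// Strip leading and trailing whitespace from a token.
static std::string trim(const std::string& s) {
    std::size_t begin = 0, end = s.size();
    while (begin < end && std::isspace(static_cast<unsigned char>(s[begin]))) ++begin;
    while (end > begin && std::isspace(static_cast<unsigned char>(s[end - 1]))) --end;
    return s.substr(begin, end - begin);
}

// Parse a line of the form "name param1, param2, ...".
// Throws std::runtime_error when the line is malformed.
ParsedInstruction parseLine(const std::string& line) {
    const std::string text = trim(line);
    if (text.empty()) throw std::runtime_error("empty instruction");

    // The instruction name runs up to the first whitespace character.
    std::size_t nameEnd = 0;
    while (nameEnd < text.size() &&
           !std::isspace(static_cast<unsigned char>(text[nameEnd]))) {
        ++nameEnd;
    }

    ParsedInstruction result;
    result.name = text.substr(0, nameEnd);

    const std::string rest = trim(text.substr(nameEnd));
    if (rest.empty()) return result;  // instruction without parameters

    // Split the remainder on commas and trim each piece.
    std::size_t start = 0;
    while (true) {
        const std::size_t comma = rest.find(',', start);
        const std::size_t length =
            (comma == std::string::npos) ? std::string::npos : comma - start;
        const std::string token = trim(rest.substr(start, length));
        if (token.empty()) throw std::runtime_error("empty parameter in: " + line);
        result.parameters.push_back(token);
        if (comma == std::string::npos) break;
        start = comma + 1;
    }
    return result;
}
```

Calling parseLine("copy input, output") yields the name "copy" and the two parameters "input" and "output", while a line such as "copy input," is rejected with an exception.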
Stage 2: Encapsulation
This stage is more involved than the previous one. It makes sure that the parameters are valid for the instruction being called and throws errors when parameters are used incorrectly. For example, the copy instruction copies the first parameter into the address of the second parameter, so the second parameter can’t be a constant. When this stage is complete, each instruction’s parameters are no longer basic text, but data structures. Because the next stage turns these data structures into CPU opcodes, everything must be expressed in terms of memory addresses, including strings, which are referenced by the address where they are stored.
Instruction after this stage: Data Structure: Name: “copy”
Number of parameters: 2
Parameters:
1. Parameter Type: Input (Memory address)
Parameter Name: “input”
Parameter Data: 0x50000000 (4 byte address)
2. Parameter Type: Output (Memory address)
Parameter Name: “output”
Parameter Data: 0x50000000 (4 byte address)
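The check described for the copy instruction could be sketched in C++ roughly as follows. This is only an illustration under several assumptions: the ParamType, Parameter, Instruction, and SymbolTable types, the resolve and encapsulateCopy functions, and the rule that an all-digit token is a constant are all hypothetical, not Rendition Script's actual design.

```cpp
#include <cstdint>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical parameter kinds: a literal value or a memory address.
enum class ParamType { Constant, Address };

struct Parameter {
    ParamType type;
    std::string name;    // text as written in the script
    std::uint32_t data;  // constant value or 4-byte address
};

struct Instruction {
    std::string name;
    std::vector<Parameter> parameters;
};

// Assumed symbol table mapping variable names to 4-byte addresses.
using SymbolTable = std::map<std::string, std::uint32_t>;

// Turn one textual parameter into a typed Parameter.
Parameter resolve(const std::string& text, const SymbolTable& symbols) {
    // Assumption: a token made only of digits is a constant.
    if (!text.empty() && text.find_first_not_of("0123456789") == std::string::npos) {
        return {ParamType::Constant, text, static_cast<std::uint32_t>(std::stoul(text))};
    }
    const auto it = symbols.find(text);
    if (it == symbols.end()) throw std::runtime_error("unknown variable: " + text);
    return {ParamType::Address, text, it->second};
}

// Encapsulate a "copy" instruction, rejecting a constant destination.
Instruction encapsulateCopy(const std::vector<std::string>& params,
                            const SymbolTable& symbols) {
    if (params.size() != 2) throw std::runtime_error("copy expects 2 parameters");
    Instruction ins{"copy", {resolve(params[0], symbols), resolve(params[1], symbols)}};
    if (ins.parameters[1].type == ParamType::Constant) {
        throw std::runtime_error("copy: destination cannot be a constant");
    }
    return ins;
}
```

With a symbol table that knows "input" and "output", encapsulateCopy({"input", "output"}, symbols) returns two address parameters, while encapsulateCopy({"input", "5"}, symbols) throws because a constant cannot be written to.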
Stage 3: Opcode Compilation
Opcode compilation is the final stage of Rendition Script’s compilation process. It takes each instruction data structure and outputs a series of opcodes readable by x86 CPUs. The tricky part of this stage is that the opcodes emitted for one instruction must work seamlessly with the opcodes emitted for every other instruction.
Instruction after this stage: Opcodes: a1 c0 0b 32 00 a3 00 0c 32 00 c3 00
The result of the compilation may look like nonsense, but in fact, this is how every program looks when it is loaded into memory (only much longer). Here is an explanation of what the above opcodes mean:
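First byte (a1): Tells the computer to load the contents of the memory address that follows into the EAX register
Next four bytes (c0 0b 32 00): The address that contains the contents to be copied (source), stored lowest byte first, so it reads 0x00320bc0
Next byte (a3): Tells the computer to store the EAX register at the memory address that follows
Next four bytes (00 0c 32 00): The address where the contents will be copied to (destination), 0x00320c00
Last two bytes (c3 00): The 00 is never executed, because c3 first tells the computer to return to the instructions that it was executing before this Rendition Script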
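To make the byte layout concrete, here is a small C++ sketch that builds the same sequence. It reads the listed bytes as standard 32-bit x86 encodings (a1 loads EAX from a 4-byte address, a3 stores EAX to a 4-byte address, c3 is a near return) and takes the two addresses from the bytes above. The function names emit32, emitCopy, and emitReturn are hypothetical. The sketch only produces the bytes and does not execute them, and it does not emit the trailing 00.

```cpp
#include <cstdint>
#include <vector>

// Append a 32-bit value in little-endian byte order, as x86 expects.
static void emit32(std::vector<std::uint8_t>& code, std::uint32_t value) {
    for (int shift = 0; shift < 32; shift += 8) {
        code.push_back(static_cast<std::uint8_t>((value >> shift) & 0xFF));
    }
}

// Emit 32-bit x86 code that copies 4 bytes from `source` to `destination`.
void emitCopy(std::vector<std::uint8_t>& code,
              std::uint32_t source, std::uint32_t destination) {
    code.push_back(0xA1);  // mov eax, [source]
    emit32(code, source);
    code.push_back(0xA3);  // mov [destination], eax
    emit32(code, destination);
}

// Emit the return that hands control back to the caller.
void emitReturn(std::vector<std::uint8_t>& code) {
    code.push_back(0xC3);  // ret
}
```

Calling emitCopy(code, 0x00320bc0, 0x00320c00) followed by emitReturn(code) fills the vector with a1 c0 0b 32 00 a3 00 0c 32 00 c3. Because every copy is emitted as a self-contained load and store pair, several instructions can be appended one after another before the single return at the end.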