Lab 4: Assembler
When I first started this lab I was excited to dive into assembler, as it gives a granular understanding of all the steps involved in processes that are taken for granted in higher level languages. After reading through the basic instructions used in assembler, I began the task of writing a program that loops from 0 - 30, printing "Loop: X" with X being the current loop count.
I immediately realized how different assembler is from every other language I have coded in before. Instead of simply declaring variables, doing calculations however I want and dumping the results to the screen, it's important to line up all of the data I want before executing any command, and this lab really highlights that process. We know the end result is a string printed on every pass through the loop, so step 1 is getting a string to print. I created a .data section to hold both my string and the length of that string. This is important because, unlike in C, I can't put a placeholder for the iterator in my string and have it filled in on each pass of the loop. Instead I had to create a string "Loop: \n" that had enough white space for me to manually overwrite with the current loop count.
The next step was to determine what that count was. This too presents a problem: if the count is greater than 9 then I need to stuff 2 digits into my string in the proper positions. This meant that the first task to tackle was determining whether I had a single or double digit number. To do so I divided my counter by 10 to isolate each digit into quotient and remainder. This required me to line up my counter value and a constant of 10 in the division registers the architecture requires. If it was a single digit, the flow from there was straightforward: replace the right-most white space with the digit, done by storing the ASCII value of that digit at the address of the message plus the number of characters needed to reach that space.
Once the replacement was done, all that was left was to set up my registers with the message address, the message length and the type of system call I wanted to make, then invoke it with a syscall. Finally, to complete the simple path, I incremented my counter and compared it to a max value; if they didn't match I jumped to a label called loop at the start of the program.
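To make the single-digit path concrete, here is a minimal sketch of the same steps written in C rather than assembler. It is not the lab's actual code: the two-slot template layout, the ONES_SLOT offset and both function names are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Assumed layout: "Loop: " + two blank slots + '\n' (9 bytes in total). */
enum { ONES_SLOT = 7 };   /* offset of the right-most blank (hypothetical) */

/* Store the ASCII code of a single digit at message address + offset. */
void patch_single_digit(char *msg, int digit)
{
    assert(digit >= 0 && digit <= 9);
    msg[ONES_SLOT] = (char)('0' + digit);   /* '0' is 48 in ASCII */
}

/* Stand-in for loading the syscall number, file descriptor, message
   address and length into registers and then making the syscall. */
long print_line(const char *msg, size_t len)
{
    return (long)write(STDOUT_FILENO, msg, len);
}
```

Calling patch_single_digit on a buffer holding the template and then print_line with its length corresponds to one pass through the simple path. The assembler version does the same thing, except that every value has to be placed in a specific register first.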
The slightly more complex path was when I had both a quotient and a remainder to fill in. The best solution I could come up with was to create another label called pastten, which I jump to when the quotient is greater than 0. Under this label I simply convert both quotient and remainder to ASCII and then store both in my string. This also meant I had to create another label called print, so that for 0-9 I could unconditionally jump over the two-digit replacement and go right to printing.
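The branching described above can be sketched in C as well. This is only an illustration under assumptions: the template with two blank slots, the TENS_SLOT and ONES_SLOT offsets and the function name format_count are hypothetical, and the if/else stands in for the jumps to pastten and print.

```c
#include <assert.h>
#include <string.h>

#define TEMPLATE "Loop:   \n"   /* assumed: two blank slots before '\n' */
enum { TENS_SLOT = 6, ONES_SLOT = 7, LINE_LEN = 9 };

/* Build the line for one counter value in the range 0..99. */
void format_count(char line[LINE_LEN + 1], int count)
{
    assert(count >= 0 && count <= 99);
    memcpy(line, TEMPLATE, LINE_LEN + 1);   /* start from blank slots */

    int quotient  = count / 10;   /* tens digit */
    int remainder = count % 10;   /* ones digit */

    if (quotient > 0) {
        /* "pastten" path: convert both digits to ASCII and store them */
        line[TENS_SLOT] = (char)('0' + quotient);
        line[ONES_SLOT] = (char)('0' + remainder);
    } else {
        /* single-digit path: only the right-most slot is filled */
        line[ONES_SLOT] = (char)('0' + remainder);
    }
    /* "print" would follow here in the assembler version */
}
```

Copying the template on every call is a choice made for this sketch so that each line starts from blank slots. Since the counter only ever grows, overwriting the same buffer in place works too, because a tens digit never needs to be cleared again.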
The other challenge of coding in assembler is the architecture-specific differences in the language. x86_64 assembler and AArch64 assembler differ in a number of significant ways that make it easy to introduce slight mistakes when going from one to the other. For example, when doing division in AArch64 I name every register the operation should use, and it puts the quotient in the first register I give it and nothing else. If I want the remainder I then need a separate calculation: subtracting the quotient multiplied by the divisor from the counter value. In contrast, x86_64 calculates both and stores the quotient in rax and the remainder in rdx. This makes it easier to code as it eliminates a calculation, but it also requires me to load rax with the counter value and set rdx to 0 before invoking div for it to work properly. Thus they even out in terms of overall complexity. Another difference that gave me no small amount of headaches when rewriting my code from x86_64 to AArch64 is operand order. The x86_64 syntax I used (AT&T style, with % in front of register names) reads from left to right, so mov $0, %rdx moves 0 into rdx, while AArch64 reads from right to left, so mov x15, 0 moves 0 into x15. While this seems like a fairly small difference, when translating from one style to the other it was all too easy to leave one line the wrong way around and have the entire program break.
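The two division styles can be compared with a small C model. This is a sketch, not either architecture's real behaviour in full: the function and struct names are hypothetical, the udiv and msub mnemonics in the comments are the AArch64 instructions I would expect for these two steps, and the x86_64 model simply insists that rdx is 0 instead of handling the full 128-bit dividend that div really takes from rdx and rax together.

```c
#include <assert.h>
#include <stdint.h>

/* AArch64 style: the divide yields only a quotient, so the remainder
   is computed separately as n - q * d. */
uint64_t remainder_aarch64_style(uint64_t n, uint64_t d, uint64_t *quotient)
{
    assert(d != 0);
    uint64_t q = n / d;     /* udiv q, n, d    */
    *quotient = q;
    return n - q * d;       /* msub r, q, d, n */
}

/* x86_64 style: one div leaves the quotient in rax and the remainder
   in rdx, but rdx must be cleared first because it is the high half
   of the dividend. */
struct div_result { uint64_t rax; uint64_t rdx; };

struct div_result div_x86_style(uint64_t rax, uint64_t rdx, uint64_t divisor)
{
    assert(divisor != 0);
    assert(rdx == 0);       /* simplification made for this sketch */
    struct div_result out = { rax / divisor, rax % divisor };
    return out;
}
```

For a counter value of 27 and a divisor of 10, both versions give a quotient of 2 and a remainder of 7. The difference is only in how many steps it takes and which registers have to be prepared beforehand.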
Overall my experience with coding in assembler was a good one. I learned a lot about what higher level code actually boils down to, and I appreciated the level of control a lower level language gives, control that is hidden away in higher level ones. In the end, though, I will still stick to higher level languages, as I can see a more complex task written in assembler becoming a very messy exercise in register management.