Hand made linking

This post is a very simple explanation of what a linker does, with an example that demonstrates some basic concepts. I imagine linking as sewing: you take pieces of shaped fabric (the procedures) and join them together to form a dress (the whole program).

Developing any kind of software, including an operating system of any kind and complexity, can be reduced at its core to three main building blocks: assembling, linking and loading.

Assembling

Assembling is the process of producing machine code. In a very bare-bones way, it is possible to take the 8086 INTEL Manual at chapter 4, page 18, paragraph "Machine Instruction Encoding and Decoding", figure out which sequence of bytes encodes an instruction like MOV AX, CX, and finally use the edit command e of DEBUG.EXE to write this sequence directly into RAM. It works, but it is not really practical, even though I did exactly that in many places in the previous chapters. Instead, one can use a program to convert a human-meaningful sequence of characters such as MOV AX, CX into a CPU-executable sequence of bytes such as 0x 89 C8. The name for the human-meaningful sequence of characters is MNEMONIC. The name for the CPU-executable sequence of bytes is operation code, or OPCODE for short. The name of the process that converts MNEMONICS into OPCODES is assembling. This is the core of it, nothing more. We know this process already: I used DEBUG.EXE as the program that supports me with assembling. This happens every time I use the command a (which stands for assemble).
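To make the MNEMONIC to OPCODE conversion concrete, here is a minimal sketch in Python (not part of my DEBUG.EXE toolchain) that encodes only one instruction form: MOV between two 16-bit registers. It assumes the encoding "0x89 followed by a ModRM byte", where the ModRM byte packs the mode bits 11, the source register number and the destination register number. The function name encode_mov_reg16 and the table REG16 are made up for this illustration.

```python
# Register numbers used inside the ModRM byte for the 16-bit registers.
REG16 = {"AX": 0, "CX": 1, "DX": 2, "BX": 3, "SP": 4, "BP": 5, "SI": 6, "DI": 7}


def encode_mov_reg16(dest: str, src: str) -> bytes:
    """Encode 'MOV dest, src' for two 16-bit registers (form 0x89 /r)."""
    # ModRM layout: mode (2 bits) | source register (3 bits) | destination (3 bits).
    # Mode 0b11 means that both operands are registers.
    modrm = (0b11 << 6) | (REG16[src.upper()] << 3) | REG16[dest.upper()]
    return bytes([0x89, modrm])


print(encode_mov_reg16("AX", "CX").hex(" ").upper())  # prints 89 C8
```

The same sketch reproduces the first two bytes of OUT_NBL.BIN in the memory dump further down (89 D8, that is MOV AX, BX), which is a small cross-check that the bit layout is the one DEBUG.EXE uses.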

Loading

Loading is the process of preparing the software in RAM and passing control to it. The preparation can be a very complex sequence of operations with a lot of overhead. Each operating system expects the program to be laid out in RAM in a certain way before it receives control; as a consequence, the loader is specific to each operating system. The loader takes care of all the preparation steps that are necessary before the program can run. I created my own very basic loader in chapter III with the space shuttle. This loader searches the ROOT folder for a file named SOFTWARE.BIG and, if it finds it, loads it into RAM starting at absolute address 0x 7C 00 and passes control to it.

Linking

Linking, like loading and assembling, is a huge topic that I oversimplify in order to get to the very basics of it. At the very basics, programmers would like to reuse code that they have already written and tested as building blocks of programs which get more and more complex. To achieve this, one can place in memory all the building blocks (the service software) that are needed, take note of the entry point of each piece of service software, and link the address of that entry point to each call in the main program that needs it. As I said, this topic is huge and complex and I am saying nothing about how this operation is actually implemented. But regardless of the many possible ways one can take note of all the entry point addresses and substitute them into the calls, at the very basics this is all there is to it: take note of the addresses of all entry points and substitute them into the calls (and jumps).

We are already familiar with a linking technique and we use it all the time, even if we may not be aware of it: the interrupt. Your code can call an interrupt from any place in RAM. Your code never knows, and never needs to know, the address of the entry point of the service function it needs. Your code just calls the required interrupt. Upon execution of the interrupt instruction, the target address is read from a table in RAM that starts at address 0x 0 00 00, where each interrupt number occupies 4 bytes (CS:IP = 4 bytes), so the entry for interrupt number n starts at address 4 * n. So INT 04H gets the 4 bytes for CS:IP starting from address 0x 0 00 10. The BIOS manufacturer knows the exact location of each service routine inside the BIOS-ROM and loads the addresses into the "real mode interrupt vector table" (addresses from 0x 0 00 00 to 0x 0 03 FF) during bootstrap. So, during bootstrap, a very first linking takes place.
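The lookup in the interrupt vector table can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names ivt_entry_address and read_vector are mine, the memory is a plain bytearray standing in for the first 1 KiB of RAM, and the vector F000:1234 stored for INT 04H is a made-up value, not the content of any real BIOS. Inside each 4-byte entry, the offset (IP) comes first and the segment (CS) second, both in little-endian order.

```python
def ivt_entry_address(int_number: int) -> int:
    """Return the address of the 4-byte table entry for an interrupt number."""
    if not 0 <= int_number <= 0xFF:
        raise ValueError("interrupt number must be in the range 0x00..0xFF")
    return int_number * 4


def read_vector(memory, int_number: int):
    """Read the (CS, IP) pair stored in the table for an interrupt number."""
    base = ivt_entry_address(int_number)
    ip = int.from_bytes(memory[base:base + 2], "little")      # offset first
    cs = int.from_bytes(memory[base + 2:base + 4], "little")  # then segment
    return cs, ip


# Stand-in for the first 1 KiB of RAM (addresses 0x00000 to 0x003FF).
ram = bytearray(0x400)
# Hypothetical vector F000:1234 for INT 04H, stored as IP then CS.
ram[0x10:0x14] = bytes([0x34, 0x12, 0x00, 0xF0])

print(f"{ivt_entry_address(0x04):05X}")  # prints 00010
cs, ip = read_vector(ram, 0x04)
print(f"{cs:04X}:{ip:04X}")              # prints F000:1234
```

The last entry, for INT FFH, starts at 0x003FC and ends at 0x003FF, which matches the end of the table given above.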

In practice

As usual, the best way to make it clear is to practice with an example. For this purpose, I created a service procedure called creg_str (convert register in string) and I performed so-called static linking [1] by hand. Writing code in DEBUG.EXE is an iterative approach: I had to fix the addresses at each iteration until everything fitted. I will show you what I mean using this very example. Here is the source code at the first iteration step:

creg_str.npp at the first iteration
    f 7c00 7fff 3f
    a 7c00
    ;234567890123456789012345678901234567890123456789012345678901234567890123456789
    ;-------10--------20--------30--------40--------50--------60--------70-------79
    ;------------------------------------------------------------------------------
    ; CREG_STR: Convert REGister in STRing
    ;
    ; Copyright (C) 2020 - Michele Musci
    ; Distributed under the GNU Affero General Public License version 3.
    ; See https://www.gnu.org/licenses/agpl-3.0.txt
    ;
    ; This procedure converts the content of BX into the corresponding ASCII code
    ; and store it in the memory location pointed by ES:[DI].
    ;
    ; INPUT:      BX      -> WORD to be converted in ASCII.
    ;             CX      -> number of nibbles to be converted.
    ;             ES:[DI] -> memory address to save the ASCII equivalent
    ;                        of the content of BX.
    ; OUTPUT:     ---
    ; REG. USAGE: AX, BX, CX, DI.
    ; THIS CALLS: OUT_NBL.BIN, NBL_ASC.BIN
    ;
    ; Build it with command:
    ; debug < out_nbl.npp > out_nbl_dbg.npp
    ;------------------------------------------------------------------------------
    ;------------------------------------------------------------------------------
    ; Load service functions into memory
    ;------------------------------------------------------------------------------


    n OUT_NBL.BIN
    l 7c10

    n NBL_ASC.BIN
    l 7c30

    n SHOW_STR.BIN
    l 7c40


    a 7c50
    ;------------------------------------------------------------------------------
    ; CODE
    ;------------------------------------------------------------------------------
    ;------------------------------------------------------------------------------
    ; START of procedure
    ;------------------------------------------------------------------------------
    ;
    ; loop_start: <---
    ;
    ;---------------------------------------
    call 7c10       ; call OUT_NBL.BIN:
                    ; take a nibble
                    ; bx contains already the source word to select
                    ; the nibble from.
                    ; cx is the nibble index changing from 4 to 1
                    ; in the for loop.
                    ; the function OUT_NBL.BIN can be called
                    ; immediatelly
                    ;
                    ;-------------------------------------
    call 7c30       ; call NBL_ASC.BIN
                    ; convert the nibble to ascii
                    ; al contains already the nibble to be converted
                    ; the function NBL_ASC.BIN can be called
                    ; immediatelly
                    ;
                    ;-------------------------------------
    stosb           ; save the ascii in the string
                    ; al contains already the ascii code to be stored
                    ; es:[di] = al; di = di + 1
                    ;
                    ;-------------------------------
                    ; NEXT cx = ?, to 0, dec cx
                    ;-------------------------------
    loop ???        ; loop loop_start: --->
                    ; dec cx
                    ; jnz
                    ;
    ;------------------------------------------------------------------------------
    ; END of procedure
    ;------------------------------------------------------------------------------
    ret
    ;
    ;


    d 7c00 7c5f

    u 7c50 7c5a


    rbx
    0
    rcx
    a
    r

    n CREG_STR.BIN
    w 7c50
    q

At the very start, I began with the command f 7C00 7FFF 3F, which tells DEBUG.EXE to fill the RAM from SEG:7C00 to SEG:7FFF with the byte 3F. This byte has become my favourite choice because it is shown as the symbol "?" in the ASCII view on the right of the memory dump and, at the same time, it is interpreted as the single-byte instruction AAS when I use the u command in DEBUG.EXE (I will say a few more words about this later on). This helps me to see at a glance where the code is in the ASCII view (that is, wherever no "?" appears).

Soon after that, I used the assemble command at location 0x7C00 (a 7C00). This switched DEBUG.EXE into assembly mode. I used it only to write comments, because the semicolon is recognised only in assembly mode. This is what I use as my standardized header. You may also have noticed that I didn't write any instruction here this time. I did so because creg_str is a service procedure and not the main program.

I chose to write code in DEBUG.EXE starting at address 0x7C00 because this is where the code sits in memory after being loaded by the space shuttle. I hope that this will simplify my job in the future, compared with manually adjusting the addresses in DEBUG.EXE before creating the binary output as I did until now (see the previous chapter II).

With a couple of returns, I left assembly mode and went back to the command mode of DEBUG.EXE (you can tell by the "-" prompt in the final protocol a few lines further down in this post). Then I used the commands n and l to create by hand something very close to what the "symbol table" of a linker is. I loaded all the necessary service procedures, each aligned to the beginning of a new paragraph (a 16-byte boundary). This wastes a few bytes of memory, but it made it a little easier for me to keep track of where the procedures were in the memory dump. Moreover, I had the space shuttle that could load SOFTWARE.BIG, so I was no longer limited to 512 bytes.

Finally, I started writing code from address 0x7C50, which was the position of this new procedure. At this point I knew the entry points of the service procedures I needed and I could call them immediately. It is important to observe that all these service procedures had to stay at the specific address where I put them with the command l. They had to stay there because DEBUG.EXE requires absolute addresses in the MNEMONIC syntax of calls (and loops and jumps), even though the OPCODE uses an address relative to the Instruction Pointer (IP). The consequence is that the code is NOT relocatable during development, in order to be relocatable once development is finished (just keep this in mind for now; I will clarify it in the rest of this post).
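The translation from the absolute address in the MNEMONIC to the relative address in the OPCODE can be sketched as follows. This is my own illustration in Python of the arithmetic, not the way DEBUG.EXE is implemented; the function names encode_call_near and encode_loop are made up. The only assumptions are that a near CALL is 3 bytes long (0xE8 plus a 16-bit displacement), that LOOP is 2 bytes long (0xE2 plus an 8-bit displacement), and that the displacement is counted from the start of the next instruction.

```python
def encode_call_near(instr_addr: int, target: int) -> bytes:
    """Encode a near CALL located at instr_addr that must reach target."""
    next_ip = instr_addr + 3                  # a near CALL is 3 bytes long
    rel = (target - next_ip) & 0xFFFF         # 16-bit two's complement
    return bytes([0xE8]) + rel.to_bytes(2, "little")


def encode_loop(instr_addr: int, target: int) -> bytes:
    """Encode a LOOP located at instr_addr that must jump back to target."""
    next_ip = instr_addr + 2                  # LOOP is 2 bytes long
    rel = target - next_ip
    if not -128 <= rel <= 127:
        raise ValueError("LOOP target is out of range for an 8-bit displacement")
    return bytes([0xE2, rel & 0xFF])          # 8-bit two's complement


print(encode_call_near(0x7C50, 0x7C10).hex(" ").upper())  # prints E8 BD FF
print(encode_call_near(0x7C53, 0x7C30).hex(" ").upper())  # prints E8 DA FF
print(encode_loop(0x7C57, 0x7C50).hex(" ").upper())       # prints E2 F7
```

These are the addresses of the final version of creg_str, with the LOOP target already fixed to 7C50.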

At the beginning of the code, I put the comment:

; loop_start: <---

This was a help for me. It worked almost as a symbolic label for the address (which was unknown to me while I wrote the source code) and it helped my eyes to see re-entry points in code. This was the re-entry point for the loop at the end of the code. The problem was that I didn't know the addresses while I was coding so I had to perform a blind jump that I corrected later on. I marked this blind jump with three question marks after the LOOP instruction (Fig. A).

Fig. A - three question marks after the LOOP instruction

As I said already, writing code with DEBUG.EXE is an iterative process. First, I wrote the file "creg_str.npp" and then I created the file "creg_str_dbg.npp" with the following command at the console:

DEBUG < CREG_STR.NPP > CREG_STR_DBG.NPP

After that, I opened the "creg_str_dbg.npp" file and looked for the error, then I fixed it in the file "creg_str.npp" and re-assembled it with the same command as before. The protocol of the first iteration looked like the one in Fig. B.

Fig. B - Correction of the address

As you can see in Fig. B, at line SEG:7C57 [2] there was an error (as expected). The good part of it was that I now knew the address of the label loop_start:. I admit that in this case it was trivial, since I started to code at SEG:7C50, but generally speaking this is not the case, and I wanted to show you the kind of iterative process you have to go through: you keep assembling and fixing the addresses until everything fits.
Finally, here is the last protocol "creg_str_dbg.npp" in full length, with the last two sections that follow the end of the code: the dump and the disassembly [3].

the last protocol "creg_str_dbg.npp"
    -f 7c00 7fff 3f
    -a 7c00
    137B:7C00 ;234567890123456789012345678901234567890123456789012345678901234567890123456789
    137B:7C00 ;-------10--------20--------30--------40--------50--------60--------70-------79
    137B:7C00 ;------------------------------------------------------------------------------
    137B:7C00 ; CREG_STR: Convert REGister in STRing
    137B:7C00 ;
    137B:7C00 ; Copyright (C) 2020 - Michele Musci
    137B:7C00 ; Distributed under the GNU Affero General Public License version 3.
    137B:7C00 ; See https://www.gnu.org/licenses/agpl-3.0.txt
    137B:7C00 ;
    137B:7C00 ; This procedure converts the content of BX into the corresponding ASCII code
    137B:7C00 ; and store it in the memory location pointed by ES:[DI].
    137B:7C00 ;
    137B:7C00 ; INPUT:      BX      -> WORD to be converted in ASCII.
    137B:7C00 ;             CX      -> number of nibbles to be converted.
    137B:7C00 ;             ES:[DI] -> memory address to save the ASCII equivalent
    137B:7C00 ;                        of the content of BX.
    137B:7C00 ; OUTPUT:     ---
    137B:7C00 ; REG. USAGE: AX, BX, CX, DI.
    137B:7C00 ; THIS CALLS: OUT_NBL.BIN, NBL_ASC.BIN
    137B:7C00 ;
    137B:7C00 ; Build it with command:
    137B:7C00 ; debug < out_nbl.npp > out_nbl_dbg.npp
    137B:7C00 ;------------------------------------------------------------------------------
    137B:7C00 ;------------------------------------------------------------------------------
    137B:7C00 ; Load service functions into memory
    137B:7C00 ;------------------------------------------------------------------------------
    137B:7C00
    -
    -n OUT_NBL.BIN
    -l 7c10
    -
    -n NBL_ASC.BIN
    -l 7c30
    -
    -n SHOW_STR.BIN
    -l 7c40
    -
    -
    -a 7c50
    137B:7C50 ;------------------------------------------------------------------------------
    137B:7C50 ; CODE
    137B:7C50 ;------------------------------------------------------------------------------
    137B:7C50 ;------------------------------------------------------------------------------
    137B:7C50 ; START of procedure
    137B:7C50 ;------------------------------------------------------------------------------
    137B:7C50 ;
    137B:7C50 ; loop_start: <---
    137B:7C50 ;
    137B:7C50 ;---------------------------------------
    137B:7C50 call 7c10       ; call OUT_NBL.BIN:
    137B:7C53                 ; take a nibble
    137B:7C53                 ; bx contains already the source word to select
    137B:7C53                 ; the nibble from.
    137B:7C53                 ; cx is the nibble index changing from 4 to 1
    137B:7C53                 ; in the for loop.
    137B:7C53                 ; the function OUT_NBL.BIN can be called
    137B:7C53                 ; immediatelly
    137B:7C53                 ;
    137B:7C53                 ;-------------------------------------
    137B:7C53 call 7c30       ; call NBL_ASC.BIN
    137B:7C56                 ; convert the nibble to ascii
    137B:7C56                 ; al contains already the nibble to be converted
    137B:7C56                 ; the function NBL_ASC.BIN can be called
    137B:7C56                 ; immediatelly
    137B:7C56                 ;
    137B:7C56                 ;-------------------------------------
    137B:7C56 stosb           ; save the ascii in the string
    137B:7C57                 ; al contains already the ascii code to be stored
    137B:7C57                 ; es:[di] = al; di = di + 1
    137B:7C57                 ;
    137B:7C57                 ;-------------------------------
    137B:7C57                 ; NEXT cx = ?, to 0, dec cx
    137B:7C57                 ;-------------------------------
    137B:7C57 loop 7c50       ; loop loop_start: --->
    137B:7C59                 ; dec cx
    137B:7C59                 ; jnz
    137B:7C59                 ;
    137B:7C59 ;------------------------------------------------------------------------------
    137B:7C59 ; END of procedure
    137B:7C59 ;------------------------------------------------------------------------------
    137B:7C59 ret
    137B:7C5A ;
    137B:7C5A ;
    137B:7C5A
    -
    -d 7c00 7c5f
    137B:7C00  3F 3F 3F 3F 3F 3F 3F 3F-3F 3F 3F 3F 3F 3F 3F 3F   ????????????????
    137B:7C10  89 D8 89 CA 49 D1 E1 D1-E1 D3 E8 25 0F 00 89 D1   ....I......%....
    137B:7C20  C3 3F 3F 3F 3F 3F 3F 3F-3F 3F 3F 3F 3F 3F 3F 3F   .???????????????
    137B:7C30  24 0F 04 30 3C 39 76 02-04 07 C3 3F 3F 3F 3F 3F   $..0<9v....?????
    137B:7C40  B4 0E BB 07 00 AC 3C 00-74 04 CD 10 EB F7 C3 3F   ......<.t......?
    137B:7C50  E8 BD FF E8 DA FF AA E2-F7 C3 3F 3F 3F 3F 3F 3F   ..........??????
    -
    -u 7c50 7c5a
    137B:7C50 E8BDFF        CALL    7C10
    137B:7C53 E8DAFF        CALL    7C30
    137B:7C56 AA            STOSB
    137B:7C57 E2F7          LOOP    7C50
    137B:7C59 C3            RET
    137B:7C5A 3F            AAS
    -
    -
    -rbx
    BX 0000
    :0
    -rcx
    CX 000F
    :a
    -r
    AX=0000  BX=0000  CX=000A  DX=0000  SP=FFEE  BP=0000  SI=0000  DI=0000
    DS=137B  ES=137B  SS=137B  CS=137B  IP=0100   NV UP EI PL NZ NA PO NC
    137B:0100 0000          ADD     [BX+SI],AL                         DS:0000=CD
    -
    -n CREG_STR.BIN
    -w 7c50
    Writing 0000A bytes
    -q

In the disassembly section (created with the command u 7c50 7c5a) I want you to look at all the calls and jumps (in this case we jump with the LOOP 7C50 instruction). Let us take the loop. The MNEMONIC is LOOP 7C50, the OPCODE is 0x E2 F7. We "talk" to DEBUG.EXE with a MNEMONIC that describes the target as an absolute address to go to (LOOP 7C50), whereas the OPCODE for the CPU describes the target as an address relative to the IP after the instruction has been fetched, that is, relative to the start of the next instruction. In fact, the next instruction starts at SEG:7C59, and 0xF7 is a two's complement signed integer which, converted to decimal, is -9. If we count 9 bytes backwards from SEG:7C59 we end up at SEG:7C50.

I briefly announced before that the code is NOT relocatable during development, in order to be relocatable once development is finished. I can explain part of it now; the conclusion will come a few lines further on in this chapter. As you can see, DEBUG.EXE expects absolute addresses (for instance LOOP 7C50), but the OPCODE it produces uses addressing relative to the instruction pointer IP. So while writing the code I was forced to use absolute addresses for jumps and calls to the service procedures, but the OPCODES created behind the MNEMONICS contain relative addresses, which allowed me to take the full binary package "as is", relocate it anywhere else in memory, and have it work the same way. There are still a few details that I have to explain before this works exactly as I said, but I hope that you have got the core of it for now. If you want, you can repeat the same exercise with the code at SEG:7C53. It has the MNEMONIC CALL 7C30, corresponding to the OPCODE 0x E8 DA FF. Just remember that 0xE8 is the OPCODE for CALL and 0x FF DA is the delta address for IP (remember that it is stored in little-endian order).
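To check this reasoning on the whole procedure, here is a small Python sketch that walks through the ten bytes of CREG_STR.BIN exactly as they appear in the dump and resolves the target of every CALL and LOOP for a given load address. It is a toy decoder that knows only the four OPCODES used here (0xE8, 0xE2, 0xAA, 0xC3); the name branch_targets and the second load address 0x0150 are assumptions made for this illustration.

```python
# The ten bytes of CREG_STR.BIN, copied from the dump at SEG:7C50.
CODE = bytes.fromhex("E8 BD FF E8 DA FF AA E2 F7 C3")


def branch_targets(code: bytes, load_addr: int):
    """Return (mnemonic, absolute target) for each CALL and LOOP in code."""
    targets = []
    i = 0
    while i < len(code):
        opcode = code[i]
        if opcode == 0xE8:        # near CALL with 16-bit signed displacement
            rel = int.from_bytes(code[i + 1:i + 3], "little", signed=True)
            i += 3                # i now points at the next instruction
            targets.append(("CALL", (load_addr + i + rel) & 0xFFFF))
        elif opcode == 0xE2:      # LOOP with 8-bit signed displacement
            rel = int.from_bytes(code[i + 1:i + 2], "little", signed=True)
            i += 2
            targets.append(("LOOP", (load_addr + i + rel) & 0xFFFF))
        elif opcode in (0xAA, 0xC3):  # STOSB and RET are single bytes
            i += 1
        else:
            raise ValueError(f"opcode {opcode:#04x} is not handled by this sketch")
    return targets


for base in (0x7C50, 0x0150):
    print(f"{base:04X}:", [(name, f"{addr:04X}") for name, addr in branch_targets(CODE, base)])
```

With the load address 0x7C50 the targets are 7C10, 7C30 and 7C50, as in the disassembly. With the load address 0x0150 the very same bytes point to 0110, 0130 and 0150: every target has moved by the same amount as the code, which is why the package keeps working only as long as the service procedures move together with it.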

In the very last section, I produced the binary file. I set in BX::CX [4] a 32-bit number telling DEBUG.EXE how many bytes to write to the file, starting from the offset address given with the command w. Here the byte 0x3F comes in useful again. I said that it is good in the ASCII view because it is shown as the char "?", but here I used it because in the disassembly view it produces a single-byte instruction (it doesn't matter that this is the instruction AAS; what really matters is that it is a single-byte instruction, just like the 0x90 I used before, but, unlike 0x90, it produces a nicely recognizable char in the ASCII view). You must remember that x86 CPUs decode instructions of variable length, depending on the instruction itself. When I wrote the binary file I took care to get the exact number of bytes of code (sometimes I made mistakes and cut the binary right in the middle of the last instruction). To stay on the safe side, I always displayed one or two instructions more after the very last one when I used the command u. In this way, I could immediately read (instead of calculating) the start address of the very next instruction (SEG:7C5A in this case). Then I subtracted the start address of the assembly from this value (0x7C5A - 0x7C50 = 0x0A) and put the result in BX::CX for the write. After that, I issued the write command with the initial address of the code: w 7C50.
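The subtraction and the split of the result into BX and CX can be written down as a tiny Python sketch. The function name write_length is made up; the only assumption is the convention described above, where BX holds the upper 16 bits and CX the lower 16 bits of the byte count.

```python
def write_length(start_addr: int, next_instr_addr: int):
    """Return (BX, CX) for the number of bytes from start_addr up to,
    but not including, next_instr_addr."""
    count = next_instr_addr - start_addr
    if count < 0:
        raise ValueError("the end address lies before the start address")
    bx = (count >> 16) & 0xFFFF   # upper 16 bits of the 32-bit count
    cx = count & 0xFFFF           # lower 16 bits of the 32-bit count
    return bx, cx


bx, cx = write_length(0x7C50, 0x7C5A)
print(f"rbx {bx:X}")  # prints rbx 0
print(f"rcx {cx:X}")  # prints rcx A
```

These are the two values typed after rbx and rcx in the protocol, and they agree with the message "Writing 0000A bytes".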

Here I can give you another small piece of clarification for the statement "the code is NOT relocatable during development, in order to be relocatable once development is finished". As long as I develop, I keep the service procedures as separate modules. For example, I did so when I produced the binary file just for the portion of code that goes from SEG:7C50 to SEG:7C59 (remember to count the byte at 0x7C50 too, so there are 0x0A bytes in total). This gives me the flexibility I need during development to change and re-package things if needed. Every time I re-package, I have to fix the linking table again. This is of course annoying and a potential source of errors (linking errors). This is the job of a linker, but I didn't have any linker and I was learning linking by hand. At the very end of the development, I will create a binary that includes everything (I will describe it in the next post): a binary that goes from address SEG:7C00 to SEG:7C59 and includes all the service procedures OUT_NBL, NBL_ASC and SHOW_STR. This final product is fully relocatable (well, almost: one more small detail is yet to come) and fully statically linked.

I hope that you, together with me, are learning a lot about the core of the job of a linker and having fun at the same time. I put all the necessary source files "out_nbl.npp", "nbl_asc.npp" and "show_str.npp" in the DOWNLOAD AREA, so that you can build the binaries by redirecting input and output (for example debug < show_str.npp > show_str_dbg.npp) and finally build CREG_STR.BIN.



  1. Here follows my definition, based on my understanding of what static and dynamic linking are, but I also want to remind you that I am a hobbyist and not a professional in this field. Linking is STATIC when all pieces of software are available and known at linking time, so that all addresses are resolved during the creation of the program file. Later on, the loader loads this statically linked program file all at once. Linking is DYNAMIC when at least one piece of software is unknown at linking time, so that all addresses are known only after all pieces of software are loaded into RAM. The loader loads all pieces of software into RAM from different files and then performs a linking stage before running the main program.
  2. All "*_dbg.npp" files created with DEBUG.EXE ("creg_str_dbg.npp" in this case) have the memory address on the left of each new line in assembly mode. This is why I use the addresses (such as SEG:7C57) as if they were line numbers.
  3. As mentioned before, you can see in the last protocol "creg_str_dbg.npp" the transition between assembly mode and command mode in DEBUG.EXE by looking at the prompt "-". You should also compare it with the "creg_str.npp" source file and notice how I used an empty line to exit assembly mode and return to command mode in DEBUG.EXE.
  4. The more I look into this topic on the internet the more I find and I see small differences. Sometimes I find just one colon (for instance SS:SP) and sometimes I find double colon (for instance BX::AX). My interpretation of this difference in the notation is the following: every time a single colon is used the meaning is a segmented addition to produce a 20bit address, whereas every time a double colon is used the meaning is a concatenation to produce a 32bit value. Example:
    0x1234:ABCD means 0x01 2340 + 0xABCD equal to 0x01 CF0D,
    meanwhile
    0x1234::ABCD means 0x1234 0000 + 0xABCD equal to 0x1234 ABCD.
    I am not 100% sure that my interpretation is the right one, but I will use this as a convention from now on.
