High Level Languages:

Subroutines and stack relative addressing mode

1 Review of JSR and RTS

You'll remember that the first problem to solve when implementing procedures and functions in assembly language is finding a way to transfer control to the first instruction in the subroutine and then returning control to the instruction following the call when the subroutine is complete.

There need to be special instructions for these steps, since they involve changing the value of the program counter. On the Pep/6, subroutine calls use JSR to push the return address onto the runtime stack and set the PC to the address of the first instruction in the subroutine. The return from the subroutine uses RTS to pop the saved return address off of the run-time stack and into the PC, so that the next instruction to be fetched and executed will be the instruction after the call to the subroutine.

Here's an example of a program which includes nested calls (demonstrating why we need a stack for the return addresses) (example_15_java)

class Program
{
// Example of nested simple procedures in Java.
  
// The program simulates most users's impression
// of almost all software:  it inputs a number--
// which it ignores--and prints two error messages.
 
static final byte newLine = 10;
static void punctuate ()
{
    Io. charOutput ((byte) '!');
}
 
static void printError ()
{
    Io. charOutput ((byte) 'E');
    Io. charOutput ((byte) 'r');
    Io. charOutput ((byte) 'r');
    Io. charOutput ((byte) 'o');
    Io. charOutput ((byte) 'r');
    Io. charOutput (newLine);
    punctuate ();
}
 
byte inputToIgnore;
 
public static void main (String [] args)
{
    inputToIgnore = Io. decimalInput ();
    printError ();
    printError ();
}
}
and the Pep/6 assembly language that implements it (example_15_assembl)
-------------------------------------------------------------------------------
      Object
Addr  code   Symbol   Mnemon  Operand    Comment
-------------------------------------------------------------------------------
             ;
             ; Assembly language version of program containing
             ; nested simple procedures.
             ;
0000  70001F          BR      main
             ;
             newline: .EQUATE d#10
             ;
             ; ---- procedure punctuate ----
             ;
0003  E00021 punctuat:CHARO   c#/!/,i
0006  E0000A          CHARO   newline,i
0009  C8              RTS   
             ;
             ; ---- procedure printError ----
             ;
000A  E00045 printErr:CHARO   c#/E/,i    ; put "Error" ..
000D  E00072          CHARO   c#/r/,i
0010  E00072          CHARO   c#/r/,i
0013  E0006F          CHARO   c#/o/,i
0016  E00072          CHARO   c#/r/,i
0019  C00003          JSR     punctuat   ; Call punctuate
001C  C8              RTS   
             ;
             ; ---- main program ----
             ;
             ; Data for main program:
001D  0000   inputTI: .WORD   d#0
001F  E9001D main:    DECI    inputTI,d
0022  C0000A          JSR     printErr
0025  C0000A          JSR     printErr
0028  00              STOP  
             ;
0029                  .END   
-------------------------------------------------------------------------------
 
 
Symbol table
--------------------------------------
Symbol    Value        Symbol    Value
--------------------------------------
inputTI   001D         main      001F
newline   000A         printErr  000A
punctuat  0003         
--------------------------------------
No errors.  Successful assembly.
Let's follow the execution of the first call to printError, seeing what happens to the stack. The instruction corresponding to this call is the JSR instruction at address 0022. So the value of the PC is h#0022 when we start tracing the call. The user stack will still be empty, so the stack pointer, SP, will still contain its initial setting at the bottom of the user stack 0BE6 (although we don't really need to know its exact starting value to trace the subroutine calls, since manipulations of the stack pointer are always relative to its old value).
  1. JSR printErr

  2.  

     

    The CPU fetches the instruction at address 0022 into the instruction register, increments to PC to 0025, and then executes the instruction it has just fetched.

    To execute the JSR, the CPU pushes the value in the just-incremented PC onto the stack, as the address to return to when the procedure is finished, and then puts the operand of the JSR instruction, which is the address of the first instruction implementing the printError procedure, into the PC.

    The stack now looks like this (the locations above the current top-of-stack are labeled "junk" to indicate that they contain some value left over from previous computations):
    Address Contents
    0BE0 ? (junk) 
    0BE2 ? (junk) 
    top -> 
    0BE4 0025 (first return address) 

    PC: 000A, SP: 0BE4
     
     

  3. Execution proceeds through the code implementing the PrintError procedure, printing "Error". Then we reach the instruction that calls the procedure punctuate.

  4.  
  5. JSR punctuat

  6.  

     

    The CPU fetches this JSR instruction from the address 0019 and increments the PC to 001C. Then it pushes that new value of the PC onto the stack as the return address for this function call, and sets the PC to the operand of the JSR, which is the address of the first instruction of the punctuate procedure.

    The stack now looks like:
    Address Contents
    0BE0 ? (junk) 
    top-> 
    0BE2 001C (second return address) 
    0BE4 0025 (first return address) 

    PC: 0003, SP: 0BE2
     

  7. The punctuate procedure prints the exclamation mark and the new line character. Then we reach the instruction to return from the procedure.

  8.  
  9. RTS

  10.  

     

    The CPU fetches the RTS instruction from 0009, and increments the PC.

    Then, to execute the RTS instruction, it pops the return address off the top of the stack and puts that value into the program counter. As a result, the next instruction to be fetched and executed will be the instruction after the JSR instruction that called this subroutine.

    The stack is back the way it was before the call to punctuate:
    Address Contents
    0BE0 ? (junk) 
    0BE2 001C (junk) 
    top-> 
    0BE4 0025 (first return address) 

    PC: 001C, SP: 0BE4

    Notice that the memory location above the new top-of-stack continues to contain what used to be the return address from the nested procedure call. Popping a value off the stack just involves moving the stack pointer so that the new top of stack is below the discarded value; the contents of the popped memory locations remain unchanged until they are rewritten by other instructions (probably as a result of the stack growing again later). This can sometimes be confusing when you are debugging a program.
     

  11. The CPU fetches the instruction at the address it has just returned to. In this case, the fetched instruction is another RTS, the one returning control to the main program from the printError procedure.

  12.  

     

    So after incrementing the PC, the CPU executes the RTS by immediately replacing the value in the PC with the return address it pops off the stack. That leaves the stack empty, and returns control to the instruction in the main program that follows the JSR that we started with.
    Address Contents
    0BE0 ? (junk) 
    0BE2 001C (junk) 
    0BE4 0025 (junk) 
    top-> 
    0BE6

    PC: 0025, SP: 0BE6

So the last-in-first-out nature of the stack serves to implement nested procedure calls. Each RTS instruction returns control to the instruction after the most recently executed JSR.

2 Problems with static allocation for locals, parameters, and return values

One way (not the best) to implement a function's local variables, parameters, and return values is through static allocation. Static allocation involves setting aside memory at a fixed address for each value at assembly time, in the same way as we have allocated memory for other variables.

Static allocation works as long as there can be only one invocation of each subroutine active at a time. The implementation of recursion, though, requires the ability to have more than one active invocation of the same subroutine. If you tried to use static allocation to implement a recursive subroutine, then the nested calls to the subroutine would overwrite the parameter and local values of earlier calls.

Futhermore, static allocation of storage is rather wasteful, even in these days of cheap memory. Suppose there are 16 utility subroutines each of which needs a local array of 250Kb. If none of them is executing, then there could be 4Mb of storage completely unused. Whereas, if the local variables of a subroutine are only allocated when the subroutine is actually executing, then when none is in use, no memory is being used for their locals; and if (at most) one is in use at a time, then at most 1/4 Mb is needed.

Lastly, in an object-oriented environment, data are associated with classes and objects, not with a fixed 'program' with main memory. There is literally no place to allocate global static storage. In Java, static (class) variables are allocated from a system 'pool' when a class is loaded by the Java runtime environment. These can be referred to in methods of that class, but cannot be preallocated at the assembler level.

So to implement recursion, efficient use of memory, and dynamic object languages, we need some way to associate with each invocation of a subroutine a separate region of memory for its parameters and local variables. To do that, we need a new addressing mode - the stack relative addressing mode introduced during the last class.

3 Local variables and parameters

The stack-relative addressing mode allows us to use different memory locations for the variables of different invocations of a subroutine, even though the operand specifiers within the machine instructions implementing the subroutine remain the same. The operand specifiers are relative to the stack top, which changes as subroutines are called and return. It also allows memory for the different invocations to be allocated (on the stack) only when needed &endash; so that subroutines do not need to have memory statically allocated for them when they are not in use.

Before a procedure or function is called, the caller pushes the values of the parameters it is passing onto the stack. Then the first thing that the code implementing the subroutine does is to push onto the stack any space it requires for local variables. The beauty of the stack implementation of this is that there is essentially no overhead in doing so: the cost of allocating uninitialized data on the stack is zero. This is because it only requires adjusting the stack pointer, which is a single instruction. So, during the execution of the subroutine, the stack looks like this (we no longer show the addresses of the stack, since everything is relative to the top of the stack, i.e., whatever value is stored in the SP register):

4 Procedure call conventions and stack frames

The part of a program that calls a procedure or function (the caller) and the called subroutine (the callee) need to agree on how the stack is being used. The rules for using the stack are known as procedure call conventions.

They are conventions because they are not rules enforced by the hardware, but a protocol that different parts of the program use to communicate and cooperate. In a production environment, a program will frequently call procedures in program libraries written at other times by other programmers, so it is important that everybody follow the same convention. When you're developing in a high-level language, the compiler looks after the call conventions when it generates code implementing subroutines and subroutine calls.

Here is the call convention on the Pep/6. Remember that it is a matter of convention: some of the decisions are arbitrary (e.g., which order you put parameters on the stack) but it is important that everybody agree on the same arbitrary convention:
Caller
Callee
     
  • If the subroutine being called is a function, pushes storage for the returned value onto the stack. 
  • Pushes the values of parameters onto the stack (using stack-relative addressing mode), pushing the leftmost parameter first. 
  • JSR subroutine (pushes the return address onto the stack) 
     
  • Pushes onto the stack the space required for its local variables. 
  • Does its work, accessing parameters using stack-relative addressing mode. 
  • If the callee is a function, it stores the return value into the correct point on the stack. 
  • Pops the local variables off of the stack. 
  • RTS (pops return address off the stack). 
     
  • Pops the space allocated for the parameters and return value (if the callee was a function). 
  • If the called subroutine was a function, accesses the return value just above the current top of stack. 

The set of memory locations on the stack corresponding to a particular subroutine invocation is called a stack frame (the the term activation record is also used).

When a function is executing (i.e., from the perspective of the callee), the stack frame looks like this:
SP -> 
localn
local2
local1
return address
paramm
param2
param1
function's return value

 When a procedure is executing (i.e., from the perspective of the callee), the stack frame looks like this:
Contents
SP -> 
localn
local2
local1
return address
paramm
param2
param1

To see this in action we will soon see  an example of a simple procedure that takes two parameters, and has a single local variable (the index variable in the for loop).