There need to be special instructions for these steps, since they involve changing the value of the program counter. On the Pep/6, subroutine calls use JSR to push the return address onto the runtime stack and set the PC to the address of the first instruction in the subroutine. The return from the subroutine uses RTS to pop the saved return address off of the run-time stack and into the PC, so that the next instruction to be fetched and executed will be the instruction after the call to the subroutine.
Here's an example of a program which includes nested calls (demonstrating why we need a stack for the return addresses) (example_15_java)
and the Pep/6 assembly language that implements it (example_15_assembl)class Program {// Example of nested simple procedures in Java. // The program simulates most users's impression // of almost all software: it inputs a number-- // which it ignores--and prints two error messages. static final byte newLine = 10; static void punctuate () { Io. charOutput ((byte) '!'); } static void printError () { Io. charOutput ((byte) 'E'); Io. charOutput ((byte) 'r'); Io. charOutput ((byte) 'r'); Io. charOutput ((byte) 'o'); Io. charOutput ((byte) 'r'); Io. charOutput (newLine); punctuate (); } byte inputToIgnore; public static void main (String [] args) { inputToIgnore = Io. decimalInput (); printError (); printError (); }}
Let's follow the execution of the first call to printError, seeing what happens to the stack. The instruction corresponding to this call is the JSR instruction at address 0022. So the value of the PC is h#0022 when we start tracing the call. The user stack will still be empty, so the stack pointer, SP, will still contain its initial setting at the bottom of the user stack 0BE6 (although we don't really need to know its exact starting value to trace the subroutine calls, since manipulations of the stack pointer are always relative to its old value).------------------------------------------------------------------------------- Object Addr code Symbol Mnemon Operand Comment ------------------------------------------------------------------------------- ; ; Assembly language version of program containing ; nested simple procedures. ; 0000 70001F BR main ; newline: .EQUATE d#10 ; ; ---- procedure punctuate ---- ; 0003 E00021 punctuat:CHARO c#/!/,i 0006 E0000A CHARO newline,i 0009 C8 RTS ; ; ---- procedure printError ---- ; 000A E00045 printErr:CHARO c#/E/,i ; put "Error" .. 000D E00072 CHARO c#/r/,i 0010 E00072 CHARO c#/r/,i 0013 E0006F CHARO c#/o/,i 0016 E00072 CHARO c#/r/,i 0019 C00003 JSR punctuat ; Call punctuate 001C C8 RTS ; ; ---- main program ---- ; ; Data for main program: 001D 0000 inputTI: .WORD d#0 001F E9001D main: DECI inputTI,d 0022 C0000A JSR printErr 0025 C0000A JSR printErr 0028 00 STOP ; 0029 .END ------------------------------------------------------------------------------- Symbol table -------------------------------------- Symbol Value Symbol Value -------------------------------------- inputTI 001D main 001F newline 000A printErr 000A punctuat 0003 -------------------------------------- No errors. Successful assembly.
The CPU fetches the instruction at address 0022 into the instruction register, increments to PC to 0025, and then executes the instruction it has just fetched.
To execute the JSR, the CPU pushes the value in the just-incremented PC onto the stack, as the address to return to when the procedure is finished, and then puts the operand of the JSR instruction, which is the address of the first instruction implementing the printError procedure, into the PC.
The stack now looks like this (the locations above the current top-of-stack are labeled "junk" to indicate that they contain some value left over from previous computations):
Address | Contents | |
. | . | |
. | . | |
. | . | |
0BE0 | ? (junk) | |
0BE2 | ? (junk) | |
top ->
|
0BE4 | 0025 (first return address) |
PC: 000A, SP: 0BE4
The CPU fetches this JSR instruction from the address 0019 and increments the PC to 001C. Then it pushes that new value of the PC onto the stack as the return address for this function call, and sets the PC to the operand of the JSR, which is the address of the first instruction of the punctuate procedure.
The stack now looks like:
Address | Contents | |
. | . | |
. | . | |
0BE0 | ? (junk) | |
top->
|
0BE2 | 001C (second return address) |
0BE4 | 0025 (first return address) | |
PC: 0003, SP: 0BE2
The CPU fetches the RTS instruction from 0009, and increments the PC.
Then, to execute the RTS instruction, it pops the return address off the top of the stack and puts that value into the program counter. As a result, the next instruction to be fetched and executed will be the instruction after the JSR instruction that called this subroutine.
The stack is back the way it was before the call to punctuate:
Address | Contents | |
. | . | |
. | . | |
0BE0 | ? (junk) | |
0BE2 | 001C (junk) | |
top->
|
0BE4 | 0025 (first return address) |
PC: 001C, SP: 0BE4
Notice that the memory location above the new top-of-stack continues
to contain what used to be the return address from the nested procedure
call. Popping a value off the stack just involves moving the stack pointer
so that the new top of stack is below the discarded value; the contents
of the popped memory locations remain unchanged until they are rewritten
by other instructions (probably as a result of the stack growing again
later). This can sometimes be confusing when you are debugging a program.
So after incrementing the PC, the CPU executes the RTS by immediately replacing the value in the PC with the return address it pops off the stack. That leaves the stack empty, and returns control to the instruction in the main program that follows the JSR that we started with.
Address | Contents | |
. | . | |
. | . | |
0BE0 | ? (junk) | |
0BE2 | 001C (junk) | |
0BE4 | 0025 (junk) | |
top->
|
0BE6 | ? |
PC: 0025, SP: 0BE6
Static allocation works as long as there can be only one invocation of each subroutine active at a time. The implementation of recursion, though, requires the ability to have more than one active invocation of the same subroutine. If you tried to use static allocation to implement a recursive subroutine, then the nested calls to the subroutine would overwrite the parameter and local values of earlier calls.
Futhermore, static allocation of storage is rather wasteful, even in these days of cheap memory. Suppose there are 16 utility subroutines each of which needs a local array of 250Kb. If none of them is executing, then there could be 4Mb of storage completely unused. Whereas, if the local variables of a subroutine are only allocated when the subroutine is actually executing, then when none is in use, no memory is being used for their locals; and if (at most) one is in use at a time, then at most 1/4 Mb is needed.
Lastly, in an object-oriented environment, data are associated with classes and objects, not with a fixed 'program' with main memory. There is literally no place to allocate global static storage. In Java, static (class) variables are allocated from a system 'pool' when a class is loaded by the Java runtime environment. These can be referred to in methods of that class, but cannot be preallocated at the assembler level.
So to implement recursion, efficient use of memory, and dynamic object languages, we need some way to associate with each invocation of a subroutine a separate region of memory for its parameters and local variables. To do that, we need a new addressing mode - the stack relative addressing mode introduced during the last class.
Before a procedure or function is called, the caller pushes the values of the parameters it is passing onto the stack. Then the first thing that the code implementing the subroutine does is to push onto the stack any space it requires for local variables. The beauty of the stack implementation of this is that there is essentially no overhead in doing so: the cost of allocating uninitialized data on the stack is zero. This is because it only requires adjusting the stack pointer, which is a single instruction. So, during the execution of the subroutine, the stack looks like this (we no longer show the addresses of the stack, since everything is relative to the top of the stack, i.e., whatever value is stored in the SP register):
They are conventions because they are not rules enforced by the hardware, but a protocol that different parts of the program use to communicate and cooperate. In a production environment, a program will frequently call procedures in program libraries written at other times by other programmers, so it is important that everybody follow the same convention. When you're developing in a high-level language, the compiler looks after the call conventions when it generates code implementing subroutines and subroutine calls.
Here is the call convention on the Pep/6. Remember that it is a matter of convention: some of the decisions are arbitrary (e.g., which order you put parameters on the stack) but it is important that everybody agree on the same arbitrary convention:
|
|
|
|
|
|
|
The set of memory locations on the stack corresponding to a particular subroutine invocation is called a stack frame (the the term activation record is also used).
When a function is executing (i.e., from the perspective of the callee), the stack frame looks like this:
. | |
SP ->
|
localn |
. | |
. | |
. | |
local2 | |
local1 | |
return address | |
paramm | |
. | |
. | |
. | |
param2 | |
param1 | |
function's return value |
When a procedure is executing (i.e., from the perspective of the callee), the stack frame looks like this:
Contents | |
. | |
SP ->
|
localn |
. | |
. | |
. | |
local2 | |
local1 | |
return address | |
paramm | |
. | |
. | |
. | |
param2 | |
param1 |
To see this in action we will soon see an example of a simple procedure that takes two parameters, and has a single local variable (the index variable in the for loop).