A jump table has two principal advantages over a sequence of conditional branch instructions.
Warford shows how the indexed addressing mode can be used to implement a case statement with a jump table.
Here's a Java program illustrating a switch statement. Recall that in Java (as in most languages with this feature) (not all, though) the value governing the choice (indicated as the argument to switch) must be of a numeric type (including character). Even though the usual explanation of the statement is as a sequence of ifs, the implementation is not, and makes essential use of the fact that the arugument is numeric. Also recall that in Java, execution continues from the selected case label to the end of the switch block and not just to the next case label. (That's true in C and C++ too, but not in Turing or Ada or COBOL.) (example_21_java)
Here follows the PEP/6 implementation of the above. It uses a jump table containing, for each case, the address of its first instruction. Indexed addressing mode allows the jump table to be anywhere in memory: its location is loaded into the base register. (Just remember to jump around it if necessary.) The trick is then to compute an appropriate value into the X register to indicate the particular choice. The example program also includes the error checking of the input - only letters within the range 'A' to 'F' are accepted and processed by the case statement. Notice what is done for case 'E'. The beginning address of that table is placed into the base register.class CaseExample {// Illustrate case statement. // This program converts a letter grade into a number grade. public static void main (String []) { short marknum; byte marklet; marklet = Io. charInput (); switch (marklet) { case 'A': marknum = 95; break; case 'B': marknum = 75; break; case 'C': marknum = 65; break; case 'D': marknum = 55; break; case 'F': marknum = 0; break; default: marknum = -1; break; } Io. decimalOutput (marknum); }}
The ".ADDRSS symbol" pseudo-operation allocates a word in memory and initializes it with the value of symbol. So the effect of the six .ADDRSS pseudo-operations in the example are to create an array, called swtab0 (for "switch table") whose six elements are the addresses of the first instruction to be executed in each of the six cases. This was also used to initialize reference variables in the lecture on arrays.
For this program, The value in the X register needs to be one of the following : 0, 2, 4, 6, 8 or 10 ( these are number of bytes counted from the beginning of the jump table). What we have to start with to differentiate the cases is letters in upper case, which really are ASCII codes for those letters. Letter 'A' is 65 (dec), letter 'B' is 66 (dec), letter 'C' is 67 (dec) and so on. Initially we place the letter code that was input into register X. If we subtract from X the offset of the first case (= code for 'A' = 65 (dec)), then we are going to have the following possible values in our register X: 0, 1, 2, 3, 4, 5. This is getting us closer to the desired set of values for register X. All we need to do is to multiply value of X by two (using ASLX instruction). Why by two? Because that is the length in bytes of each element of the jump table (and this element is an address of each individual case. Memory addresses on PEP/6 take 2 bytes).
Essentially, the switch argument is an index into an array of cases, and the above algorithm is computing the offset from the array index in the usual way. But instead of loading indirectly, as for a data array, we jump indirectly: the BR ,x instruction sets the PC value to be that of memory [ base register + index register]. (example_21_assembl)
------------------------------------------------------------------------------- Object Addr code Symbol Mnemon Operand Comment ------------------------------------------------------------------------------- ; Illustrate case statement translation ; 0000 700006 BR main ; 0003 0000 marknum: .WORD d#0 ; int marknum; 0005 00 marklet: .BYTE d#0 ; char marklet; ; 0006 D90005 main: CHARI marklet,d ; input marklet 0009 600020 LOADB swtab0,i 000C 0C0000 LOADX d#0,i 000F 550005 LDBYTX marklet,d ; load marklet char into X reg 0012 BC0046 COMPX c#/F/,i ; check marklet is not > 'F' 0015 A00059 BRGT default ; if so, default case 0018 240041 SUBX c#/A/,i ; subtract offset of first case 001B 800059 BRLT default ; if negative, char was < 'A' 001E 44 ASLX ; multiply X by size of table element 001F 73 BR ,x ; take case 0020 002C swtab0: .ADDRSS caseA ; case 'A' 0022 0035 .ADDRSS caseB ; case 'B' 0024 003E .ADDRSS caseC ; case 'C' 0026 0047 .ADDRSS caseD ; case 'D' 0028 0059 .ADDRSS default ; there is no case 'E' 002A 0050 .ADDRSS caseF ; case 'F' 002C 08005F caseA: LOADA d#95,i 002F 110003 STOREA marknum,d ; marknum = 95; 0032 70005F BR endcase ; break; 0035 08004B caseB: LOADA d#75,i 0038 110003 STOREA marknum,d ; marknum = 75; 003B 70005F BR endcase ; break; 003E 080041 caseC: LOADA d#65,i 0041 110003 STOREA marknum,d ; marknum = 65; 0044 70005F BR endcase ; break; 0047 080037 caseD: LOADA d#55,i 004A 110003 STOREA marknum,d ; marknum = 55; 004D 70005F BR endcase ; break; 0050 080000 caseF: LOADA d#0,i 0053 110003 STOREA marknum,d ; marknum = 0; 0056 70005F BR endcase ; break; 0059 08FFFF default: LOADA d#-1,i 005C 110003 STOREA marknum,d ; marknum = -1; 005F F10003 endcase: DECO marknum,d ; output marknum 0062 00 STOP 0063 .END ------------------------------------------------------------------------------- Symbol table -------------------------------------- Symbol Value Symbol Value -------------------------------------- caseA 002C caseB 0035 caseC 003E caseD 0047 caseF 0050 default 0059 endcase 005F main 0006 marklet 0005 marknum 0003 swtab0 0020 -------------------------------------- No errors. Successful assembly.
You can study program 6.15 in your text to find out a more efficient way of writting and reading strings. The print subroutine in that program uses the indexed addressing to print out in a loop all the characters included in a long string of characters specified using the .ASCII pseudo op (instead of a sequence of inline CHARO statements).
For reading or modifying string data in C and C++, the programmer allocates a buffer, which is just another name for a byte array. To initialize or place values in the buffer, one specifies the usual array operations. For example (this is in C):
(In the foregoing program, malloc(n) means allocate a new string buffer of 15 bytes' length, and return a pointer (reference) to it; and strlen(s) means the length in characters of a string: that is, the number of characters up to (not including) the terminating null-character.)buffer = malloc (15); mystring = "example"; ... for (i = 0; i < strlen (mystring); ++i) buffer [i] = mystring [i]; mystring [i] = 0; // mustn't forget the terminating null-character
Normally, C programmers don't work at this low level, but use utility routines (e.g. in the strings.h package) to perform typical operations. The above code implements "string copying" and would normally be just coded as
In Java, however, String is a builtin class type, and quoted strings are of class type String. One of the key facts about the type String is that it's 'immutable', that is, there are no methods to change the characters in a string object. For example, when you programbuffer = strcpy (malloc (15), "example");
what happens isString x = "this is"; x = x + " a test";
This is of course why in Java one must be careful to remember to code .equals rather than == to compare two Strings. Recall that == means "is equal" only for value types; for class types == means "refers to the same obejct as". And there is no guarantee that, for example, "this is" + " a test" should be the same obejct as "this is a test" -- only that the two objects have the same contents (and length).
In this course we wish to avoid issues relating to object-orientation as much as possible. In our example programs we will therefore use only character buffers, represented as byte arrays (byte []). When we use a quoted string (which in Java is defined to be of type String) we will always write .getBytes() after it, which is a standard String method to return an equivalent byte array. The resulting code isn't very realistic, but it does help to keep the concepts closer to the machine level.
Recall also that Java uses 16-bit (Unicode) characters; the method getBytes converts them to ASCII in a suitable way.
Here is a translation into Pep/6 assembler. (example_22_assembl)class StringExample {static void readStr (byte [] inStr) { short c; byte inch; for (c = 0; c < inStr. length; ++c) { inch = Io. charInput (); if (inch == '\n') { inStr [c] = 0; // mark end of the string return; } else inStr [c] = inch; } } static void writeStr (byte [] outStr) { short c; for (c = 0; c < inStr. length && inStr [c] != 0; ++c) Io. charOutput (outStr [c]); } public static void main (String []) { byte [] buffer = new byte [100]; writeStr ("Please enter something:". getBytes ()); readStr (buffer); writeStr ("You typed:". getBytes ()); writeStr (buffer); }}
; StringExample ; ; This program contains the code of two useful routines ; for reading and writing byte arrays. ; ; Byte arrays are represented as a data structure consisting ; of the number of elements (positive binary integer) and then ; the actual storage of the array. ; A pointer to the array conventionally points to the actual storage. ; BR main ; branch to main routine ; ; This equate is the indirect offset of the length of an array. length: .EQUATE d#-2 ; ; Static allocations for main method. ; ; 1: new byte [100] .WORD d#100 ; new byte [100] byte100: .BLOCK d#100 ; ; 2: "Please enter something:". getBytes () .WORD d#23 PleaseEn: .ASCII /Please enter something:/ ; ; 3: "You typed:". getBytes () .WORD d#10 YouTyped: .ASCII /You typed:/ ; ; 4; byte [] buffer buffer: .WORD h#0000 ; readStr: ADDSP d#-3,i ; static void readStr ( inStr: .EQUATE d#5 ; byte [] inStr) c1: .EQUATE d#0 ; { short c; inch: .EQUATE d#2 ; byte inch; LOADA d#0,i ; for (c = 0; STOREA c1,s LOADB inStr,s ;start accessing array inStr loop1: LOADX length,i COMPA ,x ; c < inStr. length; BRGE end1 ; { CHARI inch,s ; inch = Io. charInput (); LOADX c1,s ;for indexing array LOADA h#0000,i ;clear A for byte compare LDBYTA inch,s COMPA h#000A,i ; if (inch == '\n') BRNE else1 ; { LDBYTA d#0,i STBYTA ,x ; inStr [c] = 0; ADDSP d#3,i RTS ; return; else1: STBYTA ,x ; } else inStr [c] = inch; LOADA c1,s ADDA d#1,i STOREA c1,s ; ++c) BR loop1 ; } end1: ADDSP d#3,i RTS ; } ; writeStr: ADDSP d#-2,i ; static void writeStr ( outStr: .EQUATE d#4 ; byte [] outStr) c2: .EQUATE d#0 ; { short c; LOADA d#0,i ; for (c = 0; STOREA c2,s LOADB outStr,s ;start accessing array outStr loop2: LOADX length,i COMPA ,x ; c < inStr. length BRGE end2 LOADX c2,s ; && inStr [c] != 0; LOADA h#0000,i ;clear A for byte compare LDBYTA ,x COMPA d#0,i BREQ end2 CHARO ,x ; Io. charOutput (outStr [c]); LOADA c2,s ADDA d#1,i STOREA c2,s ; ++c) BR loop2 end2: ADDSP d#2,i RTS ; } ; ; ; public static void main (String []) ; ; { main: LOADA byte100,i; byte [] buffer = new byte [100]; STOREA buffer,d ADDSP d#-2,i ; writeStr ( LOADA PleaseEn,i; "Please en...".getBytes () STOREA d#0,s JSR writeStr ; ); LOADA buffer,d ; readStr ( STOREA d#0,s ; buffer JSR readStr ; ); LOADA YouTyped,i; writeStr ("You typed:".getBytes () STOREA d#0,s JSR writeStr ; ); LOADA buffer,d ; writeStr ( STOREA d#0,s ; buffer JSR writeStr ; ); STOP ; } .END
Here is a Java method that uses boolean data. Note that the value of flag is not dependent on x throughout, just at the point of assignment.
We prefer to translate true as h#FFFF (i.e. d#-1) and false as h#0000 (i.e. d#0) because the logical operations of the Pep/6 machine (AND, OR, NOT) are rather directly translated for these values. Warford uses d#1 for true instead, but this fails to have simple logical properties like NOT(true) = false. Of course, a boolean only really requires one bit, but it is more convenient to operate on 16-bit values on the Pep/6.void flagger () { short val; boolean flag; val = Io. decimalInput (); // read a value for val flag = (val > 7); // test it by assigning flag val = Io. decimalInput (); // read another value for val Io. decimalOutput (val); // display that value if (flag) // try the test on the old value Io. charOutput ('>'); }
flagger: ADDSP d#-4,i ; void flagger () { val: .EQUATE ; short val; boolean flag; x = Io. decimalInput (); // read a value for x flag = (x > 7); // test it by assigning flag x = Io. decimalInput (); // read another value for x Io. decimalOutput (x); // display that value if (flag) // try the test on the old value Io. charOutput ('>'); }
; Illustrate bool type translation ; ; input x ; flag = (x > 7) ; input x again ; output x ; if flag output > ; 0000 700007 BR main ; TRUE: .EQUATE d#-1 FALSE: .EQUATE d#0 ; 0003 0000 flag: .WORD d#0 ; int flag; 0005 0000 x: .WORD d#0 ; int x = 7; ; 0007 E90005 main: DECI x,d ; read x 000A 0CFFFF LOADX TRUE,i ; flag = (x > 7); assume true 000D 090005 LOADA x,d 0010 B80007 COMPA d#7,i ; compare to 7 0013 A00019 BRGT setflag ; gt so true was right 0016 0C0000 LOADX FALSE,i ; not gt so flag should be false 0019 150003 setflag: STOREX flag,d 001C E90005 DECI x,d ; read x again 001F F10005 DECO x,d ; output it 0022 090003 LOADA flag,d ; if (flag) 0025 88002B BREQ notflag ; eq 0, flag is false 0028 E0003E CHARO c#/>/,i ; ne 0, flag is true, output > 002B 00 notflag: STOP 002C .END ------------------------------------------------------------------------------- Symbol table -------------------------------------- Symbol Value Symbol Value -------------------------------------- FALSE 0000 TRUE FFFF flag 0003 main 0007 notflag 002B setflag 0019 x 0005 -------------------------------------- No errors. Successful assembly.
Here is a Java program which includes a (nested) class playing the role of an aggregate type: (example_23_java)
Here is an assembler version of the same program. Note that the two Student objects have to be constructed by initializing the members (number, name). This is done in Java when new instances of the class are allocated, and would probably done by a subroutine call rather than inline as shown. (example_23_assembl)class Example {class Student { byte [] number = new byte [7]; byte [] name = new byte [30]; int final; } public static void main (String [] args) { Student studentA = new Student (); Student studentB = new Student (); studentA. final = Io. decimalInput (); studentB. final = Io. decimalInput (); if (studentA.final > studentB.final) Io. charOutput ('>'); }}
BR main ; ; class Student ; ; { ; Member variable offsets. number: .EQUATE d#0 ; byte [] number = new byte [7]; name: .EQUATE d#2 ; byte [] name = new byte [30]; final: .EQUATE d#4 ; int final; ; ; } ; ; Static allocation of main storage. ; Student0: .BLOCK d#6 ; new Student () Student1: .BLOCK d#6 ; new Student () .WORD d#7 ; new byte [7] byteArr0: .BLOCK d#7 .WORD d#30 ; new byte [30] byteArr1: .BLOCK d#30 .WORD d#7 ; new byte [7] byteArr2: .BLOCK d#7 .WORD d#30 ; new byte [30] byteArr3: .BLOCK d#30 studentA: .BLOCK d#2 studentB: .BLOCK d#2 ; ; public static void main (String []) { main: LOADB Student0,i ;prepare to construct new Student () LOADX number,i LOADA byteArr0,i STOREA ,x ;number=new byte [7]; LOADX name,i LOADA byteArr1,i STOREA ,x ;name=new byte [30]; LOADA Student0,i STOREA studentA,d ; studentA = new Student (); LOADB Student1,i ;prepare to construct new Student () LOADX number,i LOADA byteArr2,i STOREA ,x ;number=new byte [7]; LOADX name,i LOADA byteArr3,i STOREA ,x ;name=new byte [30]; LOADA Student1,i STOREA studentB,d ; studentB = new Student (); LOADB studentA,d LOADX final,i DECI ,x ; studentA.final = Io. decimalInput (); LOADB studentB,d DECI ,x ; studentB.final = Io. decimalInput (); LOADB studentA,d LOADA ,x LOADB studentB,d COMPA ,x ; if (studentA.final > studentB.final) BRLE end1 CHARO c#/>/,i ; Io. charOutput ('>'); end1: STOP ; } .ENDIn this example indexed addressing mode is used again. The base register contains the starting address of an object (here it is referred to through either studentA or studentA). The X register contains the offset of a given member of the class (field of the structure) from the beginning of the structure. As usual, when programming by hand in assmbler it is best to use symbolic names for the offsets.
The fuller complexity of Java classes requires considerable additional information. The actual class of a given object (and hence the actual layout in memory) cannot be determined at compile time (because of inheritance). So it must be deferred until run time. In addition, the actual code executed when a method is called depends on the actual type of the calling object, also not determined at compile time. Hence objects are laid out in storage with an additional pointer to a jump table which can be used to find the right method code. At this point we are really getting away from the course subject matter.
Pointer types are usually implemented by storing the address of the memory containing the data pointed to. This kind of implementation is really the same as our implementation of class types for Java. Some languages (Turing and PL/I) provide pointers as a kind of index into a storage area where the things pointed to are actually stored. Such an index-based implementation allows the system to verify that a pointer value is valid (points somewhere) when it's dereferenced.
Languages in the design tradition of C, including BCPL, B, C, and C++, allow pointers to be a kind of arithmetic type -- after all, the underlying machine uses addresses, which are just numbers. This allows the programmer to perform address arithmetic just as is done at the assembler level. For example, if A points to a string of characters in memory, then A+2 is also a pointer, and points to the suffix of that string starting after the second character. This sort of feature brings the level of programming much closer to the actual machine.
An important use of pointers is to handle dynamic allocation of memory. When the equivalent of a new operation is executed, the result is a pointer value which points to the newly allocated memory and newly constructed object. It's the same as what happens in Java, except that the pointer (reference) value is made more explicit. There is no practical way to illustrate dynamic allocation at the Pep/6 assembler level; this can be discussed in later courses. Furthermore, on a machine with a tiny memory (like the Pep/6) static allocation is really the only practice that makes sense. Even in a high-level language, the programmer must calculate carefully how much memory is used ast all points in his program.
In Java, a class with this property typically designates some static final int values, that is, some constant class-wide integers, each given a different value, and each given a different name, as in
Of course, these values are just integers, and nothing prevents the user from specifying the integers literally (color = 3): or even specifying values which aren't intended as proper colour names at all (color = -1).class ScreenColour { static final int RED = 0; static final int BLUE = 1; static final int GREEN = 2; static final int MAGENTA = 3; ... } ... int colour; colour = ScreenColour. BLUE;
In many other languages, including C++ and Pascal, this kind of situation is handled by a special kind of type declaration. The compiler can then check that the values used are always only the right ones, and are only referred to by name. Also, the programmer is relieved of specifying the generally necessary but particularly irrelevant individual values. In C++, for example, one writes:
Where the line beginning "typedef" declares a new type of the "enumerated" kind and declares its constants. The compiler decides on the values (normally just integers starting at 0). Literal or invalid assignments to an enumerated type variable are syntax errors.typedef enum {RED, BLUE, GREEN, MAGENTA} ScreenColour; ... ScreenColour colour; colour = BLUE;
At the assembler level, enumerated types are implemented just as successive
integers. In this respect, Java uncharacteristically is closer to the machine
than C++ is, because in Java the programmer has to assign the values and
types for "enumerated" constants.