I/O Devices, Software and Hardware Interrupts

The Operating System: Trap Services and I/O devices

References:

Traps (synchronous interrupts) -- how user programs access OS services (Warford 8.2, Tanenbaum p. 379).
I/O Devices (Tanenbaum Ch. 2, Hennessy & Patterson Ch. 8, Hamacher et al. 2.6, 4.1, 4.5.2).
Program-controlled I/O, or Polling (Hamacher et al. 2.6, Hennessy & Patterson 8.5, Tanenbaum 5.5.7).
Asynchronous interrupts (Warford pp. 332-348, Tanenbaum 5.6.5, Hennessy & Patterson 8.5, Hamacher et al. 4.2).

Roles of an Operating System

The key roles of the operating system are to:

provide services to application programs that create a more convenient programming interface than the bare machine's instruction set. For example, the Pep/6's operating system adds the ability to input and output decimal numbers (through the DECI and DECO facilities) to the character input and output instructions provided in the level 3 instruction set.
manage system resources, including:
- sharing CPU time among application processes (A process is a running instance of a program. Warford's definition is "a program during execution"; including the word "instance" emphasizes that multiple processes may be executing the same program.)
- allocating memory.
- managing input/output devices.

1. Traps (software or synchronous interrupts)

Traps provide the mechanism used by application programs to request the services of the operating system, as well as a mechanism for the operating system to regain control of the machine when an application program causes something to go wrong.

Whichever word you choose to use for it, a trap, exception, or synchronous interrupt occurs when some action taken by an application program causes the hardware to transfer control away from the user program, into a piece of code called a trap handler (or exception handler, or interrupt handler).

The sort of exceptional conditions which can cause a trap include:

an attempt to execute an unimplemented opcode;
an integer or floating point overflow or underflow (on some machines);
an attempt to divide by zero;
a misaligned memory reference;
memory protection violation;
system calls. (There are instructions that exist for the express purpose of requesting a service from the operating system.)

The transfer of control to the trap handler is like a call to a regular procedure, except that it results not from an explicit procedure call in the application program, but some exceptional condition caused by the program and detected by the hardware.

System calls are not really an exceptional condition, but they use the trap mechanism to transfer control from the application program running in user mode to the operating system kernel running in supervisor mode. The kernel then provides the service corresponding to the requested system call.

These are software interrupts because they are caused by actions of the software (hardware interrupts occur when an I/O device has to get the CPU's attention, or when the power goes out).

They are synchronous interrupts because if you run the same program with the same data, the same exceptions will occur at the same points every time: the exceptions are synchronous with the program.

Besides being the result of an exceptional condition rather than an explicit procedure call, invocations of a trap or interrupt handler also differ from an invocation of a regular procedure in that they need to save more context in order for the trap handler to return control back to the user program.

When an exception occurs on the Pep/6

On the Pep/6, the only form of trap or interrupt that exists is the attempt to execute an unimplemented opcode (one of the opcodes 11101, 11110, or 11111 which do not correspond to any instruction in the Pep/6 hardware's machine language).

Just to confuse matters, the Pep/6 uses attempts to execute unimplemented opcodes to implement system calls. After the unimplemented opcode traps into the operating system's interrupt service routine, the interrupt service routine uses the number of the unimplemented opcode to determine which OS service (DECI, DECO or HEXO) it should perform.
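The dispatch step can be sketched in Python. This is an illustrative model, not the Pep/6 operating system's actual code. The pairing of each opcode with a particular service, the 8-bit instruction specifier layout (opcode in the top 5 bits), and the name `dispatch_trap` are all assumptions made for the example.

```python
# Hypothetical model of a trap handler choosing a service from the
# unimplemented opcode. The opcode-to-service pairing below is assumed.
SERVICES = {
    0b11101: "DECI",
    0b11110: "DECO",
    0b11111: "HEXO",
}

def dispatch_trap(instruction_specifier: int) -> str:
    """Return the service name for an (assumed) 8-bit instruction specifier."""
    opcode = (instruction_specifier >> 3) & 0b11111  # keep the top 5 bits
    if opcode not in SERVICES:
        raise ValueError(f"opcode {opcode:05b} does not trap in this model")
    return SERVICES[opcode]

service = dispatch_trap(0b11110001)  # opcode bits 11110, remaining bits 001
```

The point of the sketch is only that one trap entry point can serve several system calls, because the saved instruction specifier tells the handler which one was requested.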

Here's what the Pep/6 hardware does when an application program executes an unimplemented instruction:

Pushes onto the system stack:
1. the instruction specifier from the IR
2. the contents of SP,
3. PC,
4. B,
5. X,
6. A,
7. and NZVC
Sets SP to the top of system stack.
Sets the PC to the address of the interrupt service routine.

The base of the system stack and the address of the interrupt service routine are both read from fixed points in the Pep/6 ROM, the machine vectors h#0FFA and h#0FFE. The locations of these vectors are wired into the Pep/6 hardware.

When all of these steps are completed, the effect is that the next instruction that the hardware executes will be the first instruction in the interrupt service routine, and that the Pep/6 will be using the system stack rather than the user stack for implementing procedure calls.

The contents of the machine's registers at the point that the application program caused the exception (the attempt to execute an unimplemented instruction) are stored on the system stack. These registers, together with the contents of the application's portion of main memory, encapsulate the state of the application program's computation at the moment of the exception, also known as the process's context. The block of memory where the register contents are saved is called a process control block or PCB.

When the interrupt service routine is finished doing whatever needs to be done to provide the requested service, it executes the Pep/6 instruction RTI to return from the interrupt.

To implement RTI, the Pep/6 CPU:

Pops the following values back off the system stack into the corresponding registers:
1. NZVC
2. A
3. X
4. B
5. PC
6. SP

If the interrupt service routine changed none of the saved values of the registers, the effect is to return the application program to the state it was in when the trap occurred. The PC will contain the address of the next instruction the application was to execute, and the application program will go on as if nothing had happened. This property of transparency is important when handling asynchronous, hardware interrupts, which require the CPU to divert its attention from the running program to take care of some other business, generally an I/O device which needs attention. In the case of a system call, though--which is what all traps/interrupts on the Pep/6 amount to--the application program expects that something will have changed as a result of the exception: the requested service should have been performed.
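The save and restore sequences can be modelled in a few lines of Python. This is a minimal sketch, assuming a dictionary for the register file and a Python list for the system stack; the addresses and register values are made up, and the real hardware works with memory and vectors rather than lists.

```python
# Minimal model of the Pep/6 trap entry and RTI sequences described above.
PUSH_ORDER = ["IR_SPEC", "SP", "PC", "B", "X", "A", "NZVC"]
POP_ORDER = ["NZVC", "A", "X", "B", "PC", "SP"]

def trap(regs, system_stack, isr_address):
    """Save the user context on the system stack and enter the handler."""
    for name in PUSH_ORDER:
        system_stack.append(regs[name])
    regs["SP"] = len(system_stack)  # stand-in for "top of system stack"
    regs["PC"] = isr_address

def rti(regs, system_stack):
    """Restore the saved context in the reverse of the order it was pushed."""
    for name in POP_ORDER:
        regs[name] = system_stack.pop()
    # The saved instruction specifier is not restored to any register;
    # this model simply drops it.
    system_stack.pop()

user = {"IR_SPEC": 0b11110001, "SP": 0x0F00, "PC": 0x0123,
        "B": 0, "X": 7, "A": 42, "NZVC": 0b0100}
regs = dict(user)
stack = []
trap(regs, stack, isr_address=0x0E00)  # 0x0E00 is a hypothetical ISR address
pc_in_handler = regs["PC"]
rti(regs, stack)
```

After `rti`, `regs` equals `user` again, which is the transparency property: a handler that leaves the saved values alone is invisible to the interrupted program.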

2. I/O Devices: Controllers, Polling and Asynchronous Interrupts

We just discussed traps, also known as synchronous interrupts or software interrupts. The next logical topic is that of asynchronous, hardware interrupts. Input and output devices provide the main source of hardware interrupts, though, so we first need to talk about devices and how they are organized in the computer system.

The diversity of input/output devices

The characteristics of I/O devices vary widely, as illustrated by this table.

Device             Behaviour     Partner  Data rate (KB/sec)
Keyboard           input         human    0.01
Mouse              input         human    0.02
Scanner            input         human    400.00
Line printer       output        human    1.00
Laser printer      output        human    200.00
Graphics display   output        human    60,000.00
Network interface  input/output  machine  500.00-6,000.00
Floppy disk        storage       machine  100.00
Hard disk          storage       machine  2,000.00-10,000.00

"Storage" devices are those where data, once read, can be reread and usually rewritten.

The "Partner" column indicates whether the entity supplying the data to the input device, or reading the product of an output device, is a person or another machine.

The data transfer rates range from a few bytes per second, for a keyboard, to megabytes per second for graphics displays, network interfaces, and disk drives.

Slower devices, like terminals and line printers, are generally connected to the computer via a serial port. In serial communications, data is transmitted between the computer and the device one bit at a time.

Faster devices, like scanners and laser printers, frequently use some kind of parallel connection, where data is transmitted 8 bits (or maybe even more) at a time. The cable connecting your PC to a printer is thicker than the cable connecting it to the keyboard because there are more wires in it, to transmit multiple bits at once.

The range of transfer rates in the table illustrates two difficulties that must be dealt with when implementing input and output:

How do you handle I/O from slow devices like keyboards and mice without forcing the CPU to wait around between bytes of input? Think about a mouse that, every 100 milliseconds, sends the CPU a few bytes describing its movement since the last report. If the CPU runs at a clock rate of 100 megahertz and takes, on average, 5 clock cycles to complete each instruction (a not particularly fast CISC-ish CPU), then it executes 20 million instructions per second, and so is capable of 2 million instructions in the interval between two reports from the mouse.
How do you prevent I/O from fast devices, like a hard disk drive, from swamping the CPU, so that it can do nothing but transfer data to and from the drive?
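The mouse arithmetic can be checked with a short calculation. The figures are the ones given in the text; the variable names are just for illustration.

```python
clock_hz = 100_000_000         # 100 MHz clock
cycles_per_instruction = 5     # average for the CPU in the example
report_interval_ms = 100       # one mouse report every 100 milliseconds

instructions_per_second = clock_hz // cycles_per_instruction
instructions_between_reports = instructions_per_second * report_interval_ms // 1000
```

Here `instructions_per_second` comes to 20 million and `instructions_between_reports` to 2 million, which is the amount of useful work lost per report if the CPU does nothing but wait for the mouse.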

Device organization

Most computer systems use a bus to connect the central processing unit to the memory and to the input/output devices. A bus is a set of parallel wires that the different devices use to communicate.

Recall that when we looked at the diagrams of machine architectures corresponding to instruction sets with different numbers of operands, I talked about the memory address and data registers (MAR and MDR) within the CPU. These registers form the interface between the CPU and the bus. To read the data at a particular location in memory:

the CPU puts the address into the MAR. Then the contents of the MAR are put onto the address lines of the bus, together with some combination of control lines to signal that the CPU wants to read from memory.
the memory responds by putting the data at the requested address onto the data lines of the bus.
the data on the data lines are transferred into the MDR of the CPU.
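The three steps of a memory read can be traced with a toy model. This is only a sketch: the bus is a plain dictionary, the class and field names are invented for the example, and all timing is ignored.

```python
class Memory:
    def __init__(self, contents):
        self.contents = contents          # address -> stored value

    def respond(self, bus):
        # Step 2: memory sees a READ request and drives the data lines.
        if bus["control"] == "READ":
            bus["data"] = self.contents[bus["address"]]

class CPU:
    def __init__(self):
        self.mar = 0                      # memory address register
        self.mdr = 0                      # memory data register

    def read(self, address, bus, memory):
        # Step 1: address goes into the MAR, then onto the address lines,
        # together with a control signal requesting a read.
        self.mar = address
        bus["address"] = self.mar
        bus["control"] = "READ"
        memory.respond(bus)
        # Step 3: the value on the data lines is latched into the MDR.
        self.mdr = bus["data"]
        return self.mdr

bus = {"address": None, "data": None, "control": None}
memory = Memory({0x0010: 0xAB})
cpu = CPU()
value = cpu.read(0x0010, bus, memory)
```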

The CPU is a bus "MASTER". It can initiate data transfers between itself and memory or I/O.
Memory and I/O devices are bus "SLAVES". They cannot initiate data transfer themselves but rather can only respond to data transfer requests initiated by the CPU.
Computer systems generally also include one or more Direct Memory Access (DMA) controllers. A DMA controller is an auxiliary device which the CPU can initialize to perform block (multiple byte) data transfers between I/O devices and memory directly, without the intervention of the CPU. A DMA controller is also a bus "MASTER", as it can initiate data transfers. Control of the system bus must therefore be arbitrated (by bus arbitration hardware) to decide whether the CPU or the DMA controller is allowed to drive the address, data and control signals of the system bus at a particular instant in time.

Buses

"A bus is a common electrical pathway between multiple devices" (Tanenbaum, p. 156).

Typically the signal lines making up the bus include:

a set of address lines
a set of data lines
a set of control lines

The control lines carry information specifying things like the size of the data transfer (8/16/32 bit), the direction of the transfer (READ or WRITE) in relation to the processor, the timing of the transfer, interrupt signals, and so on.

Each line is a wire which carries a binary signal corresponding to one bit of information. The signal uses two different voltages (or ranges of voltages) to represent each possible binary state (a binary 1 or a binary 0).

1. I/O controllers

The device controllers control the devices' mechanisms, and act as the intermediary between the devices and the bus. So they include the circuitry required to follow the bus protocol, as well as the circuitry required to direct the device to carry out the requests from the rest of the system.

For a clear example of the sort of things that I/O controllers need to do, think about the controller that manages a device attached to the computer with a serial cable (one where data is transmitted one bit at a time). If the device is an input device, one of the controller's jobs is to collect the individual bits received from the input device and group them into bytes or words for transmission on the system bus. Conversely, if the serial device is an output device, the controller has to split up data words received on the bus, transmitting the corresponding bits to the output device one at a time.
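Here is a small sketch of that bit-grouping job, written in Python. It assumes the least significant bit of each byte travels first and ignores start bits, stop bits and parity; the function names are invented for the example.

```python
def assemble_bytes(bits):
    """Input side: collect serial bits (LSB first, assumed) into bytes."""
    received = []
    shift_register = 0
    count = 0
    for bit in bits:
        shift_register |= (bit & 1) << count
        count += 1
        if count == 8:                    # a full byte is ready for the bus
            received.append(shift_register)
            shift_register = 0
            count = 0
    return received

def split_byte(value):
    """Output side: split one byte into eight bits, LSB first."""
    return [(value >> position) & 1 for position in range(8)]

line = split_byte(0x41) + split_byte(0x0A)   # bits on the serial cable
data = assemble_bytes(line)                  # bytes handed to the system bus
```

The two functions are inverses of each other, so `data` ends up as the two bytes that were sent.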

What's in the I/O controller?

An I/O controller contains:

an address decoder which is connected to the address lines on the bus. It recognizes whether the address is intended for this device.
control circuits which govern the timing of data transfers and route data to and from specific registers in the controller.
data, control, and status registers which store information temporarily within the controller.

The data and status registers provide the interface to the device seen by the CPU. So the next question is how the CPU addresses these registers.

Port-mapped versus memory-mapped I/O

There are two methods used to address the device registers. The method chosen by a computer's designers determines the form of instructions used to perform I/O on that system.

Port-mapped I/O: the I/O device registers are mapped into a separate address space, and given port numbers. The instruction sets of port-mapped machines include special input/output commands which take a port number as an operand. The Intel architecture, for example, is port-mapped and provides the instructions IN and OUT to perform I/O:
```
 IN register, port
 OUT port, register
 
```
The port numbers will be multiplexed onto the address lines of the bus, with a control line indicating whether a port number or a memory address is on the bus at a given moment.
Memory-mapped I/O: the I/O device registers are mapped into regular memory addresses. All the instructions that reference memory are available for dealing with the device registers.
When I/O device addresses appear on the bus, the memory ignores them, and the appropriate device recognizes that "this means me".
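The "this means me" decision in a memory-mapped system can be sketched as follows. The register addresses, class names and the 4096-byte memory size are assumptions chosen for the example, not a description of any particular machine.

```python
class KeyboardController:
    STATUS, DATA = 0x0A00, 0x0A01         # hypothetical register addresses

    def __init__(self):
        self.status = 0
        self.data = 0

    def decodes(self, address):
        """Address decoder: is this address one of my registers?"""
        return address in (self.STATUS, self.DATA)

    def read(self, address):
        return self.status if address == self.STATUS else self.data

class Bus:
    def __init__(self, ram, devices):
        self.ram = ram
        self.devices = devices

    def read(self, address):
        # A device that recognizes the address answers; otherwise memory does.
        for device in self.devices:
            if device.decodes(address):
                return device.read(address)
        return self.ram[address]

keyboard = KeyboardController()
keyboard.data = ord("A")
bus = Bus(bytearray(4096), [keyboard])
from_device = bus.read(0x0A01)            # answered by the controller
from_memory = bus.read(0x0000)            # answered by ordinary memory
```

An ordinary load instruction would issue either read; nothing in the instruction itself says whether memory or a device will answer.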

A system can mix the two models. For example, the Intel family of processors has port-mapped I/O, but the frame buffer in the graphics controller (which contains numbers encoding what colour should be shown at each pixel on the screen) is still mapped to particular memory locations. Computer graphics programmers like to be able to perform arbitrary operations on the pixel values in memory.

Is the Pep/6 port-mapped or memory-mapped?

The Pep/6 looks like a port-mapped machine, because it has special instructions for performing I/O--CHARI and CHARO--with an implicit port number (since there is only one input and one output device, it isn't necessary to have an explicit port number to distinguish between devices).

The I/O controller as seen by the CPU

Whether port-mapped or memory-mapped, the interface that the device controller presents to the CPU will consist of data registers, status and control registers.

Data registers are read or written to transfer data from or to the device.
Status registers are read to check whether the device is ready for another I/O operation.
Control registers are sometimes included to allow device and controller configuration and control.

A keyboard status register would have a bit indicating whether or not a character was available to read.

A status register for the serial port attached to a printer would have a bit indicating whether or not it was ready to receive another byte.

2. Program-controlled I/O, or polling

In program-controlled I/O, the CPU continually checks the device status register to see if the device is ready for more data. The version below includes the suggestion received in class.

;
; What program-controlled I/O might 
; look like on the Pep/6, if
; it were a memory-mapped machine.
;
kbStatus:.EQUATE h#0A00     ; Address of keyboard status register (one byte).
kbData:  .EQUATE h#0A01     ; Address of keyboard data register (one byte).
readyBit:.EQUATE h#0080     ; Leftmost bit of the status byte indicates
                            ;  if keyboard has a character. LDBYTA loads
                            ;  into the low-order byte of A, so the mask
                            ;  must be in the low-order byte as well.
buffer:  .BLOCK  d#11       ; Character array into which to read the input:
                            ;  bufLen characters plus a null terminator.
bufLen:  .EQUATE d#10       ; Number of characters to read.
;     .
;     .
;     .
         LOADB   buffer,i   ; base register = address of buffer
         LOADX   d#0,i      ; x = 0
         LOADA   d#0,i      ; acc = 0 (clears upper byte too)

poll:    LDBYTA  kbStatus,d ;  start of "polling" loop
         ANDA    readyBit,i ;  mask to test ONLY the ready bit
         BREQ    poll       ;  no data ready yet, go back and check again
         LDBYTA  kbData,d   ;   buffer[x] = input char
                            ;  The keyboard controller hardware detects the
                            ;  read of the keyboard data register and, in
                            ;  response, automatically resets the ready bit
                            ;  in the status register in preparation for
                            ;  the next key press.
         STBYTA  buffer,x
         ADDX    d#1,i      ;   x = x + 1
         COMPX   bufLen,i   ;   exit when x >= bufLen
         BRLT    poll       ; end loop
         LOADA   d#0,i      ; Null terminate
         STBYTA  ,x         ;  the string (one byte, at buffer[bufLen])

Program-controlled I/O provides an example of the first of the two difficulties we were hoping to avoid when handling I/O. The CPU is "busy waiting" for the device. If it is a slow device, the CPU will burn a lot of cycles in that tight loop, waiting for the status bit to be set by the device.
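To get a feel for how many checks are wasted, here is a small Python simulation of busy waiting. It is a sketch only: the `SlowKeyboard` class, the position of the ready bit, and the idea that a character becomes available after a fixed number of status checks are all assumptions made for the example.

```python
READY_BIT = 0x80   # assumed position of the ready flag in the status byte

class SlowKeyboard:
    """Toy device: each character becomes ready after a fixed number of polls."""
    def __init__(self, text, polls_per_char):
        self.pending = list(text)
        self.polls_per_char = polls_per_char
        self.countdown = polls_per_char

    def read_status(self):
        if not self.pending:
            return 0
        if self.countdown > 0:
            self.countdown -= 1
            return 0
        return READY_BIT

    def read_data(self):
        self.countdown = self.polls_per_char   # reading clears the ready flag
        return self.pending.pop(0)

def poll_read(keyboard, length):
    """Read `length` characters by polling; count the checks that found nothing.
    Loops forever if the device never supplies that many characters."""
    buffer = []
    wasted_polls = 0
    while len(buffer) < length:
        if (keyboard.read_status() & READY_BIT) == 0:
            wasted_polls += 1                  # busy waiting: no useful work done
            continue
        buffer.append(keyboard.read_data())
    return "".join(buffer), wasted_polls

text, wasted = poll_read(SlowKeyboard("hi", polls_per_char=3), 2)
```

With these numbers, reading two characters costs six fruitless status checks. A real keyboard is slower than the CPU by many orders of magnitude, so the ratio is far worse in practice.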

Think about a 33.3 kbps modem connection. It might present 4000 characters per second. A processor that executes 10 million instructions per second would be wasting roughly 2500 instructions checking the status register for every character it read.

What's the answer? Have the I/O device interrupt the CPU when it is ready, so the CPU can do useful work in the meantime.

3. Asynchronous, or Hardware Interrupts

(Section 8.3 in the Warford text describes the causes of asynchronous interrupts and the action taken in response to them. It then goes on to describe the situations in which an interrupt should not be allowed to happen, and a few techniques for inhibiting interrupts (critical sections). All this material will be covered in more detail in the operating system course, but is presented here to give you a sense of the possible issues that the OS needs to deal with.)

In program-controlled I/O, the interaction with the device is managed entirely by the program, without hardware support. There is a tight loop that keeps checking the status register to see if the output device is ready to output another byte, or if the input device is able to provide another byte.

The difficulty with program-controlled I/O is that the CPU is spending lots of cycles doing nothing but waiting for the device to be ready. This is called "busy waiting".

By using more hardware support for I/O, the CPU can tell the device what to do, then go on to do some other computing. When the device is finished and needs to receive or send more data, it interrupts the CPU to regain its attention.

These hardware interrupts use essentially the same mechanism as traps (software interrupts). The main difference is that since their source is external to the CPU executing programs, they are not synchronized with the programs. Hence the name asynchronous interrupts.

Time line of a keyboard interrupt

This diagram is an attempt to show the big ideas of interrupt handling without too many details.

The basic idea is the same across all computer systems. The I/O device signals to the CPU that it needs some attention. When the CPU is ready to handle the interrupt, it transfers control from whatever program is executing at the time the interrupt occurred to the interrupt service routine--also known as an interrupt handler--appropriate for managing an interrupt from this device. When the transfer occurs, information required to resume the interrupted program is saved someplace. When the interrupt handler has finished dealing with the interrupt, it restores that saved information to resume execution of the interrupted program.

Where systems differ is in the details used to manage the transition from the interrupted program to the interrupt handler, and back again.

Interrupt handling details: who interrupted?

Unless there is only one I/O device, the CPU needs a way to figure out which device raised the interrupt.

If there are multiple interrupt lines on the bus, and each device has a line to itself, the CPU can tell immediately which device is interrupting.
If all the devices share a single interrupt line on the bus, the CPU can poll the devices when an interrupt occurs, checking all their status registers, looking for the device that is ready for input or output.
The bus protocol might require the CPU to signal acknowledgement of the interrupt by asserting appropriate lines on the bus, and then require the device to reply by putting an identifying number onto the bus.
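The second option, polling the devices that share one interrupt line, might look like the sketch below. The list of status values, the ready bit position and the function name are hypothetical.

```python
def find_interrupting_device(status_registers, ready_bit=0x80):
    """Return the number of the first device whose ready bit is set, or None."""
    for device_number, status in enumerate(status_registers):
        if status & ready_bit:
            return device_number
    return None

# Devices 2 and 3 are both requesting service; device 2 is found first.
interrupter = find_interrupting_device([0x00, 0x00, 0x80, 0x80])
```

Note that the order in which the devices are checked acts as a crude priority scheme: a device early in the list is always found before one later in the list.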

On many systems, a separate control unit actually monitors the interrupt lines on the bus, determines which is the interrupting device, and then interacts with the CPU. It can also prioritize interrupts for cases when multiple devices require service at the same time. This is the case, for example, for PCs using Intel processors.

Interrupt handling details: choosing a handler

When the CPU has determined which device interrupted, the number identifying the interrupting device is used to index a table containing the addresses of interrupt handlers:

interrupt handler 0
interrupt handler 1
interrupt handler 2
...

The hardware needs to be hardwired to look in some particular place for code to execute when an interrupt occurs; having that place be a table of addresses preserves flexibility: you can easily change the handlers by changing where the vectors point.

Some machines index a table whose address is held in a register, rather than a table at a fixed point in memory. On these machines, you can even switch to a new table of vectors by changing the value of this vector table register.
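The flexibility of a vector table is easy to see in a small model. In this sketch Python functions stand in for handler addresses, and the device numbers and handler names are invented for the example.

```python
def keyboard_handler():
    return "keyboard serviced"

def disk_handler():
    return "disk serviced"

def new_disk_handler():
    return "disk serviced by replacement driver"

# Entry i holds the handler for device number i (numbering assumed).
vector_table = [keyboard_handler, disk_handler]

def dispatch(device_number, table):
    """Index the table with the device number and run that handler."""
    handler = table[device_number]
    return handler()

before = dispatch(1, vector_table)
vector_table[1] = new_disk_handler    # repoint the vector; dispatch() is unchanged
after = dispatch(1, vector_table)
```

Passing the table to `dispatch` as an argument plays the role of the vector table register: handing it a different list switches every handler at once.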

The details of how the device's number translates into a particular interrupt handler vary from machine to machine. The PowerPC, for example, actually has 40 bytes of code in each entry of the table indexed by the interrupt number.

Textbooks differ in what they refer to with the term "interrupt vector". At one point, Tanenbaum calls the number identifying the device controller the interrupt vector. Hamacher et al. say that the address of the interrupt handler is the interrupt vector. It should be clear from the context what different authors are talking about, but be aware that the terminology varies in different texts.

Interrupt handling details: what to save?

Before it starts handling the interrupt, the CPU needs to save the information needed to resume the interrupted program.

Machines differ in how much is saved automatically by the hardware, and how much must be saved in software by the interrupt handler.

hardware could save the complete set of registers before calling the interrupt handler, like the Pep/6 does when handling a trap.
more commonly, hardware saves the minimal set of registers which can't be managed in software:
- the PC
- the status register, before switching to supervisor mode and disabling interrupts

Minimizing the set of automatically saved registers is a way to avoid memory accesses when the interrupt handler does not need to use all of the machine registers.

The PC must be saved in hardware, in order to implement the transfer of control.

In a machine that has two privilege levels, user mode and supervisor mode, the machine needs to switch to supervisor mode when it handles the interrupt. There will be a flag in the status register which indicates which mode the CPU is in. Saving the status register on entry to the interrupt routine, and restoring it on return from interrupt, ensures that the CPU returns to the appropriate privilege level when it is finished handling the interrupt.

The processor hardware is likely to disable at least some interrupts when it is executing the interrupt handler (we're about to see some reasons why; the interrupt handler may reenable interrupts in software before returning). So there will also be bits in the status register which indicate which interrupts are being accepted, and which are being ignored. Saving and restoring the status register thus also restores the set of interrupts being accepted to what it was before the interrupt arrived.
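The role of the saved status register can be shown with a minimal model. The bit positions of the two flags, the addresses, and the class itself are assumptions made for illustration; real processors lay out their status registers differently.

```python
SUPERVISOR = 0x8000    # hypothetical "supervisor mode" flag
INT_ENABLE = 0x0001    # hypothetical "interrupts accepted" flag

class Cpu:
    def __init__(self, pc, status):
        self.pc = pc
        self.status = status
        self.saved = []                       # where hardware saves PC and status

    def enter_interrupt(self, handler_address):
        # Save the minimal context, then switch to supervisor mode
        # with interrupts disabled.
        self.saved.append((self.pc, self.status))
        self.status = (self.status | SUPERVISOR) & ~INT_ENABLE
        self.pc = handler_address

    def return_from_interrupt(self):
        # Restoring the status register restores both the privilege level
        # and the set of interrupts being accepted.
        self.pc, self.status = self.saved.pop()

cpu = Cpu(pc=0x0200, status=INT_ENABLE)       # user mode, interrupts enabled
cpu.enter_interrupt(0x0E00)
status_in_handler = cpu.status
cpu.return_from_interrupt()
```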

Interrupt handling details: whether to pay attention?

How soon after a device raises an interrupt does the CPU respond?

The processor must finish the currently executing instruction before entering the interrupt handler. This distinguishes an interrupt from a trap. If a trap is caused by a problem with the currently executing instruction--such as an unimplemented opcode or a memory protection violation--the trap handler needs to take control immediately since the currently executing instruction cannot be completed.

Sometimes interrupts are disabled:

to avoid having the interrupt handler itself interrupted.
to protect data structures shared by the interrupt handler and other parts of the operating system.
to ensure that some crucial task is not interrupted.

Many processors automatically disable interrupts before executing the interrupt handler. This is, first of all, a straightforward way to ensure that the bus signal from the device whose interrupt is being serviced does not immediately interrupt its own interrupt handler (the device will continue to assert the interrupt signal on the bus until the CPU acknowledges the interrupt; other methods of ensuring that an interrupt does not interrupt its own handler do exist).

When there are multiple devices, it is possible for another device to raise an interrupt while the CPU is still executing the interrupt handler for a previous interrupt. Disabling interrupts while the CPU is executing an interrupt handler allows the CPU to finish handling one interrupt before starting the next.

Some devices cannot afford to wait for others. For example, most computers have a real-time clock that produces interrupts at regular intervals (perhaps 100 times per second). When the clock interrupts occur, counters are updated to keep track of the time of day, to manage any timed alarms requested by programs, and--on a multiprogramming system--to decide when to switch the CPU to executing a different user process. You don't want to lose a "clock tick" because it happens to occur while the CPU is handling an interrupt from another device. It would be better if the clock could interrupt the interrupt handlers of other devices.

To deal with this problem, a system can give different priorities to interrupts from different devices. The CPU's status register will include a field for its current priority level, which is set equal to that of whatever interrupt it is handling. If a new interrupt has a higher priority than the one being handled, the interrupt handler is interrupted. In effect, interrupts are stacked.

There is no point in allowing requests from equal priority devices to interrupt each other: the result would be to finish handling those interrupts in reverse chronological order, which may increase the possibility of losing data from the devices whose interrupt handlers are delayed (as well as seeming unfair).
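The priority rule can be tried out with a small simulation. This is a sketch under simplifying assumptions: all the requests arrive, in the order listed, before any handler finishes, and the device names and priority numbers are made up.

```python
def completion_order(arrivals):
    """arrivals: (name, priority) pairs. Returns the order handlers finish in."""
    running = []   # stack of handlers; the one on top has the CPU
    pending = []   # requests held back because their priority was not higher
    for name, priority in arrivals:
        if not running or priority > running[-1][1]:
            running.append((name, priority))       # preempt the current handler
        else:
            pending.append((name, priority))       # wait for the priority to drop
    finished = []
    while running or pending:
        if pending:
            best = max(pending, key=lambda request: request[1])  # earliest wins ties
            if not running or best[1] > running[-1][1]:
                pending.remove(best)
                running.append(best)
                continue
        finished.append(running.pop()[0])
    return finished

order = completion_order([("disk", 2), ("keyboard", 2), ("clock", 5)])
```

The clock preempts the disk handler and finishes first. The keyboard, which has the same priority as the disk, waits its turn, so the two equal-priority requests finish in the order they arrived.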

Seeing interrupts happen

You can use one of the UNIX commands for monitoring system usage to see how interrupts are related to I/O:

Log into a machine in the lab and run the command "vmstat 5" (the 5 means "report every 5 seconds").
Watch the output for about half a minute while the machine is comparatively idle, so you have an idea what the machine's steady state is. You're interested in the numbers under the columns "in", "sy", and "cs", which show the number of interrupts, system calls, and context switches per second, respectively (a context switch occurs when the OS changes the process running on the CPU).
Twirl the mouse around on the mouse pad as quickly as you can, and watch the numbers climb.

Here's an example of the output:

quidnunc % vmstat 5
procs     memory            page            disk          faults      cpu
 r b w   swap  free  re  mf pi po fr de sr s3 -- -- --   in   sy   cs us sy id
 0 0 0  21932  1540   0   0  0  0  0  0  0  0  0  0  0   43  171  100  0  0 100
 0 0 0  21932  1512   0   1  6  1  1  0  0  2  0  0  0   50   77   59  1  1 99
 0 0 0  21932  1496   0   0  0  0  0  0  0  0  0  0  0   24   14   19  0  0 100
 0 0 0  21932  1512   0   0  0  3  3  0  0  0  0  0  0   22   11   16  0  0 100
 0 0 0  21932  1516   0   0  0  0  0  0  0  0  0  0  0   20   10   16  0  0 100
 0 0 0  21932  1516   0   0  0  0  0  0  0  3  0  0  0   33   11   20  0  0 100
 0 0 0  21932  1516   0   0  0  0  0  0  0  0  0  0  0   24   27   20  0  0 100
 0 0 0  21924  1508   0   0  0  0  0  0  0  0  0  0  0   23   36   20  0  0 100
 0 0 0  21924  1508   0   0  0  0  0  0  0  0  0  0  0  117  315   76  1  1 97
 0 0 0  21916  1500   0   0  0  0  0  0  0  0  0  0  0  271  970  181  5  4 91
 0 0 0  21916  1500   0   0  0  0  0  0  0  0  0  0  0  269  758  153  5  2 93
 0 0 0  21908  1492   0   0  0  0  0  0  0  0  0  0  0  270  804  160  5  4 91
 0 0 0  21908  1492   0   0  0  0  0  0  0  1  0  0  0  164  396   89  2  2 96
 0 0 0  21900  1484   0   0  0  0  0  0  0  0  0  0  0   23    8   16  0  0 100
 0 0 0  21900  1484   0   0  0  0  0  0  0  0  0  0  0   22    9   16  0  0 100
 0 0 0  21892  1476   0   0  0  0  0  0  0  0  0  0  0   22    6   16  0  0 100
 0 0 0  21892  1476   0   0  0  0  0  0  0  0  0  0  0   20    6   14  0  0 100
 0 0 0  21884  1468   0   0  0  0  0  0  0  0  0  0  0   93   98   39  0  1 99
 0 0 0  21884  1468   0   0  0  0  0  0  0  0  0  0  0   23   27   20  0  0 100
