Chapter 5

1.2 Instruction Cycle

Fetch execute cycle
1. Fetch an instruction from memory
2. Decode the instruction
3. Fetch operand (if required)
4. Execute the instruction
5. Write back the result generated by the operation (if applicable)
6. Repeat

The address of the instruction to be fetched in each cycle is given by the contents of the program counter (PC). Thus, after fetching an instruction, the contents of the PC must be updated to point to the next instruction.

More detailed fetch execute cycle
Fetch:
 - Copy instruction pointed to by PC from memory into instruction register
 - Increment PC to address of the next instruction
Execute:
 - Decode the op-code into the set of signals needed to control the ALU and other components
 - If instruction requires operands, get them (from memory, IO) and load into CPU's internal registers
 - Execute the instruction under controlling signals from the CU
 - If required, write result into appropriate register, memory address or IO
 - Recognise pending interrupts
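
The steps above can be sketched as a small simulator. This is only a minimal illustration under stated assumptions: the three-instruction set (LOAD, ADD, HALT), the single accumulator and the function name run are hypothetical and do not come from any real CPU. Interrupt recognition is left out to keep the loop short.

```python
# Toy accumulator machine with a hypothetical three-instruction set.
LOAD, ADD, HALT = "LOAD", "ADD", "HALT"


def run(program, data):
    """Run the fetch-execute loop until HALT and return the accumulator."""
    pc = 0   # program counter: index of the next instruction
    acc = 0  # accumulator: the only working register in this sketch
    while True:
        ir = program[pc]         # fetch: copy the instruction at PC into the IR
        pc += 1                  # increment PC to the next instruction
        opcode, address = ir     # decode: split into op-code and operand address
        if opcode == HALT:
            return acc
        if opcode not in (LOAD, ADD):
            raise ValueError(f"unknown opcode: {opcode}")
        operand = data[address]  # operand fetch from (data) memory
        if opcode == LOAD:
            acc = operand        # execute and write the result to the register
        else:
            acc = acc + operand  # ADD: result is written back to the accumulator


program = [(LOAD, 0), (ADD, 1), (HALT, None)]
result = run(program, data=[2, 3])
print(result)  # prints 5
```

Note that the PC is incremented immediately after the fetch, before the instruction is decoded, which matches the order given in the Fetch step.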

1.3 Control Unit (CU)

The control unit is the circuitry that decodes and implements each instruction held in the instruction register (IR) by controlling the functional elements that make up the CPU. Two different approaches have been used to implement the CU and its instructions:

1.3.1 Hardwiring

Each instruction is executed directly by a logic circuit, i.e. in hardware.

1.3.2 Microprogramming

Each assembly instruction is translated into a sequence of even more primitive instructions called microinstructions. These specify, step by step, the control signals the hardware circuitry needs in order to implement the instruction.

The sequence of microinstructions associated with an assembly instruction is called a microprogram.
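
As a rough illustration, a microprogrammed control unit can be thought of as a lookup table from op-code to microprogram. The sketch below is an assumption-laden toy: the register names (MAR, MDR, ACC), the micro-step wording and the idea that each microinstruction takes exactly one clock cycle are all hypothetical simplifications, not a description of any real control store.

```python
# Hypothetical control store: each op-code maps to its microprogram.
# The micro-steps are descriptive strings; a real control store would hold
# bit patterns that drive control lines.
CONTROL_STORE = {
    "LOAD": [
        "MAR <- operand address",
        "MDR <- memory[MAR]",
        "ACC <- MDR",
    ],
    "ADD": [
        "MAR <- operand address",
        "MDR <- memory[MAR]",
        "ALU <- ACC + MDR",
        "ACC <- ALU result",
    ],
}


def microprogram(opcode):
    """Return the sequence of microinstructions for one assembly instruction."""
    return CONTROL_STORE[opcode]


def cycles(opcode):
    """Clock cycles needed, assuming one cycle per microinstruction."""
    return len(microprogram(opcode))


for step in microprogram("ADD"):
    print(step)
```

Adding a new instruction in this model only means adding a table entry, which is the flexibility usually claimed for microprogramming; the cost is that each instruction takes several steps.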

1.3.3 Microprogramming vs hardwiring

Microprogramming:

  • Allows arbitrarily complex instructions to be built up
  • More flexible

Hardwiring:

Capable of decoding and executing an instruction in one clock period. This is a lot faster than the microprogrammed equivalent, which usually requires more than one cycle to complete an instruction.

2.1 Memory Hierarchy

Each lower level in the hierarchy consists of modules of larger capacity, slower access time, and lower cost per bit. The goal of the memory hierarchy is to match the processor speed with the rate of information transfer from the highest element in the hierarchy.

2.2 Cache Memory

Temporal locality: The word referenced now is likely to be referenced again soon. Hence it is wise to keep the currently accessed word handy for a while.
Spatial locality: Words near the currently referenced word are likely to be referenced soon. Hence it is wise to prefetch words near the currently referenced word and keep them handy for a while.

A cache is a small, fast memory between the processor and main memory. It contains a subset of the contents of the main memory.

Because of the high speeds involved with the cache, management of the data transfer and storage in the cache is done in hardware; the OS doesn't know about the cache.
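
The effect of locality on a cache can be illustrated with a small simulation. The sketch below assumes a direct-mapped cache, which is just one possible organisation and is not specified in these notes; the sizes (4 lines of 4 words) and the function name count_hits are likewise hypothetical.

```python
def count_hits(addresses, num_lines=4, block_size=4):
    """Count cache hits for a sequence of word addresses.

    Assumes a direct-mapped cache: each memory block can live in exactly
    one cache line, chosen by (block number mod number of lines).
    """
    tags = [None] * num_lines          # tag currently held by each line
    hits = 0
    for address in addresses:
        block = address // block_size  # memory block containing this word
        line = block % num_lines       # the only line this block may use
        tag = block // num_lines       # identifies which block is in the line
        if tags[line] == tag:
            hits += 1
        else:
            tags[line] = tag           # miss: fetch the whole block
    return hits


sequential = list(range(16))           # spatial locality: neighbouring words
repeated = [0, 1, 2, 3] * 4            # temporal locality: same words again
strided = list(range(0, 256, 16))      # poor locality: every block conflicts

print(count_hits(sequential))  # 12 of 16 accesses hit
print(count_hits(repeated))    # 15 of 16 accesses hit
print(count_hits(strided))     # 0 of 16 accesses hit
```

Fetching a whole block on a miss is what turns spatial locality into hits: only the first word of each block misses in the sequential case.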

2.3 Main Memory

A collection of words used for storing programs or data. Each word may consist of one or more bytes. Physically, a memory of N words can be constructed using an N word SRAM or DRAM.

A byte in memory is often called a memory cell. Each cell in the memory can be located uniquely by its address.

2.3.2 Endianism

The arrangement of multi-byte data in memory follows one of two conventions set independently by Intel and Motorola.

When transferring multi-byte data from one machine to another with a different byte ordering, the byte order of the transferred data must be reversed before the target machine can interpret it correctly. Ignoring this will cause serious errors.

Little Endian

Store the low-order byte first (at the lowest address) and the high-order byte last.

Big Endian

Store the high-order byte first (at the lowest address) and the low-order byte last.
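
The two byte orders can be seen directly using Python's built-in int.to_bytes and int.from_bytes. The example value 0x12345678 and the variable names are arbitrary choices for illustration.

```python
value = 0x12345678

# Lay the same 32-bit value out in memory under each convention.
little = value.to_bytes(4, byteorder="little")
big = value.to_bytes(4, byteorder="big")

print(little.hex())  # 78563412: low-order byte first
print(big.hex())     # 12345678: high-order byte first

# Reading big-endian bytes on a little-endian machine without conversion
# gives a different number.
misread = int.from_bytes(big, byteorder="little")
print(hex(misread))  # 0x78563412

# Reversing the byte order before interpreting recovers the original value.
recovered = int.from_bytes(big[::-1], byteorder="little")
print(hex(recovered))  # 0x12345678
```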

2.4 Memory Mapped IO

The same address bus is used to address both memory and IO devices. The memory and registers of the IO devices are mapped to address values. So when an address is accessed by the CPU, it may refer to a portion of physical RAM, but it can also refer to memory of the IO device.

Thus, the CPU instructions used to access memory can also be used for accessing devices. To accommodate the IO devices, areas of the address space used by the CPU must be reserved for IO and must not be available for normal physical memory.
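
A minimal sketch of this address decoding is shown below. It assumes a hypothetical 16-bit address space in which everything from 0xFF00 upwards is reserved for IO; the class name Bus, the constants and the dictionary standing in for device registers are all illustrative, not taken from a real system.

```python
ADDRESS_SPACE = 0x10000  # hypothetical 16-bit address space
IO_BASE = 0xFF00         # hypothetical start of the region reserved for IO


class Bus:
    """Routes reads and writes to RAM or to device registers by address."""

    def __init__(self):
        self.ram = bytearray(IO_BASE)  # RAM only covers addresses below IO_BASE
        self.device_registers = {}     # stand-in for IO device registers

    def _check(self, address):
        if not 0 <= address < ADDRESS_SPACE:
            raise ValueError(f"address out of range: {address:#x}")

    def write(self, address, value):
        self._check(address)
        if address >= IO_BASE:
            self.device_registers[address] = value  # goes to the IO device
        else:
            self.ram[address] = value               # goes to physical memory

    def read(self, address):
        self._check(address)
        if address >= IO_BASE:
            return self.device_registers.get(address, 0)
        return self.ram[address]


bus = Bus()
bus.write(0x0010, 42)  # ordinary memory write
bus.write(0xFF00, 7)   # same operation, but it lands on a device register
print(bus.read(0x0010), bus.read(0xFF00))  # prints 42 7
```

The same read and write operations serve both memory and devices; the cost is that the 256 addresses from IO_BASE upwards are not backed by RAM.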

3.1 Interrupt and IO

Busy status programming: Keeping the CPU busy in a loop which repeatedly tests the interface for a change, wasting much time and performance waiting for something that may not happen.

Interrupt request signal: Let the keyboard tell the CPU, by sending an interrupt request signal, that it has been used and needs attention. This is called interrupt programming.

An interrupt is an event that forces the CPU to break its sequence of actions and jump to execute an interrupt-service routine (a piece of program stored at a different location, designed to handle the interrupt condition).

Interrupts occur at a time not controlled by the programmer. They can be generated by hardware or software. 
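
The difference between the two styles can be sketched in software. This is only an analogy under stated assumptions: the Keyboard class, its ready_after parameter and the callback standing in for an interrupt-service routine are hypothetical, and a real interrupt is a hardware signal rather than a function call.

```python
class Keyboard:
    """Toy device: becomes ready only after a given number of status checks."""

    def __init__(self, ready_after=0):
        self.ready_after = ready_after
        self.polls = 0
        self.handler = None

    def status_ready(self):
        # Busy status programming: the CPU calls this over and over.
        self.polls += 1
        return self.polls > self.ready_after

    def on_interrupt(self, handler):
        # Interrupt programming: register a service routine once.
        self.handler = handler

    def key_pressed(self, key):
        # The device signals the CPU; the service routine then runs.
        if self.handler is not None:
            self.handler(key)


def busy_wait(device):
    """Loop until the device is ready; return the number of wasted checks."""
    wasted = 0
    while not device.status_ready():
        wasted += 1
    return wasted


wasted_checks = busy_wait(Keyboard(ready_after=1000))
print(wasted_checks)  # prints 1000

received = []
keyboard = Keyboard()
keyboard.on_interrupt(received.append)  # no polling loop needed
keyboard.key_pressed("a")
print(received)  # prints ['a']
```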

3.3 IO Modules (IO Interfaces)

External devices are not generally connected directly into the bus structure of the computer. The wide variety of devices require their own logic interfaces because of issues such as mismatched data rates and different data representations.

The IO module or interface provides a standard interface to the CPU and the bus, and is tailored to a specific IO device and its interface requirements. It relieves the CPU of the management of the IO devices.

The interface consists of control signals, status signals and data signals.

Actions taken by CPU when an interrupt request is accepted:

  1. Stop current execution
  2. Push PC and CCR on system stack
  3. Set new PC to start interrupt-service routine

The reason for storing the old PC is so that the CPU knows from which instruction to resume the interrupted main program, after the interrupt is serviced. The reason for saving the old CCR is to prevent important information for the main program from being altered by the interrupt, as an interrupt can occur at any point in a program.
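
The save and restore of PC and CCR can be sketched as follows. The CPU class, its method names and the example addresses are hypothetical; the return step is not listed above and is included only to show why the saved values are needed.

```python
class CPU:
    """Toy CPU holding only the state relevant to interrupt handling."""

    def __init__(self):
        self.pc = 0      # program counter
        self.ccr = 0     # condition code register (flags)
        self.stack = []  # system stack

    def accept_interrupt(self, isr_address):
        # Stop current execution, save the context, then jump to the routine.
        self.stack.append(self.pc)
        self.stack.append(self.ccr)
        self.pc = isr_address

    def return_from_interrupt(self):
        # Restore in reverse order of the pushes.
        self.ccr = self.stack.pop()
        self.pc = self.stack.pop()


cpu = CPU()
cpu.pc = 0x0100    # main program is about to execute this address
cpu.ccr = 0b0101   # flags produced by the main program so far

cpu.accept_interrupt(isr_address=0x2000)
cpu.ccr = 0b0000   # the service routine changes the flags while it runs
cpu.return_from_interrupt()

print(hex(cpu.pc), bin(cpu.ccr))  # prints 0x100 0b101
```

Without the saved CCR, the main program would resume with the flags left behind by the service routine, and a conditional branch following the interruption point could go the wrong way.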

4.2 Bus system

Typical buses consist of 100-150 lines and are divided into three parts: the address bus, the data bus and the control bus.

Bus performance is limited by two factors. The first is the data propagation delay through the bus itself: longer buses have longer delays. The second is the aggregate demand for access to the bus from all the devices connected to it.

To avoid bottlenecks, most systems use multiple buses arranged in a hierarchy, with high-speed, limited-access buses close to the processor and slower, general-access buses farther away from the processor.

Address bus
  • Used to pass addresses for memory or peripherals - specifying source/destination of data transfer
  • Width determines the maximum memory capacity the system can address
Data bus
  • Used to pass program instructions and data
  • Width is key in determining overall performance
Control bus
  • Used to pass control signals to control access to and use of the memory and peripherals
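
The role of the two bus widths can be made concrete with a little arithmetic. The sketch below assumes that each address selects one location and that each bus transfer moves one full data-bus width; the function names are illustrative.

```python
def addressable_locations(address_lines):
    """Number of distinct addresses an address bus of this width can express."""
    return 2 ** address_lines


def transfers_needed(num_bytes, data_bus_bits):
    """Bus transfers needed to move num_bytes, one bus width per transfer."""
    bytes_per_transfer = data_bus_bits // 8
    return -(-num_bytes // bytes_per_transfer)  # ceiling division


print(addressable_locations(16))  # prints 65536
print(addressable_locations(32))  # prints 4294967296
print(transfers_needed(8, 16))    # prints 4
print(transfers_needed(8, 64))    # prints 1
```

Each extra address line doubles the number of addressable locations, while a wider data bus reduces the number of transfers needed for the same amount of data.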