MIPS assembler and ISA

MIPS - Microprocessor without Interlocked Pipeline Stages

All important ISAs today are based on registers. MIPS has 32 general-purpose registers (GPRs) and 32 floating-point registers (FPRs). The MIPS ISA is a load-store architecture. This means that the only operations that interact with memory are loads and stores.

ARM, very common in mobile devices, is also a load-store ISA.

x86 is a register-memory ISA because it has other kinds of instructions that interact with memory. For example, there may be an instruction that adds the contents of a register to the contents of a memory address and stores the result in a register.
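The difference between the two styles can be sketched with a toy Python model. This is only an illustration: the register numbers, the address, the values, and the function names `load_store_add` and `register_memory_add` are all made up for the example and do not correspond to real MIPS or x86 instructions.

```python
# Toy contrast between a load-store add and a register-memory add.
# Register numbers, addresses, and values are made up for illustration.
registers = {1: 10, 2: 0, 3: 0}
memory = {100: 32}


def load_store_add(dest, src, addr, temp):
    """Load-store style: a separate load, then a register-only add."""
    registers[temp] = memory[addr]                        # load into a register
    registers[dest] = registers[src] + registers[temp]    # add uses registers only
    return 2                                              # instructions used


def register_memory_add(dest, src, addr):
    """Register-memory style: one instruction reads memory as an operand."""
    registers[dest] = registers[src] + memory[addr]
    return 1                                              # instructions used


print(load_store_add(2, 1, 100, 3), registers[2])
```

Both functions leave the same sum in the destination register. The load-store version needs an extra instruction and a temporary register, but each of its instructions is simpler.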

MIPS operations can work on 8-, 16-, 32-, and 64-bit integers and on 32- and 64-bit IEEE 754 floating-point values. MIPS is a Reduced Instruction Set Computer (RISC) architecture because it has a relatively small set of simple instructions. x86 is called a Complex Instruction Set Computer (CISC) architecture because it has a large number of instructions, many of them complex.

CISC and RISC

CISC

The primary goal of a CISC architecture is to complete a task in as few lines of assembly as possible. This is achieved by building processor hardware that can understand and execute a series of operations from a single instruction. As a result, the CPU hardware is more complex.

RISC

RISC processors use only simple instructions that can be executed, if possible, within one clock cycle. By implication, more lines of assembly are needed to perform a task.

MIPS memory model

Memory is composed of byte-addressable units laid end to end. Each location holds 8 bits and has its own address. Addresses are frequently written in hex rather than decimal.
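The memory model above can be sketched in Python as a flat array of bytes, one per address. This is only an illustration: the 16-byte size and the names `memory` and `dump` are assumptions made for the example, not part of MIPS.

```python
# A toy model of byte-addressable memory: one 8-bit location per address.
# The 16-byte size is an arbitrary assumption for illustration.
memory = bytearray(16)

# Store one byte at address 0x0A (decimal 10).
memory[0x0A] = 0xFF


def dump(mem):
    """Return one line per location, with the address and value in hex."""
    return [f"0x{addr:02X}: 0x{value:02X}" for addr, value in enumerate(mem)]


lines = dump(memory)
print(lines[0x0A])
```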

MIPS operations

MIPS operations can be classified into the following types:

  • Data transfer - load, store
  • Arithmetic and logical - add, sub, mul, div, compare
  • Floating point arithmetic - add, sub, div, mul
  • Control - conditional branch, jump

Accessing memory

A load copies data from memory into a register. A store copies data from a register to memory.

Examples:

  • LB - Load byte - 8 bits/one byte
  • LH - Load half word - 16 bits/two bytes
  • LW - Load word - 32 bits/four bytes
  • LD - Load double word - 64 bits/eight bytes
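The four load widths can be modelled with a short Python sketch. It rests on two assumptions that the text above does not state: big-endian byte order (MIPS hardware can run in either byte order) and sign extension of the loaded value. The names `SIZES` and `load` and the memory contents are hypothetical.

```python
# Toy model of the four load widths. Assumptions for illustration only:
# big-endian byte order and sign extension of the loaded value.
memory = bytes([0x80, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07])

SIZES = {"LB": 1, "LH": 2, "LW": 4, "LD": 8}  # bytes read by each load


def load(op, mem, addr):
    """Read SIZES[op] bytes starting at addr and return a signed integer."""
    size = SIZES[op]
    return int.from_bytes(mem[addr:addr + size], byteorder="big", signed=True)


print(load("LB", memory, 0))       # one byte: 0x80, read as a signed value
print(hex(load("LH", memory, 2)))  # two bytes: 0x02 0x03
print(hex(load("LW", memory, 4)))  # four bytes: 0x04 0x05 0x06 0x07
```

Each load starts at the same kind of byte address and differs only in how many consecutive bytes it copies into the register.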

Addressing memory

The following instruction is typical of a MIPS assembly program. It loads one word (32 bits) from memory into register 4.

lw $4, 12($29)

Note how the location in memory is specified:

base address in register ($29) plus the offset in bytes (12 bytes in this case)

This is known as displacement (base plus offset) addressing. Note that the offset is counted in bytes, because memory is byte addressable.
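The address calculation for `lw $4, 12($29)` can be traced in a small Python sketch. The contents of register 29, the memory size, the stored value, and the big-endian byte order are all made-up assumptions for the example, and `effective_address` and `lw` are hypothetical helper names.

```python
# Toy model of displacement addressing for `lw $4, 12($29)`.
# The register contents and memory size below are made-up values.
registers = [0] * 32
registers[29] = 0x20          # hypothetical base address held in $29
memory = bytearray(64)
memory[0x2C:0x30] = (1234).to_bytes(4, "big")  # word stored at 0x20 + 12


def effective_address(base_reg, offset):
    """Effective address = contents of the base register + byte offset."""
    return registers[base_reg] + offset


def lw(dest_reg, offset, base_reg):
    """Load the 4-byte word at the effective address into dest_reg."""
    addr = effective_address(base_reg, offset)
    registers[dest_reg] = int.from_bytes(memory[addr:addr + 4], "big", signed=True)


lw(4, 12, 29)
print(hex(effective_address(29, 12)), registers[4])
```

With the base register holding 0x20, the offset of 12 bytes gives an effective address of 0x2C, and the word stored there ends up in register 4.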

Register conventions

The hardware itself imposes very few restrictions on how the register set is used. However, to establish some commonality among programmers, calling conventions are defined. This is important when subroutines are written by different programmers or are provided in a library.

Common conventions for MIPS are:

  • O32
  • N32
  • N64

A compiler for a high-level language must state which convention it uses. Code compiled under different conventions cannot be freely mixed.

Execution model for MIPS

Processors promise that the execution of instructions will appear to be atomic and sequential.

Atomic - each instruction is executed completely, as a single unit. Sequential - the instructions in a list are executed one after another, in order.

Every instruction is executed in a fixed sequence of steps. Because instructions differ, the details of each step depend on the type of instruction.

  1. Instruction FETCH from memory
    Send out the address held in the program counter. Fetch the instruction from memory into the instruction register.
  2. Instruction DECODE and read registers concurrently
    Decode the instruction and access the register file to read the source registers
  3. EXECUTION on the operands
    The ALU operates on operands prepared in step 2
  4. MEMORY access and branch completion
    Access memory if needed, perform load or store
  5. WRITE back
    Write result back into the register file
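The five steps above can be walked through with a toy Python interpreter that handles just two instruction types. It is a sketch under loud assumptions: instructions are pre-decoded tuples rather than real 32-bit encodings, the program counter is an index into a list rather than a byte address, and every register value, address, and name (`program`, `step`) is invented for the example.

```python
# Toy walk through the five steps for two instruction types.
# Instructions are pre-decoded tuples, not real MIPS encodings, and all
# values here are assumptions for illustration.
registers = [0] * 32
registers[29] = 8
registers[5] = 30
memory = {20: 12}                     # a value stored at byte address 20
program = [
    ("lw", 4, 12, 29),                # lw  $4, 12($29)
    ("add", 6, 4, 5),                 # add $6, $4, $5
]


def step(pc):
    """Run one instruction through the five steps and return the next pc."""
    instr = program[pc]               # 1. FETCH the instruction at the pc
    op = instr[0]                     # 2. DECODE and read source registers
    if op == "lw":
        dest, offset, base = instr[1:]
        a, b = registers[base], offset
    else:                             # "add"
        dest, src1, src2 = instr[1:]
        a, b = registers[src1], registers[src2]
    result = a + b                    # 3. EXECUTE: the ALU adds the operands
    if op == "lw":
        result = memory[result]       # 4. MEMORY access (loads only here)
    registers[dest] = result          # 5. WRITE back to the register file
    return pc + 1


pc = 0
while pc < len(program):
    pc = step(pc)
print(registers[4], registers[6])
```

Note how the same ALU addition serves both instructions: for `lw` it produces the memory address, and for `add` it produces the final result, so the memory step does nothing for `add`.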

If each step takes one clock cycle, a complete instruction takes at most five clock cycles. The number of clock cycles an instruction takes is called CPI (cycles per instruction).

Execution time of a program

The number of instructions is determined by the ISA and the compiler. Clock cycles per instruction (CPI) is determined by the implementation, that is, the hardware microarchitecture.

Execution time = (Instructions/Program) * (Cycles/Instruction) * (Time/Cycle)
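As a rough worked example of the formula above, the three factors can be multiplied out in Python. The instruction count, CPI, and clock rate used here are made-up numbers, and `execution_time` is a hypothetical helper name.

```python
# Worked example of the three-factor product.
# All three input values are made-up numbers for illustration.
def execution_time(instructions, cpi, clock_hz):
    """Seconds = instructions * cycles per instruction * seconds per cycle."""
    cycles = instructions * cpi
    return cycles / clock_hz          # dividing by Hz multiplies by time/cycle


# 1 million instructions, 5 cycles each, on a 100 MHz clock.
t = execution_time(1_000_000, 5, 100_000_000)
print(t)
```

Five million cycles at 100 million cycles per second come to about 0.05 seconds. Halving the CPI or doubling the clock rate would each halve that time.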

Pipelining

Although processors give the appearance of executing instructions atomically and sequentially, actually executing them that way would be too slow.

Techniques are used to speed things up inside the CPU. One such technique is pipelining, an implementation in which the execution of multiple instructions is overlapped. It takes advantage of the parallelism that exists among the steps needed to execute an instruction.
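The benefit of overlapping can be estimated with an idealized cycle count. The sketch below assumes five one-cycle stages and no stalls or hazards, which real pipelines do not guarantee, and the function names are hypothetical.

```python
# Idealized cycle counts with and without pipelining.
# Assumes five one-cycle stages and no stalls or hazards.
STAGES = 5


def cycles_unpipelined(n):
    """Each instruction finishes all stages before the next one starts."""
    return n * STAGES


def cycles_pipelined(n):
    """After the first instruction, one more instruction finishes per cycle."""
    return STAGES + (n - 1) if n > 0 else 0


for n in (1, 5, 100):
    print(n, cycles_unpipelined(n), cycles_pipelined(n))
```

A single instruction still takes five cycles either way. The gain comes from throughput: for long instruction sequences, the pipelined count approaches one cycle per instruction.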