Pico CPU: An Introduction

Brief overview

A CPU is basically made of three main parts:
  1. DataPath Unit (DPU): consists of an Arithmetic Logic Unit (ALU) and some registers. Instructions are executed in the DPU.
  2. Control Unit: takes care of fetching instructions from Instruction Memory, decoding them, and generating control signals for the DataPath Unit and addresses for Data Memory.
  3. Memory: keeps instructions and also holds data. We might divide the memory unit into two distinct memories (one for instructions and one for data) or keep them together as one (but then we have to somehow protect the Instruction Memory).
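
To make this division of labour concrete, here is a minimal Python sketch of the three parts working together. The toy opcodes (`LOADI`, `ADD`, `HALT`), the single accumulator, and all function names are assumptions made up for illustration; they are not taken from the Pico CPU's actual design.

```python
# Toy model of the three CPU parts. Everything here (opcode names, the
# single accumulator, the tuple instruction format) is hypothetical.

def alu(op, a, b):
    """DataPath Unit: the ALU performs the arithmetic."""
    if op == "ADD":
        return a + b
    if op == "PASS_B":
        return b
    raise ValueError(f"unknown ALU operation: {op}")

def run(instruction_memory, data_memory):
    """Control Unit: fetch, decode, and drive the datapath."""
    pc = 0    # program counter
    acc = 0   # accumulator register in the datapath
    while True:
        opcode, operand = instruction_memory[pc]   # fetch
        pc += 1
        if opcode == "LOADI":                      # ACC <- immediate
            acc = alu("PASS_B", acc, operand)
        elif opcode == "ADD":                      # ACC <- ACC + M[operand]
            acc = alu("ADD", acc, data_memory[operand])
        elif opcode == "HALT":
            return acc
        else:
            raise ValueError(f"unknown opcode: {opcode}")

# Memory: here split into an instruction memory and a data memory.
program = [("LOADI", 5), ("ADD", 0), ("ADD", 1), ("HALT", 0)]
data = [10, 20]
result = run(program, data)
```

In this sketch the control unit only decides what happens next, while every computation goes through `alu`, mirroring the split described in the list.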

Harvard vs. Von Neumann architecture

Harvard and Von Neumann are the two main classes of computer architecture. They differ in the following aspects:
  1. In a Von Neumann architecture, a single memory holds both data and instructions; in a Harvard architecture, data and instructions are kept in separate memories.
  2. In a Von Neumann architecture, there is one data bus and one address bus; in a Harvard architecture, there are two sets of buses (one for data memory and one for instruction memory).
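
The practical consequence of the first point can be sketched in Python. The two classes below are hypothetical models written for this illustration, not part of any real toolchain. They show that a data store can overwrite an instruction in a shared memory, while it cannot in a split one.

```python
# Hypothetical memory models; the class and method names are illustrative.

class VonNeumannMemory:
    """One address space shared by instructions and data."""
    def __init__(self, size):
        self.cells = [0] * size

    def fetch_instruction(self, address):
        return self.cells[address]

    def load(self, address):
        return self.cells[address]

    def store(self, address, value):
        self.cells[address] = value


class HarvardMemory:
    """Two separate address spaces, each reached over its own bus."""
    def __init__(self, instructions, data_size):
        self.instructions = tuple(instructions)  # data stores never reach this
        self.data = [0] * data_size

    def fetch_instruction(self, address):
        return self.instructions[address]

    def load(self, address):
        return self.data[address]

    def store(self, address, value):
        self.data[address] = value


vn = VonNeumannMemory(8)
vn.store(0, "JUMP 3")   # an instruction placed at address 0
vn.store(0, 42)         # a later data store to address 0 overwrites it

hv = HarvardMemory(["JUMP 3"], data_size=8)
hv.store(0, 42)         # only data address 0 changes
```

This is why a shared memory needs some form of protection for the instruction region, whereas the Harvard arrangement gets it by construction.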

Micro-Programming

Micro-programming was introduced in the early 1950s, building on the stored-program concept of the 1940s, in which program instructions and data are stored in memory. Micro-operations are close to machine language but are not identical to it. A micro-assembler program is needed to translate them into machine code, which is later programmed into the instruction memory of the processor.

Register Transfer Language

Micro-operations can be described using Register Transfer Language (RTL). There are a few rules for writing micro-operations in RTL:
  1. Register names use capital letters: PC (for program counter), R0...
  2. A control function is a boolean condition that enables a transfer: P: R1 ← R2
  3. M[AR] is the contents of the memory block at address "AR"
  4. A single RTL statement can contain an arithmetic operation: M[AR] ← R1 + R2
  5. Program jumps happen by manipulating the program counter (PC)
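
The rules above can be mimicked with a few lines of Python. This is only a sketch under assumed state: the register values, the memory size, and the helper `transfer` are invented for the example.

```python
# Hypothetical machine state: register names follow the RTL rules above.
regs = {"PC": 0, "AR": 4, "R0": 0, "R1": 7, "R2": 3}
mem = [0] * 16   # M[...]

def transfer(condition, dest, value):
    """Model `P: dest <- value`: the transfer happens only when P is true."""
    if condition:
        regs[dest] = value

# P: R1 <- R2 with P false: the control function blocks the transfer.
transfer(False, "R1", regs["R2"])
r1_after_disabled = regs["R1"]

# M[AR] <- R1 + R2 (an arithmetic operation inside one RTL statement)
mem[regs["AR"]] = regs["R1"] + regs["R2"]

# P: R1 <- R2 with P true: the transfer takes place.
transfer(True, "R1", regs["R2"])

# A jump is just a write to the program counter: PC <- 9
regs["PC"] = 9
```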

Instruction Format

An instruction usually contains one part that defines the operation (called the OPCODE) and one or more operands. These operands are either data or addresses. When selecting an instruction format, we have to keep the following factors in mind:
  1. The length of the opcode determines the number of instructions you can possibly have (a five-bit opcode means you can have up to 32 instructions).
  2. The addressing modes, and the amount of addressability your instruction provides. For accessing instructions you can have the following addressing modes:
    1. Absolute or direct: in this mode, you directly state the address you want to use (example: jump #000111001 will result in PC ← 000111001).
    2. PC-relative: this mode adds the offset operand to the PC. The result is PC ← PC + offset.
    3. Register indirect: uses the contents of a register as the address.
    For accessing data you can have the following addressing modes:
    1. Register: reads data from a register.
    2. Base plus offset: the instruction holds a base address, an offset is added to it, and the value at MEM(base+offset) is used as the operand.
    3. Immediate: the data is given directly in the instruction.
    4. Implicit: there is no operand, but the opcode clearly defines what is done to which data (example: increment accumulator).
  3. How easy the instructions are to decode.
  4. Whether the instruction length is fixed or not.
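
As an illustration of the first and last factors, the sketch below packs and unpacks a fixed-length instruction word. The 5-bit opcode and 3-bit operand widths are assumptions chosen to match the example in the list; they do not describe any particular processor.

```python
OPCODE_BITS = 5    # assumed field width: 2**5 = 32 possible opcodes
OPERAND_BITS = 3   # assumed field width for a single operand

def encode(opcode, operand):
    """Pack opcode and operand into one fixed-length instruction word."""
    if not 0 <= opcode < 2 ** OPCODE_BITS:
        raise ValueError("opcode does not fit in the opcode field")
    if not 0 <= operand < 2 ** OPERAND_BITS:
        raise ValueError("operand does not fit in the operand field")
    return (opcode << OPERAND_BITS) | operand

def decode(word):
    """Split an instruction word back into (opcode, operand)."""
    opcode = word >> OPERAND_BITS
    operand = word & (2 ** OPERAND_BITS - 1)
    return opcode, operand

max_instructions = 2 ** OPCODE_BITS
word = encode(0b10110, 0b101)
fields = decode(word)
```

With fixed field positions, decoding is a shift and a mask, which is one reason fixed-length formats are easy to decode in hardware.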

Instruction Set Architecture

We can classify computers based on their Instruction Set Architecture (ISA). There are two main ISA classes, which are discussed below.

RISC

In a Reduced Instruction Set Computing (RISC) architecture, instructions are relatively simple, so performing a complex operation requires executing multiple instructions. A RISC system has a small set of instructions and usually accesses memory only through separate load and store instructions.

CISC

A Complex Instruction Set Computer (CISC) is an architecture in which a single instruction is more complex than a RISC instruction. One instruction can fetch operands from memory, execute an operation, and write the result back to memory. This means each instruction takes longer to execute, which lowers the number of instructions per clock cycle the processor can execute. In return, complex instructions make programming much easier. Table 1 summarizes some pros and cons of the RISC and CISC architectures.
Table 1: RISC and CISC comparison
RISC                            | CISC
Simple instructions             | Complex instructions
More complex software           | More complex hardware
More control on code efficiency | Less control on code efficiency
Simple addressing modes         | More complex addressing modes
Fixed instruction size          | Different instruction size according to instruction type
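
The trade-off can be sketched with a toy interpreter that computes M[2] ← M[0] + M[1] in both styles. The mnemonics `LOAD`, `ADD`, `STORE`, and `ADDM` are hypothetical and chosen only for this example.

```python
# Hypothetical mnemonics (LOAD, ADD, STORE, ADDM) chosen for illustration.

def run(program, mem):
    """Execute a program on the given memory; return the instruction count."""
    regs = [0] * 4
    executed = 0
    for op, *args in program:
        if op == "LOAD":      # R[d] <- M[a]
            d, a = args
            regs[d] = mem[a]
        elif op == "ADD":     # R[d] <- R[s1] + R[s2]
            d, s1, s2 = args
            regs[d] = regs[s1] + regs[s2]
        elif op == "STORE":   # M[a] <- R[s]
            a, s = args
            mem[a] = regs[s]
        elif op == "ADDM":    # M[d] <- M[a] + M[b], all in one instruction
            d, a, b = args
            mem[d] = mem[a] + mem[b]
        else:
            raise ValueError(f"unknown opcode: {op}")
        executed += 1
    return executed

# RISC style: memory is touched only by explicit loads and stores.
risc_program = [("LOAD", 0, 0), ("LOAD", 1, 1), ("ADD", 2, 0, 1), ("STORE", 2, 2)]
# CISC style: one instruction fetches, adds, and writes back.
cisc_program = [("ADDM", 2, 0, 1)]

risc_mem = [4, 6, 0]
cisc_mem = [4, 6, 0]
risc_count = run(risc_program, risc_mem)
cisc_count = run(cisc_program, cisc_mem)
```

Both programs leave the same result in memory. The RISC version needs four simple instructions, while the CISC version needs one instruction that does more work internally.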

Micro-Architecture

Micro-architecture is the implementation scheme of a given Instruction Set Architecture (ISA). For any given ISA, there may be many different micro-architectures, which differ according to design requirements such as:
  1. Chip area
  2. Power consumption
  3. Performance

Measuring Performance

To measure an architecture's performance, one can consider the number of instructions executed per clock cycle (IPC) or, inversely, the number of clock cycles the architecture needs to execute one instruction (CPI). Another approach also takes the implementation and technology into account and measures the number of instructions a specific processor can execute per second. This metric is called IPS and is usually expressed in millions (MIPS) or billions (GIPS) of instructions per second.
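
The relationship between these metrics can be worked through in a few lines of Python. The clock frequency and CPI below are assumed example figures, not measurements of any real processor, and CPI is treated as an average over the whole program.

```python
def ipc_from_cpi(cpi):
    """IPC is the reciprocal of (average) CPI."""
    return 1.0 / cpi

def mips(clock_hz, cpi):
    """Instructions per second = clock frequency / CPI, scaled to millions."""
    ips = clock_hz / cpi
    return ips / 1e6

# Assumed example figures, not measurements of any real processor.
clock_hz = 50e6   # 50 MHz clock
cpi = 4.0         # four cycles per instruction on average
ipc = ipc_from_cpi(cpi)
rate = mips(clock_hz, cpi)
```

Under these assumptions the processor retires a quarter of an instruction per cycle, which at 50 MHz amounts to 12.5 MIPS.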

Addressing modes

A memory cell can be addressed in different ways (briefly described under Instruction Format):
  1. Direct addressing: you write the memory address of the destination directly into the instruction.
  2. Relative (or displacement) addressing: you address the destination relative to the current address.
  3. Indirect addressing: the address is stored in a memory block, and you give the address of that memory block to the processor. The processor fetches the contents of that block and uses them as the destination address. This is very similar to pointers in common programming languages.
  4. Register addressing: the operand is not fetched from memory but is kept in a register file, and you give the address of that specific register to the processor. Since the register file (RF) is limited in size, you cannot store all your data in it.
  5. Register indirect addressing: combining register addressing with indirect addressing, a register holds the address of the operand.
  6. Indexed addressing: you add a displacement to a base address and use the result as the destination address. This mode is very handy when working with arrays.
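
The six modes listed above differ only in how the operand is located, which a short Python sketch can make explicit. The memory contents, register values, and function names are all hypothetical and exist only for this illustration.

```python
# Hypothetical machine state for illustration.
mem = [0] * 32
regs = [0] * 4
mem[5] = 111    # plain data
mem[6] = 20     # a pointer: this cell holds the address 20
mem[20] = 222
mem[12] = 333
mem[13] = 444   # array element: base 10, index 3
regs[1] = 12    # register holding an address
regs[2] = 555   # register holding data
pc = 8          # current address

def direct(address):
    return mem[address]

def relative(current, offset):
    return mem[current + offset]

def indirect(pointer_address):
    return mem[mem[pointer_address]]   # two memory accesses, like *p

def register(index):
    return regs[index]                 # no memory access at all

def register_indirect(index):
    return mem[regs[index]]

def indexed(base, displacement):
    return mem[base + displacement]    # handy for arrays: base[displacement]

operands = {
    "direct": direct(5),
    "relative": relative(pc, 4),
    "indirect": indirect(6),
    "register": register(2),
    "register_indirect": register_indirect(1),
    "indexed": indexed(10, 3),
}
```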

Additional Material

  1. Richard Feynman's lecture on Computer Heuristics
  2. Machine Architecture course at Virginia Tech
  3. Computer Architecture course at UBC
  4. Computer Architecture course on Coursera from Princeton University