EE380 Assignment 5 Solution

100% Consider the following two MIPS subset implementations:

Which of the following four statements about how pipelining changes the architecture is false?
The Data Memory module could be the same circuit in both implementations
The Instruction Memory module could be the same circuit in both implementations
The ALU used to add 4 to the PC could be the same circuit in both implementations
The ALU used for operations like add and xor could be the same circuit in both implementations
None of the above four statements is false; in fact, all of the modules can be the same circuits in both implementations because pipelining only adds buffers, changes/adds some datapaths, and modifies the control logic
This is the whole point of doing the single-cycle design...
50%, 50% Pipelined designs generally achieve higher performance than similar single-cycle designs by allowing a higher clock rate, but the clock rate with a 5-stage pipeline is generally somewhat less than 5X that of the single-cycle design it was derived from (e.g., compare the two MIPS implementations given in question 1). Give one reason why the clock rate is less than 5X.
The most common reason is that dividing the single-cycle design into stages produces stages with somewhat different delays, and the clock rate is determined by the stage with the longest delay. In addition, buffers must be introduced between stages; they add some delay of their own, which slows the clock somewhat further.
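Both effects can be put into numbers with a small sketch. The stage delays and the buffer delay below are illustrative assumptions, not values from the assignment; the point is only that an unbalanced split plus buffer overhead keeps the clock speedup under 5X.

```python
# Hypothetical logic delays in picoseconds for IF, ID, EX, MEM, WB.
# These values are assumptions chosen for illustration only.
STAGE_DELAYS_PS = [200, 100, 200, 200, 100]
BUFFER_DELAY_PS = 20  # assumed delay added by each inter-stage buffer


def single_cycle_period(stage_delays):
    """A single-cycle design needs one clock long enough for all the logic."""
    return sum(stage_delays)


def pipelined_period(stage_delays, buffer_delay):
    """The pipelined clock must fit the slowest stage plus its buffer."""
    return max(stage_delays) + buffer_delay


def clock_speedup(stage_delays, buffer_delay):
    """Ratio of the pipelined clock rate to the single-cycle clock rate."""
    return single_cycle_period(stage_delays) / pipelined_period(stage_delays, buffer_delay)


if __name__ == "__main__":
    print(single_cycle_period(STAGE_DELAYS_PS))                        # 800
    print(pipelined_period(STAGE_DELAYS_PS, BUFFER_DELAY_PS))          # 220
    print(round(clock_speedup(STAGE_DELAYS_PS, BUFFER_DELAY_PS), 2))   # 3.64
    print(clock_speedup([160] * 5, 0))                                 # 5.0
```

Only the idealized last case, with perfectly balanced stages and free buffers, reaches the full 5X.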
100% When a modern processor executes a particular conditional branch instruction, it attempts to guess whether the branch is taken or not taken. Which hardware structure allows the processor to make its guess using how that branch instruction behaved when it was executed previously?
BLT
Bacon, Lettuce, and Tomato? No!
BHB
Branch History Buffer
BTB
The Branch Target Buffer holds the branch target address, not taken/not-taken information
Data Cache
Instruction Cache
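To make the role of the Branch History Buffer concrete, here is a minimal sketch of one common textbook organization: a small table of 2-bit saturating counters indexed by low-order bits of the branch address. The table size, the counter width, the initial counter value, and the names `BranchHistoryBuffer` and `count_mispredictions` are all assumptions for illustration; the question does not specify any of them.

```python
class BranchHistoryBuffer:
    """Toy direction predictor: one 2-bit saturating counter per entry."""

    def __init__(self, entries=16):
        self.entries = entries
        # Counter values 0-1 predict not taken, 2-3 predict taken.
        self.counters = [1] * entries  # assumed start: weakly not taken

    def _index(self, pc):
        # MIPS instructions are word aligned, so drop the low two bits.
        return (pc >> 2) % self.entries

    def predict(self, pc):
        """Guess taken (True) or not taken (False) from past behavior."""
        return self.counters[self._index(pc)] >= 2

    def update(self, pc, taken):
        """Record the actual outcome once the branch has resolved."""
        i = self._index(pc)
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)


def count_mispredictions(bhb, pc, outcomes):
    """Run one branch through a sequence of outcomes and count wrong guesses."""
    misses = 0
    for taken in outcomes:
        if bhb.predict(pc) != taken:
            misses += 1
        bhb.update(pc, taken)
    return misses


if __name__ == "__main__":
    # A loop-closing branch: taken nine times, then falls through once.
    loop_branch = [True] * 9 + [False]
    print(count_mispredictions(BranchHistoryBuffer(), 0x00400100, loop_branch))  # 2
```

In this sketch the branch is mispredicted only on its first execution and on loop exit; every other guess comes from the recorded history.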
50%, 50% Consider executing each of the following code sequences on the pipelined MIPS implementation given below (which does not incorporate value forwarding):

Incidentally, both code sequences produce the same final results. Which of the following statements best describes the execution times you would expect to observe?
```
(A)  lw   $t2,0($t0)
     ori  $t1,$t0,4
     add  $t2,$t2,$t3   # depends on lw

(B)  ori  $t1,$t0,4
     lw   $t2,0($t0)
     add  $t2,$t2,$t3   # depends on lw
```
(A) would be faster than (B)
By one clock cycle: in (A) the ori instruction occupies a slot between lw and add that would otherwise have to be filled by a nop
(B) would be faster than (A)
(A) would take the same number of clock cycles as (B)
Which is faster depends on the values being added and ored
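The one-cycle difference can be checked with a small stall counter. This is a sketch under stated assumptions, since the pipeline figure is not reproduced here: it assumes a classic 5-stage pipeline (IF, ID, EX, MEM, WB) with no forwarding, in which an instruction may read a register in ID during the same cycle that the producing instruction writes it in WB. The function name and the instruction tuples are hypothetical.

```python
# Each instruction is (text, destination register, source registers).
LW = ("lw  $t2,0($t0)", "$t2", ["$t0"])
ORI = ("ori $t1,$t0,4", "$t1", ["$t0"])
ADD = ("add $t2,$t2,$t3", "$t2", ["$t2", "$t3"])

SEQ_A = [LW, ORI, ADD]
SEQ_B = [ORI, LW, ADD]


def cycles_without_forwarding(program):
    """Return (total cycles, stall cycles) for a 5-stage pipeline, no forwarding."""
    write_back = {}  # register -> WB cycle of its most recent producer
    prev_id = 1      # the first instruction is in IF in cycle 1 and ID in cycle 2
    stalls = 0
    for _text, dest, sources in program:
        earliest = prev_id + 1  # in-order issue: one ID per cycle at best
        id_cycle = earliest
        for reg in sources:
            # Assumed rule: ID may not precede the producer's WB cycle.
            id_cycle = max(id_cycle, write_back.get(reg, 0))
        stalls += id_cycle - earliest
        write_back[dest] = id_cycle + 3  # EX, MEM, WB follow ID
        prev_id = id_cycle
    return prev_id + 3, stalls  # the last instruction still needs EX, MEM, WB


if __name__ == "__main__":
    print(cycles_without_forwarding(SEQ_A))  # (8, 1)
    print(cycles_without_forwarding(SEQ_B))  # (9, 2)
```

If the register file could not be read in the same cycle it is written, both sequences would need one more stall each, and (A) would still finish one cycle sooner.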
100% Consider executing each of the following code sequences on the pipelined MIPS implementation given below:

Also consider executing them on this design with value forwarding logic and datapaths added. Which of the following statements best describes how the forwarding logic would alter the execution times?
```
(A)  lw   $t1,4($t0)
     sw   $t2,16($t3)
     beq  $t0,$t3,lab

(B)  lw   $t1,4($t0)
     sw   $t1,16($t2)   # depends on lw
     beq  $t1,$t3,lab   # depends on lw
```
Neither (A) nor (B) is affected by forwarding
(A) is not affected, (B) would be faster using forwarding
Because (A) has no dependencies, it isn't helped by forwarding
(A) would be faster using forwarding, (B) is not affected
Both (A) and (B) would be faster using forwarding
The execution time improvements due to forwarding depend on the values in the registers, not on the instructions being executed; thus, it is impossible to say how execution times for (A) and (B) are affected
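The effect of forwarding on the two sequences can be sketched with a simple timing model. The model is an assumption, not the design from the figure: a 5-stage pipeline in which, without forwarding, a consumer's ID must wait for the producer's WB cycle, and with forwarding, every consumer (including sw and beq) takes its operands at the start of EX from the end of the producer's EX, or from the end of MEM for a load. Branch resolution timing and control hazards are ignored. The name `total_cycles` is hypothetical.

```python
# Each instruction is (opcode, destination register or None, source registers).
SEQ_A = [
    ("lw", "$t1", ["$t0"]),
    ("sw", None, ["$t2", "$t3"]),
    ("beq", None, ["$t0", "$t3"]),
]
SEQ_B = [
    ("lw", "$t1", ["$t0"]),
    ("sw", None, ["$t1", "$t2"]),   # depends on lw
    ("beq", None, ["$t1", "$t3"]),  # depends on lw
]


def total_cycles(program, forwarding):
    """Cycles to run `program` on the assumed 5-stage pipeline model."""
    earliest_id = {}  # register -> earliest ID cycle for an instruction reading it
    prev_id = 1       # the first instruction is in IF in cycle 1 and ID in cycle 2
    for opcode, dest, sources in program:
        id_cycle = prev_id + 1
        for reg in sources:
            id_cycle = max(id_cycle, earliest_id.get(reg, 0))
        if dest is not None:
            if not forwarding:
                # The reader's ID must not precede the producer's WB cycle.
                earliest_id[dest] = id_cycle + 3
            elif opcode == "lw":
                # Loaded data is forwarded from the end of MEM to the reader's EX.
                earliest_id[dest] = id_cycle + 2
            else:
                # An ALU result is forwarded from the end of EX to the reader's EX.
                earliest_id[dest] = id_cycle + 1
        prev_id = id_cycle
    return prev_id + 3  # the last instruction still needs EX, MEM, WB


if __name__ == "__main__":
    print(total_cycles(SEQ_A, False), total_cycles(SEQ_A, True))  # 7 7
    print(total_cycles(SEQ_B, False), total_cycles(SEQ_B, True))  # 9 8
```

Under these assumptions (A) takes the same time either way, while (B) drops from two stall cycles to the single load-use stall that forwarding cannot remove.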
