ECE463/563 – Microprocessor Architecture
Homework #1
Problem 1 [10 points]
Answer the following questions. In all cases, show how you obtained the result.
1. Assume that a processor is extended with a new fast floating-point unit that speeds up floating-point operations by
an average factor of 2×. In addition, assume that floating-point operations take 20% of the original program
execution time. What is the overall speedup? [3 points]
Apply Amdahl’s law with fENH=20%, sENH=2.
Speedup = 1/((1 - fENH) + fENH/sENH) = 1/(0.8 + 0.2/2) = 1/0.9 ≈ 1.11
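The calculation above can be checked with a minimal Python sketch. The function name amdahl_speedup and its parameter names are assumptions made for illustration; they are not part of the assignment.

```python
def amdahl_speedup(f_enh: float, s_enh: float) -> float:
    """Overall speedup when a fraction f_enh of the original time is sped up by s_enh."""
    return 1.0 / ((1.0 - f_enh) + f_enh / s_enh)


# Values from the problem statement: 20% of the time, sped up 2x.
speedup = amdahl_speedup(f_enh=0.20, s_enh=2.0)
print(f"{speedup:.2f}")  # prints 1.11
```

Note that even an infinitely fast floating-point unit would give at most 1/0.8 = 1.25 here, since 80% of the execution time is unaffected.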
2. Assume that speeding up the floating-point operations causes a slowdown in the data cache accesses. For
example, given the additional space taken by the more sophisticated floating-point unit, memory operations
take some extra cycles to get to the cache. Specifically, assume that data cache accesses, which consume
10% of the original program execution time, are now slowed down by a factor of 1.5×. What is the overall
speedup? [3 points]
Exec-timeNEW = Exec-timeOLD × (0.2 × 1/2 + 0.1 × 1.5 + 0.7) = 0.95 × Exec-timeOLD
Speedup = Exec-timeOLD/Exec-timeNEW = 1/(0.2 × 1/2 + 0.1 × 1.5 + 0.7) = 1/0.95 ≈ 1.05
3. After implementing the new floating-point unit, what percentage of the execution time is spent on floating-point operations and what percentage is spent on data cache accesses? [4 points]
FP = 0.1/0.95 ~ 10.5%
Memory = 0.15/0.95 ~ 15.8%
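The answers to questions 2 and 3 can be reproduced with a short Python sketch that scales each component of the original execution time and then renormalizes. The dictionary layout and the component names (fp, dcache, other) are assumptions chosen for illustration.

```python
# Each entry: (fraction of the ORIGINAL execution time, factor by which that
# component's time is multiplied after the change; <1 is faster, >1 is slower).
components = {
    "fp":     (0.20, 1 / 2.0),  # floating-point operations: 2x faster
    "dcache": (0.10, 1.5),      # data cache accesses: 1.5x slower
    "other":  (0.70, 1.0),      # everything else: unchanged
}

# New time of each component, relative to an original total time of 1.0.
new_time = {name: frac * scale for name, (frac, scale) in components.items()}
total_new = sum(new_time.values())
speedup = 1.0 / total_new

# Share of the NEW execution time taken by each component.
shares = {name: t / total_new for name, t in new_time.items()}

print(f"speedup = {speedup:.2f}")                 # prints speedup = 1.05
print(f"fp share = {shares['fp']:.1%}")           # prints fp share = 10.5%
print(f"dcache share = {shares['dcache']:.1%}")   # prints dcache share = 15.8%
```

The shares are computed against the new total (0.95), not the original one, which is why the data cache share grows from 10% to about 15.8%.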
Page 2 of 9 ECE463-563 Spring 2022
Problem 2 [30 points]
What would be the expected outcome of the following instructions? [2+2 points]
I1: BEQZ R1 #loop
If value of register R1 is equal to 0, then go to instruction at label #loop. In other words:
if (R1==0) NPC = NPC + #loop
I2: ADD 4(R1) R2 R3 (the first operand is the destination, and the second and third operands are source operands)
The sum of the values of registers R2 and R3 is stored in memory at address (value of R1 + 4). In other words:
Memory[value of (R1)+4] = value of R2 + value of R3
Can the MIPS datapath below handle instructions I1 and I2? [26 points]
If yes, explain how. Specifically:
- Highlight the data lines relevant to each instruction.
- Indicate the data carried by all the highlighted data lines, the values stored in the relevant special purpose
registers, and the values of the control lines of the three multiplexers.
If not, explain why and how you would modify the datapath to support each instruction.
I1: BEQZ R1 #loop
Yes, the branch instruction is supported by the datapath given. See diagram below.
I2: ADD 4(R1) R2 R3
No, this instruction is not supported by the given datapath. This instruction requires two arithmetic operations: one to
compute the result (R2+R3), and one to compute the address at which the result must be stored (i.e., value(R1)+4).
In addition, the instruction reads three registers: R1, R2 and R3. However, the register file has only 2 output ports,
allowing only two registers to be read in each clock cycle.
To support the instruction, we need:
- An extra adder in the EXE stage. The adder will compute value(R1)+4. There would also be an extra register
(AdderOutput) storing the computed memory address.
- An extra special purpose register C, where the value of register R1 will be stored
- A register file with 3 output ports feeding registers A, B and C (as well as 3 index lines)
The special purpose registers will contain the following information.
IR – instruction
A – value of R2
B – value of R3
C – value of R1
Immediate = 4
LMD would not be utilized
NPC would contain the address of the next instruction in program order
Cond would not be utilized
A and B would be fed to the ALU.
C (holding the value of R1) and Immediate would be fed to the extra adder
ALUoutput = value(R2)+value(R3). ALUoutput would be connected to the input data line of memory
AdderOutput = value(R1)+4. AdderOutput would be connected to the address line of memory
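As a rough illustration of the data flow described above, the following Python sketch mimics how I2 would move through the modified datapath. The register contents (100, 7, 35) and the dictionary-based memory are hypothetical values chosen only for this example; the sketch models register transfers, not timing.

```python
# Hypothetical register and memory contents, for illustration only.
regs = {"R1": 100, "R2": 7, "R3": 35}
memory = {}

# ID stage: three register-file read ports fill A, B and the extra register C.
A, B, C = regs["R2"], regs["R3"], regs["R1"]
imm = 4  # Immediate field of the instruction

# EXE stage: the ALU and the extra adder operate in parallel.
alu_output = A + B        # value(R2) + value(R3)
adder_output = C + imm    # value(R1) + 4

# MEM stage: AdderOutput drives the address line, ALUoutput the data line.
memory[adder_output] = alu_output

print(memory)  # prints {104: 42}
```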
Assumptions for remaining problems
In all the problems/questions below, assume that:
• You are operating on a 32-bit RISC processor. In particular, your architecture of reference is the MIPS
processor (datapath shown in Problem 2). You can use only the instructions and the addressing modes
supported by the MIPS datapath and by the 3 instruction formats seen in class.
• You are dealing only with 32-bit integer values.
• You have available 32 integer registers (R0 to R31). You can assume that register R0 stores value 0.
• You can use only the following instructions:
o Memory/data transfers: LW (load), SW (store).
o Arithmetic/logical: any add, subtract, multiply, divide, and, or, xor instruction (i.e., ADD, SUB,
MULT, DIV, AND, OR, XOR)
o Control: BEQZ (=0), BNEZ (≠0), BGTZ (>0), BGEZ (≥0), BLTZ (<0), BLEZ (≤0), JUMP.
Note that the MOV instruction is not included in the list above.
Problem 3 [26 points]
Indicate the MIPS assembly corresponding to the following C blocks of code.
You can assume that the values of variables x and y are stored in registers R1 and R2, respectively.
(a) [6 points]
if (x==1)
    x+=5;
else
    x+=20;
(b)
for (int x=0; x<20; x++)
y=y-2;
For block of code (b), show two solutions: one using only conditional branches, and one that also uses an unconditional
branch (or jump). [7 points for solution #1 + 7 points for solution #2]
Which assembly code is “better”? Why? [3 points]
Code (1) is better because it is more compact (it includes fewer instructions). In addition, in a pipelined processor, branches
(conditional and unconditional) incur the overhead of handling control hazards. Code (1) has only one branch, while code (2)
has two.
How can you easily distinguish a loop from an if-then-else block in an assembly program? [3 points]
Loops require backward branches
If-then-else blocks require forward branches
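The rule of thumb above can be sketched in Python by comparing a branch's own address with its target address. The function name branch_direction and the instruction addresses are hypothetical, chosen only for illustration. A backward branch indicates a loop, while a forward branch on its own does not rule one out, since a loop body may also contain forward branches.

```python
def branch_direction(branch_addr: int, target_addr: int) -> str:
    """Return 'backward' if the target is at or before the branch, else 'forward'."""
    return "backward" if target_addr <= branch_addr else "forward"


# Hypothetical instruction addresses, for illustration only.
loop_branch = (0x110, 0x100)     # branch at 0x110 jumps back to 0x100
if_else_branch = (0x200, 0x20C)  # branch at 0x200 skips ahead to 0x20C

print(branch_direction(*loop_branch))     # prints backward
print(branch_direction(*if_else_branch))  # prints forward
```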