
ECE 563 Solutions - North Carolina State University ECE 563


ECE463/563 – Microprocessor Architecture
Spring 2022
Homework #1

Problem 1 [10 points]

Answer the following questions. In all cases, show how you obtained the result.

1. Assume that a new fast floating-point unit is added to a processor and that it speeds up floating-point operations by an average factor of 2×. In addition, assume that floating-point operations take 20% of the original program execution time. What is the overall speedup? [3 points]

Apply Amdahl's law with f_ENH = 20%, s_ENH = 2.
Speedup = 1/(0.8 + 0.2/2) = 1/0.9 ≈ 1.11

2. Assume that speeding up the floating-point operations causes a slowdown in the data cache accesses. For example, given the additional space taken by the more sophisticated floating-point unit, memory operations take some extra cycles to get to the cache. Specifically, assume that data cache accesses, which consume 10% of the original program execution time, are now slowed down by a factor of 1.5×. What is the overall speedup? [3 points]

Exec-time_NEW = Exec-time_OLD × (0.2 × 1/2 + 0.1 × 1.5 + 0.7) = 0.95 × Exec-time_OLD
Speedup = Exec-time_OLD / Exec-time_NEW = 1/(0.2 × 1/2 + 0.1 × 1.5 + 0.7) = 1/0.95 ≈ 1.05

3. After implementing the new floating-point unit, what percentage of the execution time is spent on floating-point operations and what percentage is spent on data cache accesses? [4 points]

FP = 0.1/0.95 ≈ 10.5%
Memory = 0.15/0.95 ≈ 15.8%

Problem 2 [30 points]

What would be the expected outcome of the following instructions? [2+2 points]

I1: BEQZ R1 #loop
If the value of register R1 is equal to 0, then go to the instruction at label #loop. In other words: if (R1==0) NPC = NPC + #loop

I2: ADD 4(R1) R2 R3 (the first operand is the destination, and the second and third operands are source operands)
The sum of the values of registers R2 and R3 is stored in memory at address (value of R1 + 4). In other words: Memory[value of (R1)+4] = value of R2 + value of R3

Can the MIPS datapath below handle instructions I1 and I2? [26 points]

If yes, explain how.
Specifically:
- Highlight the data lines relevant to each instruction.
- Indicate the data carried by all the highlighted data lines, the values stored in the relevant special purpose registers, and the values of the control lines of the three multiplexers.

If not, explain why and how you would modify the datapath to support each instruction.

I1: BEQZ R1 #loop
Yes, the branch instruction is supported by the given datapath. See diagram below.

I2: ADD 4(R1) R2 R3
No, this instruction is not supported by the given datapath. It requires two arithmetic operations: one to compute the result (R2+R3), and one to compute the address at which the result must be stored (i.e., value(R1)+4). In addition, the instruction reads three registers: R1, R2 and R3. However, the register file has only 2 output ports, so only two registers can be read in each clock cycle.

To support the instruction, we need:
- An extra adder in the EXE stage. The adder will compute value(R1)+4. There would also be an extra register (AdderOutput) storing the computed memory address.
- An extra special purpose register C, where the value of register R1 will be stored.
- A register file with 3 output ports feeding registers A, B and C (as well as 3 index lines).

The special purpose registers will contain the following information:
IR – instruction
A – value of R2
B – value of R3
C – value of R1
Immediate = 4
LMD would not be utilized
NPC would contain the address of the next instruction in program order
Cond would not be utilized

A and B would be fed to the ALU. C (holding the value of R1) and Immediate would be fed to the extra adder.
ALUoutput = value(R2)+value(R3). ALUoutput would be connected to the input data line of memory.
AdderOutput = value(R1)+4.
AdderOutput would be connected to the address line of memory.

Assumptions for remaining problems

In all the problems/questions below, assume that:
• You are operating on a 32-bit RISC processor. In particular, your architecture of reference is the MIPS processor (datapath shown in Question 2). You can use only the instructions and the addressing modes supported by the MIPS datapath and by the 3 instruction formats seen in class.
• You are dealing only with 32-bit integer values.
• You have available 32 integer registers (R0 to R31). You can assume that register R0 stores value 0.
• You can use only the following instructions:
  o Memory/data transfers: LW (load), SW (store).
  o Arithmetic/logical: any add, subtract, multiply, divide, and, or, xor instruction (i.e., ADD, SUB, MULT, DIV, AND, OR, XOR).
  o Control: BEQZ (=0), BNEZ (≠0), BGTZ (>0), BGEZ (≥0), BLTZ (<0), BLEZ (≤0), JUMP.
Note that the MOV instruction is not included in the list above.

Problem 3 [26 points]

Indicate the MIPS assembly corresponding to the following C blocks of code. You can assume that the values of variables x and y are stored in registers R1 and R2, respectively.

(a) [6 points]
if (x==1) then x+=5; else x+=20;

(b)
for (int x=0; x<20; x++) y=y-2;

For block of code (b), show two solutions: one using only conditional branches, and one also using an unconditional branch (or jump). [7 for solution #1 + 7 for solution #2]

Which assembly code is "better"? Why? [3 points]
Code (1) is better because it is more compact (it includes fewer instructions). In addition, in a pipelined processor, branches (conditional and unconditional) incur the overhead of control-hazard handling. Code (1) has only one branch, while code (2) has two of them.

How can you easily distinguish a loop from an if-then-else block in an assembly program? [3 points]
Loops require backward branches. If-then-else blocks require forward branches.
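The assembly for the two loop variants of Problem 3(b) does not appear in this text. As a rough sketch only, and not necessarily the manual's own solution, the Python below encodes one plausible version of each variant and runs them on a tiny interpreter to compare instruction and branch counts. Everything here is an assumption made for illustration: the `run` interpreter, the tuple encoding, the use of R3 as a scratch register, and the immediate forms `ADDI`/`SUBI` (assumed to count as "any add, subtract" I-type instructions).

```python
# Hypothetical mini-interpreter for the restricted instruction subset.
# ADDI/SUBI, the tuple encoding, and scratch register R3 are assumptions.

def run(program, init):
    """Run a list of (op, *operands) tuples.

    Returns (registers, instructions executed, branches executed).
    """
    labels = {ins[1]: i for i, ins in enumerate(program) if ins[0] == "LABEL"}
    regs = {f"R{i}": 0 for i in range(32)}  # R0 is never written, so it stays 0
    regs.update(init)
    pc = executed = branches = 0
    while pc < len(program):
        op, *args = program[pc]
        pc += 1
        if op == "LABEL":
            continue  # labels are markers, not instructions
        executed += 1
        if op == "ADDI":
            rd, rs, imm = args
            regs[rd] = regs[rs] + imm
        elif op == "SUBI":
            rd, rs, imm = args
            regs[rd] = regs[rs] - imm
        elif op in ("BLTZ", "BGEZ"):
            rs, target = args
            branches += 1
            taken = regs[rs] < 0 if op == "BLTZ" else regs[rs] >= 0
            if taken:
                pc = labels[target]
        elif op == "JUMP":
            branches += 1
            pc = labels[args[0]]
        else:
            raise ValueError(f"unsupported instruction: {op}")
    return regs, executed, branches

# Variant 1: conditional branch only; the test sits at the bottom of the loop.
CODE_1 = [
    ("ADDI", "R1", "R0", 0),    # x = 0
    ("LABEL", "loop"),
    ("SUBI", "R2", "R2", 2),    # y = y - 2
    ("ADDI", "R1", "R1", 1),    # x++
    ("SUBI", "R3", "R1", 20),   # R3 = x - 20
    ("BLTZ", "R3", "loop"),     # backward branch while x < 20
]

# Variant 2: test at the top, plus an unconditional jump back.
CODE_2 = [
    ("ADDI", "R1", "R0", 0),    # x = 0
    ("LABEL", "loop"),
    ("SUBI", "R3", "R1", 20),   # R3 = x - 20
    ("BGEZ", "R3", "done"),     # forward branch: exit when x >= 20
    ("SUBI", "R2", "R2", 2),    # y = y - 2
    ("ADDI", "R1", "R1", 1),    # x++
    ("JUMP", "loop"),           # backward jump
    ("LABEL", "done"),
]

def static_size(program):
    """Number of real instructions (labels excluded)."""
    return sum(1 for ins in program if ins[0] != "LABEL")

if __name__ == "__main__":
    for name, code in (("code (1)", CODE_1), ("code (2)", CODE_2)):
        regs, executed, branches = run(code, {"R2": 100})
        print(name, static_size(code), regs["R2"], executed, branches)
    # code (1): 5 static instructions, y = 60, 81 executed, 20 branches
    # code (2): 6 static instructions, y = 60, 103 executed, 41 branches
```

Under these assumptions, the bottom-tested variant is shorter both statically and dynamically, and it executes roughly half as many branches, which is consistent with the answer above. The bottom-tested form is only equivalent to the C loop because the trip count (20) is known to be at least one.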


Last updated: 2 years ago
