## EZ-17 Corrected Copy ## UNIVERSITY OF LONDON IMPERIAL COLLEGE OF SCIENCE, TECHNOLOGY AND MEDICINE ## **EXAMINATIONS 2005** BEng Honours Degree in Computing Part II MEng Honours Degrees in Computing Part II BEng Honours Degree in Information Systems Engineering Part II MEng Honours Degree in Information Systems Engineering Part II for Internal Students of the Imperial College of Science, Technology and Medicine > This paper is also taken for the relevant examinations for the Associateship of the City and Guilds of London Institute > > PAPER C210=E2.13 ARCHITECTURE II Thursday 5 May 2005, 14:30 Duration: 120 minutes Answer THREE questions Paper contains 4 questions Calculators required - 1a i) Provide a formula showing how the execution time of a program can be computed based on information such as the number of instructions for that program. - ii) Suggest three ways of improving the execution time for a given program. - Two machines, M and N, implement the same instruction set with two classes of instructions, X and Y. M takes x cycles to run each Class X instruction and y cycles to run each Class Y instruction, while N takes $\alpha x$ cycles to run each Class X instruction and $\beta y$ cycles to run each Class Y instruction. M runs at z Hz, while N runs at z Hz. $\alpha$ , $\beta$ and $\gamma$ are constants. - i) State the peak performance, in number of instructions per second, of M and N. - ii) State the execution time of M and M for programs with m Class X instructions and n Class Y instructions. - iii) For M, a program P1 has u1 Class X instructions and v1 Class Y instructions, while another program P2 has u2 Class X instructions and v2 Class Y instructions. The corresponding parameters for N are respectively v1, v2 and v2. - If P1 is run k times more frequently than P2, derive a value for k such that the total execution time for M will be less than that for N. - iv) For the two programs in part iii, what is the ratio of the clock speed for the two machines if P1 is run as frequently as P2, and the total execution time is the same for both machines? The two parts carry, respectively, 30%, and 70% of the marks. - 2a Consider a 4-bit Arithmetic Logic Unit (ALU) which can perform addition and bit-wise logical OR for numbers in two's complement representation. - i) Provide a circuit diagram showing the internal structure of a component A which can be replicated 4 times to form the ALU. Do not show the internal structure of multiplexors and fulladders. - Provide a circuit diagram showing how the ALU can be built from 4 copies of A. Label the signal values on all the wires in the diagram when the ALU adds the two numbers 0011 (3 in base ten) and 1111 (-1 in base ten). - b i) Show how A (in part a) can be modified to become B by adding a single 2-input logic gate, so that the appropriate connection of multiple copies of B can compute either bit-wise OR, addition or subtraction depending on a control signal. - ii) Provide a circuit diagram of this modified ALU which contains 4 copies of B - i) Show how B (in part b) can be modified to become C by adding a single 2-input logic gate and an inverter, so that the appropriate connection of multiple copies of C, together with additional components, can compute either bit-wise OR, addition or the absolute value of one of its inputs. - Provide a circuit diagram of this modified ALU which contains 4 copies of The three parts carry, respectively, 25%, 35%, and 40% of the marks. - A datapath containing m control signals has been developed for a set of instructions with n-bit opcode. A microsequencer, which supports k-bit horizontal microinstructions, has been developed as the control unit for this datapath. The microsequencer contains an incrementer, a ROM (Read-Only Memory), a register, and address select logic. Provide a diagram for this microsequencer, and label the size of components and wires in the diagram where possible. - b Describe how the address select logic in part a can be implemented using ROMs and multiplexors. Explain how the number of ROMs in the address select logic relates to the state diagram for the control unit. - Given that there are $\alpha$ identical ROMs in the address select logic, what is the total ROM size (in bits) that this microsequencer contains? - d Calculate the total ROM size (in bits) if the control unit is implemented as a finite state machine containing a single ROM for both the output logic and the next state logic. The four parts carry, respectively, 30%, 30%, 20%, and 20% of the marks. - What is the total number of bits of storage required for a direct-mapped cache with $2^m$ bytes of data and one-word blocks, given that the address size is n bits and each word contains $2^k$ bytes? - b Give one advantage and one disadvantage of a direct-mapped cache with multi-word blocks. - What is the total number of bits of storage required for a direct-mapped cache with $2^m$ bytes of data and multi-word blocks, given that the address size is n bits, each block contains $2^{\alpha}$ words and each word contains $2^k$ bytes? - d Give one advantage and one disadvantage of a fully-associative cache. - What is the total number of bits of storage required for a fully-associative cache with $2^m$ bytes of data, given that the address size is n bits and each word contains $2^k$ bytes? - Compare your answer to part a and part e, and explain whether there are additional factors that can affect the size of the two caches. The six parts carry, respectively, 20%, 10%, 25%, 10%, 20%, and 15% of the marks. ## Ise 2 - Computer Architecture socutions 2005 | Department of Computing Examinations — 2004–2005 Session Confidential | | | | | |---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--|--|--| | MODEL ANSW | VER and MARKING SCHEME | | | | | First Examiner wl | Paper Code $C210 = E2.13$ | | | | | Second Examiner om | Question Page out of 4 | | | | | Question labels in left margin | Mark allocations in right margin | | | | | la i) T= nct, T: exectime<br>C: cycles por | of program. n: the instruction count, or instruction, t: cycle time of machine 3 running the program | | | | | - use machine with t | to reduce no<br>lower C<br>reduced t | | | | | peak performance of M = (1) exectime - exectime | Cycles per instr $\frac{Z}{\text{Cycles per instr}}$ Penk reformane of $N = \frac{\gamma Z}{\min(\chi, \gamma)}$ Note that $\frac{Z}{\min(\chi, \gamma)} = \frac{m\chi}{Z} + \frac{n\gamma}{Z} = \frac{m\chi + n\gamma}{Z}$ The virial of $\chi$ instricts $\frac{m\chi}{Z} + \frac{n\gamma}{Z} = \frac{m\chi + n\gamma}{Z}$ | | | | | exectine = $\frac{mdx}{x^2} + \frac{1}{x^2}$ | $\frac{nBY}{YZ} = \frac{\alpha m x + y \delta}{YZ}$ | | | | | for M K. Exec time + exe for Pl on M + for $L(u_1x+v_1y_1) + (u_2x+v_1y_1)$ | We chine $\langle k \rangle$ exec time $\langle$ | | | | | $k\left(\frac{U_1X + V_1Y}{Z} - \frac{V_1 dX + V_2Y}{X}\right)$ $= \frac{V_1 dX + V_2Y}{X}$ $= \frac{V_1 dX + V_2Y}{X}$ $= \frac{V_1 dX + V_2Y}{X}$ $= \frac{V_1 dX + V_2Y}{X}$ | $\frac{s_1p_1}{z} > \frac{s_2p_1}{y_2} = \frac{z}{z}$ $k < \frac{r_2 \alpha_x + s_2p_1 - u_2 \gamma_x - v_2 \gamma_y}{u_1 \gamma_x + v_1 \gamma_y - r_1 \alpha_x - s_1 \beta_y}$ and $k=1$ , | | | | | $1 \times 2 \times 4 \times 84 - 80$ | $-S_{1} \beta y = r_{2} \alpha \chi + S_{2} \beta y - u_{2} \gamma \chi - V_{2} \gamma y$ $-V_{2} \gamma y) = r_{2} \alpha \chi + S_{2} \beta y + v_{1} \alpha \chi + S_{1} \beta y$ $\gamma = \frac{(r_{1} + r_{2}) \alpha \chi + (S_{1} + S_{2}) \beta y}{(u_{1} + u_{2}) \chi + (v_{1} + v_{2}) \gamma}$ | | | | | Department of Computing Examinations — 2004–2005 Session Confidential | | | | | |-----------------------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------|----------|--| | MODEL ANSWER and MARKING SCHEME | | | | | | First Examiner wl Paper Code $C_{210} = E_{2.13}$ | | | | | | Second Examin | er om | Question 4 Page 4 | out of 4 | | | Question labels in left margin Mark allocations in right margin | | | | | | tags total b adv disad | number of words = $(2^m)/(2^k)$ fize = $n - (m-k) - k = n-m$ $size = (num of words) \times (ad)$ $= 2^{m-k} (2^{k+3} + (n-m))$ $= 2^{m+3} + 2^{m-k} (n-m)$ | dr size + tog size + volid 1) +1) 10ck: explicit spatial look: explicit spatial look for instance complex control for to avoid false sharing | 4 (H) | | | C tota tag tota | I rumber of blocks = 2 <sup>m</sup> . size = n-m size = 2 <sup>m-k-d</sup> (2 <sup>d+k</sup> size = 2 <sup>m+3</sup> + 2 dicient use of cache, sin v: need storage for tags. | (n-m+1) (n - m + 1) (ce no miss unless cache | 5 | | | e san<br>I die | re as 4a et mapped cache has I comp | | 3 | | | | | | | | L