Read Chapter 4, section 4.4.4 through the end of the chapter. You need not understand everything about the IJVM (Mic-1, Mic-2, Mic-3) in order to gain useful information.

IFU stands for Instruction Fetch Unit, which prefetches IJVM (Integer Java Virtual Machine) instructions. The IJVM operates almost exclusively on a stack, so operands are on the stack. The TOS register contains a duplicate of the top-of-stack entry pointed to by the SP register. The swap instruction mentioned in the pipeline example swaps the two top words on the stack. The IFLT instruction mentioned stands for “pop word from stack and branch (jump) if less than zero.” (See Fig. 4-11, p. 222, for a brief description of the IJVM instruction set.) Note that there is no instruction for storing a value in the area of memory called the “constant pool,” whose base address is in the CPP (constant pool pointer) register.

The H register is a CPU scratch register. Operands are read from or written to memory using the MDR (Memory Data Register), with addresses in the MAR (Memory Address Register). Instruction bytes are read from memory using the MBR (memory buffer register), and the PC (program counter register) is used instead of the MAR for addressing instruction bytes. The LV register is a stack frame base register similar to Intel’s BP (ebp) base pointer register.

The ALU has six control inputs rather than simply F1 and F0, and can perform more operations than the Mic-1 ALU with which you are already familiar. The additional control signals (three of which are shown in Fig. 3-19) permit the ALU to generate the constants 0, +1, and -1, and also to perform subtraction, incrementation, and decrementation directly (see the table in Fig. 4-2, which you need not memorize).

1. Consider a machine on which 20% of the instructions are conditional jumps and another 10% are loop jumps. The conditional jumps can be predicted with 60% accuracy, and the loop jumps can be predicted with 90% accuracy. The penalty for guessing wrong is four cycles. There is no penalty for unconditional jumps or correct guesses. What is the efficiency of the pipeline on this machine?
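One common way to set up this calculation is sketched below. It assumes an ideal pipeline that completes one instruction per cycle when nothing is mispredicted, and it takes efficiency to mean the ideal cycle count divided by the actual average cycle count. Neither assumption is stated in the problem, and the function name `pipeline_efficiency` is purely illustrative.

```python
def pipeline_efficiency(branch_classes, penalty_cycles, base_cpi=1.0):
    """Estimate pipeline efficiency under branch mispredictions.

    branch_classes: iterable of (fraction_of_instructions, prediction_accuracy).
    base_cpi: cycles per instruction with no mispredictions (assumed 1.0 here).
    """
    # Fraction of all instructions that are mispredicted jumps.
    mispredict_rate = sum(frac * (1.0 - acc) for frac, acc in branch_classes)
    # Each misprediction adds penalty_cycles to the average instruction time.
    avg_cpi = base_cpi + mispredict_rate * penalty_cycles
    return base_cpi / avg_cpi


# Figures from the problem: 20% conditional jumps at 60% accuracy,
# 10% loop jumps at 90% accuracy, four-cycle penalty.
eff = pipeline_efficiency([(0.20, 0.60), (0.10, 0.90)], penalty_cycles=4)
print(f"efficiency = {eff:.3f}")
```

If your course defines efficiency differently (for example, relative to a non-unit base CPI), adjust `base_cpi` accordingly.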

2. A cache system has a 95% hit ratio, an access time of 100 nsec on a cache hit and an access time of 800 nsec on a cache miss. What is the effective access time?
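The usual weighted-average formula can be sketched as follows. This assumes that the 800 nsec figure is the total access time on a miss, not an extra penalty added to the hit time. The symbols $h$ (hit ratio), $t_{hit}$, $t_{miss}$, and $t_{eff}$ are introduced here for illustration and do not appear in the text.

$$t_{eff} = h\,t_{hit} + (1-h)\,t_{miss} = 0.95 \times 100 + 0.05 \times 800 = 135 \text{ nsec}$$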

3. A cache is being designed for a computer with $2^{32}$ bytes of memory. The cache will have 2K slots (lines) and use a 16-byte block. Compute for both an associative cache and a direct-mapped cache how many bytes the cache will occupy.
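A rough sketch of the bookkeeping for this kind of question follows. It assumes each cache line stores one valid bit, a tag, and the data block, with no dirty or replacement-policy bits; the problem does not say which status bits to count, so treat that as an assumption. The helper name `cache_storage_bits` is hypothetical.

```python
def cache_storage_bits(addr_bits, num_lines, block_bytes, ways, valid_bits=1):
    """Return (tag_bits, total_bits) for a cache with the given geometry.

    ways == 1 models a direct-mapped cache; ways == num_lines models a
    fully associative cache. Sizes are assumed to be powers of two.
    """
    offset_bits = (block_bytes - 1).bit_length()   # byte-within-block field
    num_sets = num_lines // ways
    index_bits = (num_sets - 1).bit_length()       # 0 when there is one set
    tag_bits = addr_bits - index_bits - offset_bits
    bits_per_line = valid_bits + tag_bits + 8 * block_bytes
    return tag_bits, bits_per_line * num_lines


# Geometry from the problem: 32-bit addresses, 2K lines, 16-byte blocks.
for name, ways in (("associative", 2048), ("direct-mapped", 1)):
    tag_bits, total_bits = cache_storage_bits(32, 2048, 16, ways)
    print(f"{name}: tag = {tag_bits} bits, total = {total_bits // 8} bytes")
```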

4. For a computer with $2^{24}$ bytes of main memory and a $2^{12}$ byte cache with $2^8$ bytes per slot (cache line, or block), compare the following three cache designs. Cache A is fully associative, cache B is 4-way set associative, and cache C is direct mapped.

   a. For each design give the number of bits needed in each tag register, and the number of hardware comparators needed to check an address for a hit.

   
<table>
<thead>
<tr>
<th></th>
<th>Cache A</th>
<th>Cache B</th>
<th>Cache C</th>
</tr>
</thead>
<tbody>
<tr>
<td># of tag bits, k</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td># of k-bit comparators</td>
<td></td>
<td></td>
<td></td>
</tr>
</tbody>
</table>

   b. For each design assume that the cache is empty and that a read is requested from address $2A4712_{16}$. Into which cache slot (line) will the block containing this address be loaded?

<table>
<thead>
<tr>
<th></th>
<th>Cache A</th>
<th>Cache B</th>
<th>Cache C</th>
</tr>
</thead>
<tbody>
<tr>
<td>slot #</td>
<td></td>
<td></td>
<td></td>
</tr>
</tbody>
</table>
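Questions like those in problem 4 come down to splitting an address into tag, set (or line) index, and byte-offset fields. The sketch below shows that split for the address in part b, assuming power-of-two sizes and the conventional field order tag | index | offset. The helper name `split_address` is hypothetical, and how a set index maps to a physical slot number depends on how the slots are numbered, which the problem leaves open.

```python
def split_address(addr, offset_bits, index_bits):
    """Split addr into (tag, index, offset) fields.

    index_bits == 0 models a fully associative cache (no index field).
    """
    offset = addr & ((1 << offset_bits) - 1)
    index = (addr >> offset_bits) & ((1 << index_bits) - 1)
    tag = addr >> (offset_bits + index_bits)
    return tag, index, offset


ADDR = 0x2A4712
OFFSET_BITS = 8  # 2^8 bytes per slot

# 16 slots total: 1 set (fully associative), 4 sets (4-way), 16 sets (direct).
for name, index_bits in (("A", 0), ("B", 2), ("C", 4)):
    tag, index, offset = split_address(ADDR, OFFSET_BITS, index_bits)
    print(f"Cache {name}: tag={tag:#x} index={index} offset={offset:#x}")
```

The tag width in each design is the 24 address bits minus the offset and index bits, which is what the comparators in part a must match.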

5. The design of a new computer calls for a main memory of $2^{16}$ words. The design team is in the process of deciding on what sort of cache memory this new machine should have and wishes to compare two different cache designs. Both proposed designs use a block size of 16 words per block and provide a cache that will hold 16 blocks (256 words). Design A is direct-mapped and Design B is block-set-associative, with two blocks per set. The cache is initially empty, the cache replacement policy is LRU (least recently used), and the sequence of memory locations referenced by the program is (all addresses here are in hexadecimal):

- 2731
- 4A17
- 203C
- 4A23
- 4A2A
- 3179
- 2041
- 2731

a. For the direct-mapped design, which of the memory accesses will result in a cache hit? Specify (in either hex or binary) the final non-blank contents (tags are sufficient) of the cache.

b. Repeat a. for the block-set-associative design.
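To check hand traces for problem 5, a small simulator along the following lines may help. It is a minimal sketch, not a prescribed method: it assumes word addressing, power-of-two sizes, and that the set index is taken from the low-order bits of the block number. The name `simulate_lru` is hypothetical. A direct-mapped cache is modeled as one way per set, so LRU never has a choice to make there.

```python
from collections import OrderedDict


def simulate_lru(addresses, offset_bits, num_sets, ways):
    """Return (hit_flags, set_contents) for a set-associative LRU cache.

    Each set is an OrderedDict of tags kept in least-recently-used-first order.
    """
    sets = [OrderedDict() for _ in range(num_sets)]
    hits = []
    for addr in addresses:
        block = addr >> offset_bits          # drop the word-within-block bits
        index, tag = block % num_sets, block // num_sets
        lines = sets[index]
        if tag in lines:
            lines.move_to_end(tag)           # mark as most recently used
            hits.append(True)
        else:
            if len(lines) == ways:
                lines.popitem(last=False)    # evict the least recently used tag
            lines[tag] = None
            hits.append(False)
    return hits, [list(s) for s in sets]


# Reference string from the problem (hex); 16 words per block -> 4 offset bits.
refs = [0x2731, 0x4A17, 0x203C, 0x4A23, 0x4A2A, 0x3179, 0x2041, 0x2731]

for name, num_sets, ways in (("Design A", 16, 1), ("Design B", 8, 2)):
    hits, contents = simulate_lru(refs, offset_bits=4, num_sets=num_sets, ways=ways)
    hit_addrs = [f"{a:04X}" for a, h in zip(refs, hits) if h]
    tags = {i: [f"{t:b}" for t in s] for i, s in enumerate(contents) if s}
    print(name, "hits:", hit_addrs, "tags by set:", tags)
```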

6. A computer has 4096 words of main memory in 4-word blocks and a 4-way set-associative cache with sixteen slots (lines), each having four block entries. The cache uses an LRU replacement algorithm and is initially empty. Which of the following memory accesses (addresses in hex) will result in cache hits? Specify (in binary) the final non-blank contents (tags are sufficient) of the cache.

- 3A5 730 350 34F 430 531 E33 E30 070 432 5A1 630