5.4 Recall that we have two write policies and write allocation policies, and their combinations can be implemented either in L1 or L2 cache. Assume the following choices for L1 and L2 caches:
https://drive.google.com/file/d/1PbmOvJtnFdIkraqltHpnepAodFdIun0c/view?usp=sharing
5.4.1 [5] Buffers are employed between different levels of the memory hierarchy to reduce access latency. For the given configuration, list the possible buffers needed between the L1 and L2 caches, as well as between the L2 cache and memory.
5.4.2 [20] Describe the procedure for handling an L1 write-miss, considering the components involved and the possibility of replacing a dirty block.
5.4.3 [20] For a multilevel exclusive cache configuration (a block can reside in only one of the L1 and L2 caches), describe the procedure for handling an L1 write-miss, considering the components involved and the possibility of replacing a dirty block.
Consider the following program and cache behaviors.
https://drive.google.com/file/d/15ostiXsIERCfNi6pDWFX35NoLg6ZAueq/view?usp=sharing
5.4.4 [5] For a write-through, write-allocate cache, what are the minimum read and write bandwidths (measured in bytes per cycle) needed to achieve a CPI of 2?
5.4.5 [5] For a write-back, write-allocate cache, assuming 30% of replaced data cache blocks are dirty, what are the minimal read and write bandwidths needed for a CPI of 2?
5.4.6 [5] What are the minimal bandwidths needed to achieve the performance of CPI = 1.5? - ANSWER https://www.chegg.com/homework-help/questions-and-answers/6-recall-two-write-policies-two-write-allocation-policies-combinations-implemented-either--q19793126
https://www.coursehero.com/file/22662573/Assignment-5/#/doc/qa
Cache Memory (Computer Architecture "A") - ANSWER https://www.youtube.com/watch?v=uK5ZpWRCT2s
Computer Structure - 4.2 Cache Memory - José Luis Abellán Miguel - ANSWER https://www.youtube.com/watch?v=AeJSk6Q9Tvg
4)
Which block in the cache is replaced by memory block 29?
Cache configuration: 4-way set-associative cache with 8 one-word blocks
Replacement scheme: LRU
Sequence of previously accessed block addresses: 5, 13, 21, 13, 5
(Note: All memory block addresses map to cache set 1) - ANSWER None. An element in set 1 is unused, so Mem[29] is placed in the fourth element of set 1.
The cache has two sets (0 and 1) and 4 blocks per set. The fourth block of set 1 is unoccupied, thus Mem[29] is placed in the fourth block of set 1. The replacement scheme has not yet been used.
Least recently used (LRU): - ANSWER A replacement scheme in which the block replaced is the one that has been unused for the longest time.
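The block-29 card above can be checked with a short LRU simulation. This is a minimal sketch rather than part of the original card: the function name `access_set` and the trace format are illustrative assumptions, and only a single set is modelled because every address in the card maps to set 1.

```python
from collections import OrderedDict


def access_set(blocks, ways):
    """Simulate one cache set with LRU replacement.

    Returns a list of (block, hit, evicted) tuples, where `evicted` is the
    block that was replaced, or None if a free way was available.
    """
    lru = OrderedDict()  # keys ordered from least to most recently used
    trace = []
    for block in blocks:
        evicted = None
        if block in lru:
            lru.move_to_end(block)  # mark as most recently used
            hit = True
        else:
            hit = False
            if len(lru) == ways:  # set is full, so evict the LRU block
                evicted, _ = lru.popitem(last=False)
            lru[block] = True
        trace.append((block, hit, evicted))
    return trace


# Previously accessed blocks 5, 13, 21, 13, 5, followed by block 29.
trace = access_set([5, 13, 21, 13, 5, 29], ways=4)
for block, hit, evicted in trace:
    print(block, "hit" if hit else "miss", "evicted:", evicted)
```

Only three distinct blocks (5, 13, 21) occupy the 4-way set before block 29 arrives, so the last access is a miss with nothing evicted, which agrees with the answer "None". Calling the same function with `ways=3` would instead force a replacement.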
5.7 This exercise examines the impact of different cache designs, specifically comparing associative caches to the direct-mapped caches from COD Section 5.4 (Measuring and improving cache performance). For these exercises, refer to the address stream shown in Exercise 5.2.
5.7.1 [10] Using the sequence of references from Exercise 5.2, show the final cache contents for a three-way set associative cache with two-word blocks and a total size of 24 words. Use LRU replacement. For each reference identify the index bits, the tag bits, the block offset bits, and if it is a hit or a miss.
5.7.2 [10] Using the references from Exercise 5.2, show the final cache contents for a fully associative cache with one-word blocks and a total size of 8 words. Use LRU replacement. For each reference identify the index bits, the tag bits, and if it is a hit or a miss.
5.7.3 [15] Using the references from Exercise 5.2, what is the miss rate for a fully associative cache with two-word blocks and a total size of 8 words, using LRU replacement? What is the miss rate using MRU (most recently used) replacement? Finally what is the best possible miss rate for this cache, given any replacement policy?
Multilevel caching is an important technique to overcome the limited amount of space that a first level cache can provide while still maintaining its speed. Consider a processor with the following parameters:
5.7.4 [10] Calculate the CPI for the processor in the table using: 1) only a first level cache, 2) a second level direct-mapped cache, and 3) a second level eight-way set associative cache. How do these numbers change if main memory access time is doubled? If it is cut in half?
5.7.5 [10] It is possible to have an even greater cache hierarchy than two levels. Given the processor above with a second level, direct-mapped cache, a designer wants to add a third level cache that takes 50 cycles to access and will reduce the global miss rate to 1.3%. Would this provide better performance? In general, what are the advantages and disadvantages of adding a third level cache?
5.7.6 [20] In older processors such as the Intel Pentium or Alpha 21264, the second level of cache was external (located on a different chip) from the main processor and the first level cache. While this allowed for large second level caches, the latency to access the cache was much higher, and the bandwidth was typically lower because the second level cache ran at a lower frequency. Assume a 512 KiB off-chip second level cache has a global miss rate of 4%. If each additional 512 KiB of cache lowered global miss rates by 0.7%, and the cache had a total access time of 50 cycles, how big would the cache have to be to match the performance of the second level direct-mapped cache listed above? Of the eight-way set associative cache? - ANSWER Note
https://www.chegg.com/homework-help/questions-and-answers/given-following-word-addresses-3-180-43-2-191-88-190-14-181-44-186-253-show-final-cache-co-q27391721
https://www.chegg.com/homework-help/questions-and-answers/exercise-examines-impact-different-cache-designs-specifically-comparing-associative-caches-q29312163
https://drive.google.com/file/d/1tB-2AJcFQpuWZacDc_0GL2-SbuckxxLY/view?usp=sharing
https://www.chegg.com/homework-help/questions-and-answers/572-1101-54-using-references-exercise-52-show-final-cache-contents-fully-associative-cache-q6231045
https://www.chegg.com/homework-help/Computer-Organization-and-Design-5th-edition-chapter-5.7-problem-1E-solution-9780124078864
solutions https://www.chegg.com/homework-help/Computer-Organization-and-Design-5th-edition-chapter-5.2-problem-1E-solution-9780124078864 - ANSWER https://www.chegg.com/homework-help/Computer-Organization-and-Design-5th-edition-chapter-5.2-problem-1E-solution-9780124078864
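For 5.7.1 through 5.7.3, the tag/index/offset breakdown and the hit/miss pattern can be explored with a small simulator. This is a sketch under stated assumptions, not the textbook solution: the word-address stream is taken from the title of the first Chegg link above and is assumed to be the Exercise 5.2 stream, and the names `ADDRESSES` and `simulate` are illustrative.

```python
from collections import OrderedDict

# Word addresses as they appear in the link title above (assumed to be the
# Exercise 5.2 reference stream).
ADDRESSES = [3, 180, 43, 2, 191, 88, 190, 14, 181, 44, 186, 253]


def simulate(addresses, total_words, words_per_block, ways):
    """Simulate a set-associative cache with LRU replacement.

    Returns (results, sets): `results` holds one
    (address, tag, index, offset, hit) tuple per reference, and `sets`
    holds the final tags of each set in LRU-to-MRU order.
    """
    num_blocks = total_words // words_per_block
    num_sets = num_blocks // ways  # 1 for a fully associative cache
    sets = [OrderedDict() for _ in range(num_sets)]
    results = []
    for addr in addresses:
        offset = addr % words_per_block   # word within the block
        block = addr // words_per_block   # block address
        index = block % num_sets          # set index
        tag = block // num_sets           # remaining high-order bits
        cache_set = sets[index]
        hit = tag in cache_set
        if hit:
            cache_set.move_to_end(tag)    # refresh LRU position
        else:
            if len(cache_set) == ways:    # evict least recently used tag
                cache_set.popitem(last=False)
            cache_set[tag] = True
        results.append((addr, tag, index, offset, hit))
    return results, sets


# 5.7.1: three-way set associative, two-word blocks, 24 words (4 sets).
results, final_sets = simulate(ADDRESSES, total_words=24,
                               words_per_block=2, ways=3)
for addr, tag, index, offset, hit in results:
    print(addr, "tag", tag, "index", index, "offset", offset,
          "hit" if hit else "miss")
```

Under these assumptions the 5.7.1 configuration hits on addresses 2, 190, and 181. For 5.7.2, call `simulate(ADDRESSES, 8, 1, 8)`; for the LRU case of 5.7.3, call `simulate(ADDRESSES, 8, 2, 4)`. MRU and optimal replacement are not modelled here.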
6.1 First, write down a list of your daily activities that you typically do on a weekday. For instance, you might get out of bed, take a shower, get dressed, eat breakfast, dry your hair, brush your teeth. Make sure to break down your list so you have a minimum of 10 activities.
6.1.1 [5] Now consider which of these activities is already exploiting some form of parallelism (e.g., brushing multiple teeth at the same time versus one at a time, or carrying one book at a time to school versus loading them all into your backpack and then carrying them "in parallel"). For each of your activities, discuss whether it is already working in parallel and, if not, why not.
6.1.2 [5] Next, consider which of the activities could be carried out concurrently (e.g., eating breakfast and listening to the news). For each of your activities, describe which other activity could be paired with this activity.
6.1.3 [5] For 6.1.2, what could we change about current systems (e.g., showers, clothes, TVs, cars) so that we could perform more tasks in parallel?
6.1.4 [5] Estimate how much less time it would take to carry out these activities if you tried to carry out as many tasks in parallel as possible. - ANSWER https://drive.google.com/file/d/168ssUJGOqOckuoDdGR7KnlMwYesDJybH/view?usp=sharing
https://www.chegg.com/homework-help/questions-and-answers/5-first-write-list-daily-activities-typically-weekday-instance-get-bed-take-shower-get-dre-q28549457
6.2 You are trying to bake 3 blueberry pound cakes. Cake ingredients are as follows:
1 cup butter, softened
1 cup sugar
4 large eggs
1 teaspoon vanilla extract
1/2 teaspoon salt
1/4 teaspoon nutmeg
1 1/2 cups flour
1 cup blueberries
The recipe for a single cake is as follows:
Step 1: Preheat oven to 325°F (160°C). Grease and flour your cake pan.
Step 2: In large bowl, beat together with a mixer butter and sugar at medium speed until light and fluffy. Add eggs, vanilla, salt and nutmeg. Beat until thoroughly blended. Reduce mixer speed to low and add flour, 1/2 cup at a time, beating just until blended.
Step 3: Gently fold in blueberries. Spread evenly in prepared baking pan. Bake for 60 minutes.
6.2.1 [5] Your job is to cook 3 cakes as efficiently as possible. Assuming that you only have one oven large enough to hold one cake, one large bowl, one cake pan, and one mixer, come up with a schedule to make three cakes as quickly as possible. Identify the bottlenecks in completing this task.
6.2.2 [5] Assume now that you have three bowls, 3 cake pans and 3 mixers. How much faster is the process now that you have additional resources?
6.2.3 [5] Assume now that you have two friends that will help you cook, and that you have a large oven that can accommodate all three cakes. How will this change the schedule you arrived at in Exercise 6.2.1 above?
6.2.4 [5] Compare the cake-making task to computing 3 iterations of a loop on a parallel computer. Identify data-level parallelism and task-level parallelism in the cake-making loop. - ANSWER 6.2.1 For this set of resources, we can pipeline the preparation. We assume that we do not have to reheat the oven for each cake.
Preheat oven.
Mix ingredients in bowl for Cake 1.
Fill cake pan with contents of bowl and bake Cake 1. Mix ingredients for Cake 2 in bowl.
Finish baking Cake 1. Empty cake pan. Fill cake pan with bowl contents for Cake 2 and bake Cake 2. Mix ingredients in bowl for Cake 3.
Finish baking Cake 2. Empty cake pan. Fill cake pan with bowl contents for Cake 3 and bake Cake 3.
Finish baking Cake 3. Empty cake pan.
6.2.2 Now we have 3 bowls, 3 cake pans and 3 mixers. We will name them A, B, and C. Because the single oven still holds only one cake at a time, baking remains the bottleneck and the schedule is essentially the same as in 6.2.1.
Preheat oven.
Mix ingredients in bowl A for Cake 1.
Fill cake pan A with contents of bowl A and bake Cake 1. Mix ingredients for Cake 2 in bowl A.
Finish baking Cake 1. Empty cake pan A. Fill cake pan A with contents of bowl A for Cake 2 and bake Cake 2. Mix ingredients in bowl A for Cake 3.
Finish baking Cake 2. Empty cake pan A. Fill cake pan A with contents of bowl A for Cake 3 and bake Cake 3.
6.2.3 Each step can be done in parallel for each cake. The time to bake 1 cake, 2 cakes or 3 cakes is exactly the same.
6.2.4 The loop computation is equivalent to the steps involved in making one cake. Given that we have multiple processors (or ovens and cooks), we can execute instructions (or cook multiple cakes) in parallel. The instructions in the loop (or cooking steps) may have some dependencies on prior instructions (or cooking steps) in the loop body (cooking a single cake).
Data-level parallelism occurs when loop iterations are independent (i.e., there are no loop-carried dependencies).
Task-level parallelism includes any instructions that can be computed on parallel execution units; these are similar to the independent operations involved in making multiple cakes.
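The three schedules in 6.2.1 through 6.2.3 can be compared with a rough timing model. This is a sketch under stated assumptions, not part of the original answer: the recipe gives only the 60-minute bake time, so the mixing time `mix_min` (15 minutes in the example) is hypothetical, preheating and pan handling are ignored, and the function names are illustrative.

```python
BAKE_MIN = 60  # bake time per cake, from the recipe


def serial_time(cakes, mix_min):
    """No overlap: mix and bake each cake before starting the next."""
    return cakes * (mix_min + BAKE_MIN)


def pipelined_time(cakes, mix_min):
    """One cook, one bowl, one pan, one single-cake oven (as in 6.2.1).

    Mixing the next cake overlaps with baking the current one. The bowl is
    freed when its batter goes into the pan, i.e. when baking starts.
    """
    bowl_free = 0
    oven_free = 0
    for _ in range(cakes):
        mixed = bowl_free + mix_min          # batter ready in the bowl
        bake_start = max(mixed, oven_free)   # wait for the oven if busy
        bowl_free = bake_start
        oven_free = bake_start + BAKE_MIN
    return oven_free


def parallel_time(mix_min):
    """Three cooks and an oven that holds all cakes (as in 6.2.3)."""
    return mix_min + BAKE_MIN


mix_min = 15  # hypothetical mixing time in minutes
print("serial:   ", serial_time(3, mix_min))
print("pipelined:", pipelined_time(3, mix_min))
print("parallel: ", parallel_time(mix_min))
```

With the assumed 15-minute mixing time, the pipelined schedule is limited by the three back-to-back bakes, which is why extra bowls, pans, and mixers alone change little, while the larger oven and extra cooks in 6.2.3 remove that bottleneck.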
https://www.chegg.com/homework-help/questions-and-answers/56-exercise-look-different-ways-capacity-affects-overall-performance-general-cache-access--q6275583 - ANSWER Note: good https://www.chegg.com/homework-help/questions-and-answers/56-exercise-look-different-ways-capacity-affects-overall-performance-general-cache-access--q6275583