The amount of instruction level parallelism (ILP) that can be exploited depends
greatly on the size of the instruction window and the number of in-flight instructions
the processor can support. However, this requires a register file with a large set of
physical registers for renaming and multiple ports to provide...
The motivation of this research is to study different cache designs for on-chip
caches that improve processor performance and at the same time
minimize the degradation to system performance caused by an increase in
the processor memory traffic. As VLSI technology advances we can have
bigger and more complex on-chip...
The purpose of this thesis is to explore methods which can reduce the power dissipation of a mobile system while decoding MPEG video. MPEG decoding is a microprocessor intensive process that makes heavy use of both the L1 and L2 caches as well as main memory. The heavy load placed...
Conventional register files spread porting resources uniformly across all registers. This paper proposes a method called Asymmetric Clustering using a Register Cache (ACRC). ACRC utilizes a fast register cache that concentrates valuable register file ports to the most active registers thereby reducing the total register file area and power consumption....
Computers using the tagged-token dataflow model are among the best candidates for delivering extremely high levels of performance required in the future. Instruction scheduling in these computers is determined by associatively matching data-bearing tokens in a Waiting-Matching Unit (W-M unit). At the W-M unit, incoming tokens with matching contexts are...