The amount of instruction-level parallelism (ILP) that can be exploited depends
greatly on the size of the instruction window and the number of in-flight instructions
the processor can support. However, a large window requires a register file with a large set of
physical registers for renaming and multiple ports to provide...
Conventional register files spread porting resources uniformly across all registers. This paper proposes a method called Asymmetric Clustering using a Register Cache (ACRC). ACRC uses a fast register cache that concentrates valuable register file ports on the most active registers, thereby reducing the total register file area and power consumption....
This thesis explores dependency speculation in Dynamic Simultaneous Multi-Threading (DSMT). DSMT is a microprocessor architecture that attempts to extract Thread-Level Parallelism (TLP) from single-threaded programs at run time by running multiple iterations of program loops in parallel. The DSMT architecture was originally...