This thesis presents a novel methodology that enables power-efficient video decoding
in an embedded system based on an MPSoC (Multiprocessor System on Chip). This
methodology is a physical combination of parallel processing, which reduces the power
consumption of processors by exploiting thread-level parallelism, and Dynamic
Voltage and Frequency Scaling (DVFS), which allows...
The amount of instruction-level parallelism (ILP) that can be exploited depends
greatly on the size of the instruction window and the number of in-flight instructions
the processor can support. However, this requires a register file with a large set of
physical registers for renaming and multiple ports to provide...
NAND flash-based solid-state drives (SSDs) require out-of-place updating due to the characteristics of flash memory. In addition, because of the mismatched granularity between read/write and erase operations, a cleaning policy involving garbage collection and wear leveling has to perform data migration, which incurs high overhead. Another challenge is that...
MANETs are known to be useful in situations where mobile nodes need to communicate and coordinate in dynamic environments with no access to fixed network infrastructure. However, connectivity problems can occur when sub-groups within a MANET move out of communication range from one another. The increasingly prolific use of UAVs...
The purpose of this thesis is to explore methods that can reduce the power dissipation of a mobile system while decoding MPEG video. MPEG decoding is a microprocessor-intensive process that makes heavy use of both the L1 and L2 caches as well as main memory. The heavy load placed...
Dynamic multithreaded processors attempt to increase the performance of a single
sequential program by dynamically extracting threads from sources such as loop
iterations. The scheduling of instructions in such a processor plays a vital role in the
amount of thread-level parallelism that can be extracted and thus the overall...
Conventional register files spread porting resources uniformly across all registers. This paper proposes a method called Asymmetric Clustering using a Register Cache (ACRC). ACRC uses a fast register cache that concentrates valuable register file ports on the most active registers, thereby reducing the total register file area and power consumption....
General-purpose computer systems have seen increased performance potential through the parallel processing capabilities of multicore processors. Yet this potential performance can only be attained through parallel applications, thus forcing software developers to rethink how everyday applications are designed. The most readily available form of Thread Level Parallelism (TLP) within any...
The ubiquity of high-quality video and the proliferation of mobile devices have contributed to an unprecedented rise in video consumption. HTTP, in conjunction with adaptive streaming, has become the de facto mechanism for delivering the vast majority of video, as it readily caters to heterogeneous networks and devices. This dissertation...
IO transactions within a computer system have evolved along with other system components (e.g., CPU, memory, video) from programmed IO (PIO). In current mainstream systems (spanning from HPC to mobile), IO transactions are CPU-centric, descriptor-based DMA transactions. The key benefit is that slower IO devices can DMA write system...