The dynamic speculation and performance prediction of parallel loops

http://ir.library.oregonstate.edu/concern/graduate_thesis_or_dissertations/z890rw34f

Descriptions

Abstract or Summary
  • General-purpose computer systems have seen increased performance potential through the parallel processing capabilities of multicore processors. Yet this potential can only be attained through parallel applications, forcing software developers to rethink how everyday applications are designed. The most readily available form of Thread-Level Parallelism (TLP) in any program comes from loops. Unfortunately, the majority of loops cannot be easily multithreaded due to inter-iteration dependencies, conditional statements, nested functions, and dynamic memory allocation. This dissertation seeks to understand the fundamental characteristics and relationships of loops in order to assist programmers and compilers in exploiting TLP. First, this dissertation explores a hardware solution that exploits TLP through Dynamic Speculative Multithreading (D-SpMT), which can extract multiple threads from a sequential program without compiler support or instruction set extensions. This dissertation presents Cascadia, a D-SpMT multicore architecture that provides multi-grain thread-level support. Cascadia applies a unique sustainable IPC (sIPC) metric on a comprehensive loop tree to select the best-performing nested loop level to multithread. Results showed that Cascadia can extract large amounts of TLP but ultimately yielded only moderate performance gains. The lack of overall performance gains exhibited by Cascadia was due to the sequential nature of the applications, rather than Cascadia's ability to perform D-SpMT. In order to fully exploit TLP through loops, some loop-level analysis and transformation must first be performed. Therefore, the second contribution of this dissertation is the development of several theoretical methodologies to aid programmers and auto-tuners in parallelizing loops. This work found that inter-iteration dependencies have a twofold effect on a loop's parallel performance.
First, the performance is primarily affected by a single, dominant dependency, and it is the execution of the dominant dependency path that directly determines the parallel performance of the loop. Second, any additional dependencies cause a secondary effect that may increase the execution time due to relative dependency path differences. Furthermore, this study analyzes the effects of non-ideal conditions, such as a limited number of processors, multithreading overhead, and irregular loop structures.
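The dominant-dependency observation in the abstract can be illustrated with a toy model. The sketch below is not the dissertation's methodology: it assumes one processor per iteration, no threading overhead, and that each cross-iteration dependency only delays the start of the consuming iteration by a fixed amount. The names `parallel_time` and `dominant_dependency`, and the `(distance, delay)` encoding of a dependency, are hypothetical and chosen for illustration only.

```python
from typing import List, Tuple

# Hypothetical encoding: (distance in iterations, delay added to the consumer's start).
Dependency = Tuple[int, float]


def parallel_time(n_iters: int, iter_time: float, deps: List[Dependency]) -> float:
    """Finish time of a loop under the toy model (one processor per iteration)."""
    if n_iters <= 0:
        return 0.0
    start = [0.0] * n_iters
    for i in range(n_iters):
        for distance, delay in deps:
            producer = i - distance
            if producer >= 0:
                # Iteration i may not start before its producer has run for `delay`.
                start[i] = max(start[i], start[producer] + delay)
    # The loop finishes when the last iteration finishes.
    return start[-1] + iter_time


def dominant_dependency(deps: List[Dependency]) -> Dependency:
    """Dependency with the largest delay per iteration of distance."""
    return max(deps, key=lambda d: d[1] / d[0])


if __name__ == "__main__":
    deps = [(2, 5.0), (1, 1.0)]
    dom = dominant_dependency(deps)
    print("dominant dependency:", dom)
    print("dominant only:", parallel_time(6, 10.0, [dom]))
    print("all dependencies:", parallel_time(6, 10.0, deps))
```

In this toy setting, six iterations of length 10 finish at time 20.0 when only the dominant dependency `(2, 5.0)` is present, and at 21.0 once the weaker dependency `(1, 1.0)` is added. The dominant path sets most of the execution time, and the extra dependency adds a smaller, secondary delay.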
Additional Information
  • description.provenance : Submitted by David Zier (zierd@onid.orst.edu) on 2009-05-06T19:39:34Z No. of bitstreams: 1 zier_osu_dissertation_2009.pdf: 1475503 bytes, checksum: f37063c7d5b4df5b2a5fb27b9e84b19c (MD5)
  • description.provenance : Approved for entry into archive by Julie Kurtz (julie.kurtz@oregonstate.edu) on 2009-05-07T23:36:36Z (GMT) No. of bitstreams: 1 zier_osu_dissertation_2009.pdf: 1475503 bytes, checksum: f37063c7d5b4df5b2a5fb27b9e84b19c (MD5)
  • description.provenance : Approved for entry into archive by Laura Wilson (laura.wilson@oregonstate.edu) on 2009-05-13T22:51:09Z (GMT) No. of bitstreams: 1 zier_osu_dissertation_2009.pdf: 1475503 bytes, checksum: f37063c7d5b4df5b2a5fb27b9e84b19c (MD5)
  • description.provenance : Made available in DSpace on 2009-05-13T22:51:09Z (GMT). No. of bitstreams: 1 zier_osu_dissertation_2009.pdf: 1475503 bytes, checksum: f37063c7d5b4df5b2a5fb27b9e84b19c (MD5)

Last modified: 08/01/2017
