Locality-aware Task Management on Many-core Processors

Author	: Richard Myungon Yoo
Release	: 2012
OCLC	: 794826166

Book Synopsis: Locality-aware Task Management on Many-core Processors by Richard Myungon Yoo

The landscape of computing is changing. Due to limits in transistor scaling, the traditional approach of exploiting instruction-level parallelism through wide-issue, out-of-order execution cores now provides diminishing performance gains. As a result, computer architects rely on thread-level parallelism to obtain sustainable performance improvement. In particular, many-core processors are designed to exploit parallelism by implementing multiple cores that can execute in parallel. Both industry and academia agree that scaling the number of cores to hundreds or thousands is the only way to scale performance from now on. However, such a shift in design increases the demands on the memory system. As a result, the cache hierarchies on many-core processors are becoming larger and increasingly complex. Such cache hierarchies suffer from high latency and energy consumption, and non-uniform memory access effects become prevalent. Traditionally, exploiting locality was an optional way to reduce execution time and energy consumption. On a complex many-core cache hierarchy, however, failing to exploit locality may leave more cores stalled, thereby undermining the very viability of parallelism. Locality can be exploited at various hardware and software layers. By implementing private and shared caches in a multi-level fashion, recent hardware designs are already optimized for locality. However, this is of little use if the software scheduling does not arrange execution in a manner that promotes the locality available in the programs themselves. In particular, the recent proliferation of runtime-based programming systems further stresses the importance of locality-aware scheduling.
Although many efforts have been made to exploit locality in a runtime, they fail to take the underlying cache hierarchy into consideration, are limited to specific programming models, and suffer high management costs. This thesis shows that locality-aware schedules can be generated at low cost by utilizing high-level information. In particular, by optimizing a MapReduce runtime on a multi-socket many-core system, we show that runtimes can leverage explicit producer-consumer information to exploit locality. Specifically, locality on the data structures that buffer intermediate results becomes particularly important. In addition, the optimization should be performed across all the software layers. To handle the case where explicit data dependency information is not available, we develop a graph-based locality analysis framework that allows key scheduling attributes to be analyzed independently of hardware specifics and scale. Using the framework, we also develop a reference scheduling scheme that shows significant performance improvement and energy savings. We then develop a novel class of practical locality-aware task managers that leverage workload pattern information and simple locality hints to approximate the reference scheduling scheme. Through experiments, we show that the quality of the generated schedules can match that of the reference scheme, and that the schedule generation costs are minimal. While exploiting significant locality, these managers keep the simple task programming interface intact. We also point out that task stealing can be made compatible with locality-aware scheduling. Traditional task management schemes assumed a fundamental tradeoff between locality and load balance, and favored one at the expense of the other. We show that a stealing scheme can be made locality-aware by trying to preserve the original schedule while transferring tasks for load balancing.
In summary, utilizing high-level information allows the construction of efficient locality-aware task management schemes that make programs run faster while consuming less energy.
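
The idea of combining locality hints with stealing that preserves the original schedule can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the integer or string "hint" per task, the block placement of groups, the helper names `build_queues` and `steal_group`, and the rule of stealing a whole group from the most loaded worker are all assumptions made for illustration.

```python
import math
from collections import defaultdict, deque


def build_queues(tasks, num_workers):
    """Group tasks by locality hint and place groups on workers.

    tasks: list of (task_id, hint) pairs. The hint is a hypothetical
    programmer-supplied value; tasks with equal hints are assumed to
    touch the same data. Groups with neighbouring hints are placed on
    the same worker in contiguous blocks.
    """
    groups = defaultdict(list)
    for task_id, hint in tasks:
        groups[hint].append(task_id)

    hints = sorted(groups)
    # Number of groups per worker (at least 1 to avoid division by zero).
    block = max(1, math.ceil(len(hints) / num_workers))
    queues = [deque() for _ in range(num_workers)]
    for i, hint in enumerate(hints):
        queues[i // block].append((hint, groups[hint]))
    return queues


def steal_group(queues, thief):
    """Move one whole group from the most loaded worker to the thief.

    Stealing a complete group from the victim's tail keeps tasks that
    share a hint together and leaves the victim's remaining order
    unchanged. Returns the stolen group, or None if nothing was stolen.
    """
    def load(worker):
        return sum(len(ids) for _, ids in queues[worker])

    victim = max(range(len(queues)), key=load)
    # Do not steal from ourselves or take a victim's only group.
    if victim == thief or len(queues[victim]) < 2:
        return None
    group = queues[victim].pop()
    queues[thief].append(group)
    return group
```

Calling `build_queues(tasks, num_workers)` and then `steal_group(queues, thief)` for an idle worker moves tasks in units of one hint group, so load is balanced without scattering tasks that were scheduled together.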

Related Books

Task Scheduling for Multi-core and Parallel Architectures

Locality-aware Cache Hierarchy Management for Multicore Processors

Euro-Par 2018: Parallel Processing

OpenMP: Memory, Devices, and Tasks