Reinforcement learning-based inter- and intra-application thermal optimization for lifetime improvement of multicore systems
Conference proceeding   Open access

Anup Das, Rishad A Shafik, Geoff V Merrett, Bashir M Al-Hashimi, Akash Kumar, Bharadwaj Veeravalli and IEEE
2014 51st ACM/EDAC/IEEE Design Automation Conference (DAC), pp 1-6
Jun 2014
url
https://eprints.soton.ac.uk/362012/1/dac14.pdf
Accepted (AM), Open Access (License Unspecified)

Abstract

Legged locomotion; Linux; Reliability; Temperature measurement; Temperature sensors
The thermal profile of multicore systems varies both within an application's execution (intra) and when the system switches from one application to another (inter). In this paper, we propose an adaptive thermal management approach to improve the lifetime reliability of multicore systems by considering both inter- and intra-application thermal variations. Fundamental to this approach is a reinforcement learning algorithm, which learns the relationship between the mapping of threads to cores, the frequency of a core and its temperature (sampled from on-board thermal sensors). Action is provided by overriding the operating system's mapping decisions using affinity masks and dynamically changing CPU frequency using in-kernel governors. Lifetime improvement is achieved by controlling not only the peak and average temperatures but also thermal cycling, which is an emerging wear-out concern in modern systems. The proposed approach is validated experimentally using an Intel quad-core platform executing a diverse set of multimedia benchmarks. Results demonstrate that the proposed approach minimizes average temperature, peak temperature and thermal cycling, improving the mean-time-to-failure (MTTF) by an average of 2× for intra-application and 3× for inter-application scenarios when compared to existing thermal management techniques. Furthermore, dynamic and static energy consumption are also reduced by an average of 10% and 11%, respectively.
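
To make the learning loop described in the abstract more concrete, the following is a minimal sketch of tabular Q-learning over (thread mapping, frequency) actions with temperature as the state. It is not the authors' algorithm: the temperature bins, frequency levels, reward weights, and the `toy_thermal_model` function are all illustrative assumptions, and the toy model stands in for real sensor readings. On real hardware, the chosen action would instead be applied through affinity masks and a frequency governor, as the abstract states.

```python
import random
from itertools import product

# Hypothetical action space: (mapping id, core frequency in MHz).
MAPPINGS = [0, 1, 2]             # placeholder ids for affinity-mask choices
FREQS_MHZ = [1600, 2400, 3400]   # illustrative frequency levels
ACTIONS = list(product(MAPPINGS, FREQS_MHZ))

TEMP_BINS = [50, 60, 70, 80]     # assumed bin edges in degrees Celsius


def temp_to_state(temp_c):
    """Discretize a temperature reading into a state index (0..len(TEMP_BINS))."""
    return sum(temp_c >= edge for edge in TEMP_BINS)


def reward(temp_c, prev_temp_c, freq_mhz):
    """Toy reward: penalize high temperature and large temperature swings
    (a crude proxy for thermal cycling), and mildly reward higher frequency."""
    heat_penalty = (temp_c - 50.0) / 10.0
    cycling_penalty = abs(temp_c - prev_temp_c) / 5.0
    performance_bonus = freq_mhz / max(FREQS_MHZ)
    return performance_bonus - heat_penalty - cycling_penalty


def toy_thermal_model(temp_c, action, rng):
    """Stand-in for sensor sampling: temperature relaxes toward a target
    that depends on the chosen mapping and frequency (assumed dynamics)."""
    mapping, freq_mhz = action
    target = 40.0 + 12.0 * (freq_mhz / 1000.0) - 3.0 * mapping
    return temp_c + 0.5 * (target - temp_c) + rng.uniform(-0.5, 0.5)


def train(episodes=200, steps=50, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Run epsilon-greedy tabular Q-learning and return the Q-table."""
    rng = random.Random(seed)
    n_states = len(TEMP_BINS) + 1
    q = [[0.0] * len(ACTIONS) for _ in range(n_states)]
    for _ in range(episodes):
        temp = 55.0
        state = temp_to_state(temp)
        for _ in range(steps):
            # Explore with probability epsilon, otherwise act greedily.
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))
            else:
                a = max(range(len(ACTIONS)), key=lambda i: q[state][i])
            new_temp = toy_thermal_model(temp, ACTIONS[a], rng)
            r = reward(new_temp, temp, ACTIONS[a][1])
            new_state = temp_to_state(new_temp)
            # Standard Q-learning update toward the bootstrapped target.
            td_target = r + gamma * max(q[new_state])
            q[state][a] += alpha * (td_target - q[state][a])
            temp, state = new_temp, new_state
    return q


if __name__ == "__main__":
    q_table = train()
    for s, row in enumerate(q_table):
        best = max(range(len(ACTIONS)), key=lambda i: row[i])
        print(f"state {s}: preferred action {ACTIONS[best]}")
```

Folding a temperature-swing term into the reward is one simple way to discourage thermal cycling alongside peak and average temperature; the relative weights used here are arbitrary and would need tuning against measured data.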

Metrics

4 Record Views
48 citations in Scopus

Details

InCites Highlights

Data related to this publication, from InCites Benchmarking & Analytics tool:

Collaboration types
Domestic collaboration
International collaboration
Web of Science research areas
Automation & Control Systems
Engineering, Electrical & Electronic