Logo image
Enabling and Exploiting Partition-Level Parallelism (PALP) in Phase Change Memories
Journal article   Open access   Peer reviewed

Enabling and Exploiting Partition-Level Parallelism (PALP) in Phase Change Memories

Shihao Song, Anup Das, Onur Mutlu and Nagarajan Kandasamy
ACM transactions on embedded computing systems, v 18(5), pp 1-25
01 Oct 2019
url
http://arxiv.org/abs/1908.07966View

Abstract

Computer Science Computer Science, Hardware & Architecture Computer Science, Software Engineering Science & Technology Technology
Phase-change memory (PCM) devices have multiple banks to serve memory requests in parallel. Unfortunately, if two requests go to the same bank, they have to be served one after another, leading to lower system performance. We observe that a modern PCM bank is implemented as a collection of partitions that operate mostly independently while sharing a few global peripheral structures, which include the sense amplifiers (to read) and the write drivers (to write). Based on this observation, we propose PALP, a new mechanism that enables partition-level parallelism within each PCM bank, and exploits such parallelism by using the memory controller's access scheduling decisions. PALP consists of three new contributions. First, we introduce new PCM commands to enable parallelism in a bank's partitions in order to resolve the read-write bank conflicts, with no changes needed to PCM logic or its interface. Second, we propose simple circuit modifications that introduce a new operating mode for the write drivers, in addition to their default mode of serving write requests. When configured in this new mode, the write drivers can resolve the read-read bank conflicts, working jointly with the sense amplifiers. Finally, we propose a new access scheduling mechanism in PCM that improves performance by prioritizing those requests that exploit partition-level parallelism over other requests, including the long outstanding ones. While doing so, the memory controller also guarantees starvation-freedom and the PCM's running-average-power-limit (RAPL). We evaluate PALP with workloads from the MiBench and SPEC CPU2017 Benchmark suites. Our results show that PALP reduces average PCM access latency by 23%, and improves average system performance by 28% compared to the state-of-the-art approaches.

Metrics

14 Record Views
26 citations in Scopus

Details

UN Sustainable Development Goals (SDGs)

This publication has contributed to the advancement of the following goals:

#11 Sustainable Cities and Communities

InCites Highlights

Data related to this publication, from InCites Benchmarking & Analytics tool:

Collaboration types
Domestic collaboration
International collaboration
Web of Science research areas
Computer Science, Hardware & Architecture
Computer Science, Software Engineering
Logo image