


PPoPP 2020: San Diego, CA, USA
- Rajiv Gupta, Xipeng Shen:
PPoPP '20: 25th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, San Diego, California, USA, February 22-26, 2020. ACM 2020, ISBN 978-1-4503-6818-6
Research Articles
- Vasilis Gavrielatos, Antonios Katsarakis, Vijay Nagarajan, Boris Grot, Arpit Joshi:
Kite: efficient and available release consistency for the datacenter. 1-16
- Hagar Meir, Dmitry Basin, Edward Bortnikov, Anastasia Braginsky, Yonatan Gottesman, Idit Keidar, Eran Meir, Gali Sheffi, Yoav Zuriel:
Oak: a scalable off-heap allocated key-value map. 17-31
- Da Yan, Wei Wang, Xiaowen Chu:
Optimizing batched Winograd convolution on GPUs. 32-44
- Shigang Li, Tal Ben-Nun, Salvatore Di Girolamo, Dan Alistarh, Torsten Hoefler:
Taming unbalanced training workloads in deep learning with partial collective operations. 45-61
- Gali Sheffi, Dmitry Basin, Edward Bortnikov, David Carmel, Idit Keidar:
Scalable top-k retrieval with Sparta. 62-73
- Jiannan Tian, Sheng Di, Chengming Zhang, Xin Liang, Sian Jin, Dazhao Cheng, Dingwen Tao, Franck Cappello:
waveSZ: a hardware-algorithm co-design of efficient lossy compression for scientific data. 74-88
- Or Ostrovsky, Adam Morrison:
Scaling concurrent queues by using HTM to profit from failed atomic operations. 89-101
- Andreia Correia, Pedro Ramalhete, Pascal Felber:
A wait-free universal construction for large objects. 102-116
- Arik Rinberg, Alexander Spiegelman, Edward Bortnikov, Eshcar Hillel, Idit Keidar, Lee Rhodes, Hadar Serviansky:
Fast concurrent data sketches. 117-129
- Ruslan Nikolaev, Binoy Ravindran:
Universal wait-free memory reclamation. 130-143
- Lai Wei, John M. Mellor-Crummey:
Using sample-based time series data for automated diagnosis of scalability losses in parallel programs. 144-159
- Yang Xia, Peng Jiang, Gagan Agrawal:
Scaling out speculative execution of finite-state machines with parallel merge. 160-172
- Sonali Saha, V. Krishna Nandivada:
On the fly MHP analysis. 173-186
- Daniel DeFreez, Antara Bhowmick, Ignacio Laguna, Cindy Rubio-González:
Detecting and reproducing error-code propagation bugs in MPI implementations. 187-201
- Omar Inverso, Catia Trubiani:
Parallel and distributed bounded model checking of multi-threaded programs. 202-216
- Yifan Xu, Kyle Singer, I-Ting Angelina Lee:
Parallel determinacy race detection for futures. 217-231
- Julian Shun:
Practical parallel hypergraph algorithms. 232-249
- Piyush Sao, Ramakrishnan Kannan, Prasun Gera, Richard W. Vuduc:
A supernodal all-pairs shortest path algorithm. 250-261
- Ghadeer Alabandi, Evan Powers, Martin Burtscher:
Increasing the parallelism of graph coloring via shortcutting. 262-275
- Trevor Brown, Aleksandar Prokopec, Dan Alistarh:
Non-blocking interpolation search trees with doubly-logarithmic running time. 276-291
- Blair Archibald, Patrick Maier, Rob Stewart, Phil Trinder:
YewPar: skeletons for exact combinatorial search. 292-307
- Chuzhe Tang, Youyun Wang, Zhiyuan Dong, Gansen Hu, Zhaoguo Wang, Minjie Wang, Haibo Chen:
XIndex: a scalable learned index for multicore data storage. 308-320
- Jaehoon Jung, Daeyoung Park, Youngdong Do, Jungho Park, Jaejin Lee:
Overlapping host-to-device copy and computation using hidden unified memory. 321-335
- Khaled Hamidouche, Michael LeBeane:
GPU initiated OpenSHMEM: correct and efficient intra-kernel networking for dGPUs. 336-347
- Nian Liu, Binyu Zang, Haibo Chen:
No barrier in the road: a comprehensive study and optimization of ARM barriers. 348-361
- Mathias Parger, Martin Winter, Daniel Mlakar, Markus Steinberger:
spECK: accelerating GPU sparse matrix-matrix multiplication through lightweight analysis. 362-375
- Peng Jiang, Changwan Hong, Gagan Agrawal:
A novel data transformation and execution strategy for accelerating sparse matrix multiplication on GPUs. 376-388
- Bangtian Liu, Kazem Cheshmi, Saeed Soori, Michelle Mills Strout, Maryam Mehri Dehnavi:
MatRox: modular approach for improving data locality in hierarchical (Mat)rix App(Rox)imation. 389-402
Posters
- Jiajia Li, Mahesh Lakshminarasimhan, Xiaolong Wu, Ang Li, Catherine Olschanowsky, Kevin J. Barker:
A parallel sparse tensor benchmark suite on CPUs and GPUs. 403-404
- Gal Assa, Hagar Meir, Guy Golan-Gueta, Idit Keidar, Alexander Spiegelman:
Nesting and composition in transactional data structure libraries. 405-406
- Shilong Wang, Da Li, Hengyong Yu, Hang Liu:
ELDA: LDA made efficient via algorithm-system codesign submission. 407-408
- Yuyang Jin, Haojie Wang, Xiongchao Tang, Torsten Hoefler, Xu Liu, Jidong Zhai:
Identifying scalability bottlenecks for large-scale parallel programs with graph analysis. 409-410
- Chaoyang Shui, Xianzhi Yu, Yujin Yan, Yinshan Wang, Ke Meng, Guangming Tan:
Revisiting linpack algorithm on large-scale CPU-GPU heterogeneous systems. 411-412
- Xiaohui Duan, Ping Gao, Meng Zhang, Tingjian Zhang, Hongsong Meng, Yuxuan Li, Bertil Schmidt, Haohuan Fu, Lin Gan, Wei Xue, Guangwen Yang, Weiguo Liu:
Neighbor-list-free molecular dynamics on sunway TaihuLight supercomputer. 413-414
- Keren Zhou, Mark Krentel, John M. Mellor-Crummey:
A tool for top-down performance analysis of GPU-accelerated applications. 415-416
- Gali Sheffi, Erez Petrank:
Functional faults. 417-418
- Jaume Bosch, Miquel Vidal, Antonio Filgueras, Carlos Álvarez, Daniel Jiménez-González, Xavier Martorell, Eduard Ayguadé:
Breaking master-slave model between host and FPGAs. 419-420
- Wentao Cai, Haosen Wen, H. Alan Beadle, Mohammad Hedayati, Michael L. Scott:
Understanding and optimizing persistent memory allocation. 421-422
- Nikita Koval, Maria Sokolova, Alexander Fedorov, Dan Alistarh, Dmitry Tsitelov:
Testing concurrency on the JVM with lincheck. 423-424
- Samuel Thayer, Ganesh Gopalakrishnan, Ian Briggs, Michael Bentley, Dong H. Ahn, Ignacio Laguna, Gregory L. Lee:
ArcherGear: data race equivalencing for expeditious HPC debugging. 425-426
- Abdullah Al-Mamun, Jialin Liu, Tonglin Li, Quincey Koziol, Zhongyi Zhai, Junyan Qian, Haoting Shen, Dongfang Zhao:
Reflector: a fine-grained I/O tracker for HPC systems. 427-428
- H. Alan Beadle, Wentao Cai, Haosen Wen, Michael L. Scott:
Nonblocking persistent software transactional memory. 429-430
- Aleksey Tyurin, Daniil Berezun, Semyon V. Grigorev:
Optimizing GPU programs by partial evaluation. 431-432
- Nikita Koval, Vitaly Aksenov:
Restricted memory-friendly lock-free bounded queues. 433-434
- Abdullah Al Raqibul Islam, Dong Dai:
Understand the overheads of storage data structures on persistent memory. 435-436
- Fangzhou Liu, Dong Chen, Wesley Smith, Chen Ding:
PLUM: static parallel program locality analysis under uniform multiplexing. 437-438















