24th PPoPP 2019: Washington, DC, USA
- Jeffrey K. Hollingsworth, Idit Keidar: Proceedings of the 24th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP 2019, Washington, DC, USA, February 16-20, 2019. ACM 2019, ISBN 978-1-4503-6225-2
- Joel Hestness, Newsha Ardalani, Gregory F. Diamos: Beyond human-level accuracy: computational challenges in deep learning. 1-14
- Junmin Xiao, Shijie Wang, Weiqiang Wan, Xuehai Hong, Guangming Tan: S-EnKF: co-designing for scalable ensemble Kalman filter. 15-26
- Isaac Gelado, Michael Garland: Throughput-oriented GPU memory allocation. 27-37
- Hao Wang, Liang Geng, Rubao Lee, Kaixi Hou, Yanfeng Zhang, Xiaodong Zhang: SEP-graph: finding shortest execution paths for graph processing under a hybrid framework on GPU. 38-52
- Troels Henriksen, Frederik Thorøe, Martin Elsman, Cosmin E. Oancea: Incremental flattening for nested data parallelism. 53-67
- Martin Winter, Daniel Mlakar, Rhaleb Zayer, Hans-Peter Seidel, Markus Steinberger: Adaptive sparse matrix-matrix multiplication on the GPU. 68-81
- Brijesh Dongol, Radha Jagadeesan, James Riely: Modular transactions: bounding mixed races in space and time. 82-93
- Ryan Yates, Michael L. Scott: Leveraging hardware TM in Haskell. 94-106
- Ricardo Filipe, Shady Issa, Paolo Romano, João Barreto: Stretching the capacity of hardware transactional memory in IBM POWER architectures. 107-119
- Mohamed M. Saad, Masoomeh Javidi Kishi, Shihao Jing, Sandeep Hans, Roberto Palmieri: Processing transactions in a predefined order. 120-132
- Zhaofeng Yan, Yuzhe Lin, Lu Peng, Weihua Zhang: Harmonia: a high throughput B+tree for GPUs. 133-144
- Muhammad A. Awad, Saman Ashkiani, Rob Johnson, Martin Farach-Colton, John D. Owens: Engineering a high-performance GPU B-Tree. 145-157
- Xiaokang Hu, Changzheng Wei, Jian Li, Brian Will, Ping Yu, Lu Gong, Haibing Guan: QTLS: high-performance TLS asynchronous offload framework with Intel® QuickAssist technology. 158-172
- Fabian Gruber, Manuel Selva, Diogo Sampaio, Christophe Guillon, Antoine Moynault, Louis-Noël Pouchet, Fabrice Rastello: Data-flow/dependence profiling for structured transformations. 173-185
- Qingsen Wang, Pengfei Su, Milind Chabbi, Xu Liu: Lightweight hardware transactional memory profiling. 186-200
- Ke Meng, Jiajia Li, Guangming Tan, Ninghui Sun: A pattern based algorithmic autotuner for graph processing on GPUs. 201-213
- Umut A. Acar, Vitaly Aksenov, Arthur Charguéraud, Mike Rainey: Provably and practically efficient granularity control. 214-228
- Xiuhong Li, Yun Liang, Shengen Yan, Liancheng Jia, Yinghan Li: A coordinated tiling and batching framework for efficient GEMM on GPUs. 229-241
- Qi Zhao, Zhengyi Qiu, Guoliang Jin: Semantics-aware scheduling policies for synchronization determinism. 242-256
- Kyle Singer, Yifan Xu, I-Ting Angelina Lee: Proactive work stealing for futures. 257-271
- Loc Hoang, Matteo Pontecorvi, Roshan Dathathri, Gurbinder Gill, Bozhi You, Keshav Pingali, Vijaya Ramachandran: A round-efficient distributed betweenness centrality algorithm. 272-286
- Martin Küttler, Maksym Planeta, Jan Bierbaum, Carsten Weinhold, Hermann Härtig, Amnon Barak, Torsten Hoefler: Corrected trees for reliable group communication. 287-299
- Changwan Hong, Aravind Sukumaran-Rajam, Israt Nisa, Kunal Singh, P. Sadayappan: Adaptive sparse tiling for sparse matrix multiplication. 300-314
- Martin Bättig, Thomas R. Gross: Encapsulated open nesting for STM: fine-grained higher-level conflict detection. 315-326
- Herbert Jordan, Pavle Subotic, David Zhao, Bernhard Scholz: A specialized B-tree for concurrent datalog evaluation. 327-339
- Robert Utterback, Kunal Agrawal, Jeremy T. Fineman, I-Ting Angelina Lee: Efficient race detection with futures. 340-354
- Simon Doherty, Brijesh Dongol, Heike Wehrheim, John Derrick: Verifying C11 programs operationally. 355-365
- Burcu Kulahcioglu Ozkan, Rupak Majumdar, Filip Niksic: Checking linearizability using hitting families. 366-377
- Caleb Voss, Tiago Cogumbreiro, Vivek Sarkar: Transitive joins: a sound and efficient online deadlock-avoidance policy. 378-390
- Jiawen Sun, Hans Vandierendonck, Dimitrios S. Nikolopoulos: VEBO: a vertex- and edge-balanced ordering heuristic to load balance parallel graph processing. 391-392
- Kartik Lakhotia, Rajgopal Kannan, Sourav Pati, Viktor K. Prasanna: GPOP: a cache and memory-efficient framework for graph processing over partitions. 393-394
- Somesh Singh, Rupesh Nasre: Optimizing graph processing on GPUs using approximate computing: poster. 395-396
- Jinrong Guo, Wantao Liu, Wang Wang, Qu Lu, Songlin Hu, Jizhong Han, Ruixuan Li: A GPU memory efficient speed-up scheme for training ultra-deep neural networks: poster. 397-398
- Yuki Ito, Haruki Imai, Tung D. Le, Yasushi Negishi, Kiyokuni Kawachiya, Ryo Matsumiya, Toshio Endo: Profiling based out-of-core hybrid method for large neural networks: poster. 399-400
- Xiao Dong, Lei Liu, Guangli Li, Jiansong Li, Peng Zhao, Xueying Wang, Xiaobing Feng: Exploiting the input sparsity to accelerate deep neural networks: poster. 401-402
- Peng Jiang, Gagan Agrawal: Accelerating distributed stochastic gradient descent with adaptive periodic parameter averaging: poster. 403-404
- Putt Sakdhnagool, Amit Sabne, Rudolf Eigenmann: Optimizing GPU programs by register demotion: poster. 405-406
- Yubin Chen, Zhuocheng Ding, Jin Zhang, Yun Wang, Zhengwei Qi, Haibing Guan: A distributed hypervisor for resource aggregation: poster. 407-408
- Mohamed Lamine Karaoui, Anthony Carno, Robert Lyerly, Sang-Hoon Kim, Pierre Olivier, Changwoo Min, Binoy Ravindran: Scheduling HPC workloads on heterogeneous-ISA architectures: poster. 409-410
- Da Yan, Guimu Guo, Md Mashiur Rahman Chowdhury, M. Tamer Özsu, John C. S. Lui, Weida Tan: T-thinker: a task-centric distributed framework for compute-intensive divide-and-conquer algorithms. 411-412
- Mohammad Mahdi Javanmard, Pramod Ganapathi, Rathish Das, Zafar Ahmad, Stephen L. Tschudi, Rezaul Chowdhury: Toward efficient architecture-independent algorithms for dynamic programs: poster. 413-414
- Emilio Castillo, Nikhil Jain, Marc Casas, Miquel Moretó, Martin Schulz, Ramón Beivide, Mateo Valero, Abhinav Bhatele: Optimizing computation-communication overlap in asynchronous task-based programs: poster. 415-416
- Nikita Koval, Dan Alistarh, Roman Elizarov: Lock-free channels for programming via communicating sequential processes: poster. 417-418
- Naama Ben-David, Guy E. Blelloch, Michal Friedman, Yuanhao Wei: Making concurrent algorithms detectable: poster. 419-420
- Kunpeng Wang, Shizhen Xu, Hongkun Yu, Haohuan Fu, Guangwen Yang: GPU-based 3D cryo-EM reconstruction with key-value streams: poster. 421-422
- Athena Elafrou, Georgios I. Goumas, Nectarios Koziris: BASMAT: bottleneck-aware sparse matrix-vector multiplication auto-tuning on GPGPUs. 423-424
- Avner Elizarov, Guy Golan-Gueta, Erez Petrank: LOFT: lock-free transactional data structures. 425-426
- Xiang Ni, Scott Schneider, Raju Pavuluri, Jonathan Kaus, Kun-Lung Wu: Automated multi-dimensional elasticity for streaming runtimes: poster. 427-428
- Marcelo Novaes, Vinicius Petrucci, Abdoulaye Gamatié, Fernando Magno Quintão Pereira: Compiler-assisted adaptive program scheduling in big.LITTLE systems: poster. 429-430
- Chanyoung Oh, Zhen Zheng, Xipeng Shen, Jidong Zhai, Youngmin Yi: GOPipe: a granularity-oblivious programming framework for pipelined stencil executions on GPU. 431-432
- Tim Kaler, Brian Wheatman, Sarah Wooders: High-throughput image alignment for connectomics using frugal snap judgments: poster. 433-434
- Xiaolong Xie, Yun Liang, Xiuhong Li, Wei Tan: CuLDA_CGS: solving large-scale LDA problems on GPUs. 435-436
- Sharanyan Srikanthan, Princeton Ferro, Sayak Chakraborti, Sandhya Dwarkadas: Managing application parallelism via parallel efficiency regulation: poster. 437-438
- Emmanuelle Anceaume, Antonella Del Pozzo, Romaric Ludinard, Maria Potop-Butucaru, Sara Tucci Piergiovanni: Blockchain abstract data type: poster. 439-440
- Ivo Jimenez, Jay F. Lofstead, Carlos Maltzahn: Creating repeatable, reusable experimentation pipelines with popper: tutorial. 441-442
- Travis Carlson, Eric Van Wyk: Building parallel programming language constructs in the AbleC extensible C compiler framework: a PPoPP tutorial. 443-446
- Yihan Sun, Guy E. Blelloch: Implementing parallel and concurrent tree structures. 447-450
- Frank Mueller, Greg Byrd, Patrick Dreher: Programming quantum computers: a primer with IBM Q and D-Wave exercises. 451
- Dhabaleswar K. Panda, Ammar Ahmad Awan, Hari Subramoni: High performance distributed deep learning: a beginner's guide. 452-454
- David Beckingsale, Richard D. Hornung, Tom Scogland, Arturo Vargas: Performance portable C++ programming with RAJA. 455-456