


28th HiPC 2021: Bengaluru, India
- 28th IEEE International Conference on High Performance Computing, Data, and Analytics, HiPC 2021, Bengaluru, India, December 17-20, 2021. IEEE 2021, ISBN 978-1-6654-1016-8
- Adam Belay:
Improving Efficiency and Performance Through Faster Scheduling Mechanisms. xxii
- Jingren Zhou:
Towards an Integral System for Processing Big Graphs at Scale. xxi
- Chi Zhang, Sanmukh Rao Kuppannagari, Viktor K. Prasanna:
Parallel Actors and Learners: A Framework for Generating Scalable RL Implementations. 1-10
- Michela Taufer:
AI4IO: A Suite of AI-Based Tools for IO-Aware HPC Resource Management. 1
- Amal Gueroudji, Julien Bigot, Bruno Raffin:
DEISA: Dask-Enabled In Situ Analytics. 11-20
- A. Srinivas Reddy, P. Krishna Reddy, Anirban Mondal, U. Deva Priyakumar:
A Model of Graph Transactional Coverage Patterns with Applications to Drug Discovery. 21-30
- Eliza Wszola, Martin Jaggi, Markus Püschel:
Faster Parallel Training of Word Embeddings. 31-41
- Nariaki Tateiwa, Yuji Shinano, Keiichiro Yamamura, Akihiro Yoshida, Shizuo Kaji, Masaya Yasuda, Katsuki Fujisawa:
CMAP-LAP: Configurable Massively Parallel Solver for Lattice Problems. 42-52
- Hwajung Kim, Jiwoo Bang, Dong Kyu Sung, Hyeonsang Eom, Heon Y. Yeom, Hanul Sung:
MulConn: User-Transparent I/O Subsystem for High-Performance Parallel File Systems. 53-62
- Ta-Yang Wang, William Chang, Ajitesh Srivastava, Rajgopal Kannan, Viktor K. Prasanna:
Monte Carlo Tree Search for Task Mapping onto Heterogeneous Platforms. 63-70
- Johannes Langguth, Ioannis Panagiotas, Bora Uçar:
Shared-memory implementation of the Karp-Sipser kernelization process. 71-80
- Yuan Meng, Sanmukh R. Kuppannagari, Rajgopal Kannan, Viktor K. Prasanna:
How to Avoid Zero-Spacing in Fractionally-Strided Convolution? A Hardware-Algorithm Co-Design Methodology. 81-90
- Jiawen Guan, Rui Fan:
PPBT: A High Performance Parallel Search Tree. 91-100
- Esragul Korkmaz, Mathieu Faverge, Pierre Ramet, Grégoire Pichon:
Deciding Non-Compressible Blocks in Sparse Direct Solvers using Incomplete Factorization. 101-110
- Athreya Chandramouli, Sayantan Jana, Kishore Kothapalli:
Efficient Parallel Algorithms for Computing Percolation Centrality. 111-120
- André Weißenberger, Bertil Schmidt:
Accelerating JPEG Decompression on GPUs. 121-130
- Kai Keller, Adrián Cristal Kestelman, Leonardo Bautista-Gomez:
Towards Zero-Waste Recovery and Zero-Overhead Checkpointing in Ensemble Data Assimilation. 131-140
- Archie Powell, K. Choudry, Arun Prabhakar, I. Z. Reguly, Dario Amirante, Stephen A. Jarvis, Gihan R. Mudalige:
Predictive Analysis of Large-Scale Coupled CFD Simulations with the CPX Mini-App. 141-151
- Akihiro Tabuchi, Koichi Shirahata, Masafumi Yamazaki, Akihiko Kasagi, Takumi Honda, Kouji Kurihara, Kentaro Kawakami, Tsuguchika Tabaru, Naoto Fukumoto, Akiyoshi Kuroda, Takaaki Fukai, Kento Sato:
The 16,384-node Parallelism of 3D-CNN Training on An Arm CPU based Supercomputer. 152-161
- Luk Burchard, Xing Cai, Johannes Langguth:
iPUG for Multiple Graphcore IPUs: Optimizing Performance and Scalability of Parallel Breadth-First Search. 162-171
- K. P. Arun, Debadatta Mishra, Biswabandan Panda:
Empirical Analysis of Architectural Primitives for NVRAM Consistency. 172-181
- Kazuaki Matsumura, Simon Garcia de Gonzalo, Antonio J. Peña:
JACC: An OpenACC Runtime Framework with Kernel-Level and Multi-GPU Parallelization. 182-191
- Oded Green, Zhihui Du, Sanyamee Patel, Zehui Xie, Hang Liu, David A. Bader:
Anti-Section Transitive Closure. 192-201
- Xiaojing An, Ümit V. Çatalyürek:
Column-Segmented Sparse Matrix-Matrix Multiplication on Multicore CPUs. 202-211
- Arjun Gopala Krishnan, Dhrubajyoti Goswami:
Multi-Stage Memory Efficient Strassen's Matrix Multiplication on GPU. 212-221
- Md Nahid Newaz, Md Atiqul Mollah:
Optimizing k-path selection for randomized interconnection networks. 222-231
- Siqin Liu, Avinash Karanth:
Dynamic Voltage and Frequency Scaling to Improve Energy-Efficiency of Hardware Accelerators. 232-241
- Zhe Wang, Pradeep Subedi, Matthieu Dorier, Philip E. Davis, Manish Parashar:
Adaptive Placement of Data Analysis Tasks For Staging Based In-Situ Processing. 242-251
- Qihan Wang, Wei Niu, Li Chen, Ruoming Jin, Bin Ren:
HEALS: A Parallel eALS Recommendation System on CPU/GPU Heterogeneous Platforms. 252-261
- Xiang Li, Gagan Agrawal:
Shrinking Sample Search Algorithm for Automatic Tuning of GPU Kernels. 262-271
- Bharath Ramesh, Jahanzeb Maqbool Hashmi, Shulei Xu, Aamir Shafi, Seyedeh Mahdieh Ghazimirsaeed, Mohammadreza Bayatpour, Hari Subramoni, Dhabaleswar K. Panda:
Towards Architecture-aware Hierarchical Communication Trees on Modern HPC Systems. 272-281
- Yuntian He, Saket Gurukar, Pouya Kousha, Hari Subramoni, Dhabaleswar K. Panda, Srinivasan Parthasarathy:
DistMILE: A Distributed Multi-Level Framework for Scalable Graph Embedding. 282-291
- Jinlai Xu, Balaji Palanisamy:
Model-based Reinforcement Learning for Elastic Stream Processing in Edge Computing. 292-301
- Kaushik Kandadi Suresh, Bharath Ramesh, Chen-Chun Chen, Seyedeh Mahdieh Ghazimirsaeed, Mohammadreza Bayatpour, Aamir Shafi, Hari Subramoni, Dhabaleswar K. Panda:
Layout-aware Hardware-assisted Designs for Derived Data Types in MPI. 302-311
- Xu T. Liu, Jesun Firoz, Andrew Lumsdaine, Cliff A. Joslyn, Sinan Aksoy, Brenda Praggastis, Assefaw H. Gebremedhin:
Parallel Algorithms for Efficient Computation of High-Order Line Graphs of Hypergraphs. 312-321
- Sunwoo Lee, Qiao Kang, Kewei Wang, Jan Balewski, Alex Sim, Ankit Agrawal, Alok N. Choudhary, Peter Nugent, Kesheng Wu, Wei-keng Liao:
Asynchronous I/O Strategy for Large-Scale Deep Learning Applications. 322-331
- Srinivasan Ramesh, Robert B. Ross, Matthieu Dorier, Allen D. Malony, Philip H. Carns, Kevin A. Huck:
SYMBIOMON: A High-Performance, Composable Monitoring Service. 332-342
- Ke Fan, Duong Hoang, Steve Petruzza, Thomas Gilray, Valerio Pascucci, Sidharth Kumar:
Load-balancing Parallel I/O of Compressed Hierarchical Layouts. 343-353
- Madhav Poudel, Michael Gowanlock:
CUDA-DClust+: Revisiting Early GPU-Accelerated DBSCAN Clustering Designs. 354-363
- Leonel Toledo, Pedro Valero-Lara, Jeffrey S. Vetter, Antonio J. Peña:
Static Graphs for Coding Productivity in OpenACC. 364-369
- Madhav Aggarwal, Bingyi Zhang, Viktor K. Prasanna:
Performance of Local Push Algorithms for Personalized PageRank on Multi-core Platforms. 370-375
- Jacob Tronge, Patricia Grubel, Timothy Randles, Quincy Wofford, Rusty Davis, Steven Anaya, Qiang Guan:
BEE Orchestrator: Running Complex Scientific Workflows on Multiple Systems. 376-381
- Hércules Cardoso da Silva, Marco Aurelio Stefanes, Vinícius Capistrano:
OpenACC Multi-GPU Approach for WSM6 Microphysics. 382-387
- Nick Sarkauskas, Mohammadreza Bayatpour, Tu Tran, Bharath Ramesh, Hari Subramoni, Dhabaleswar K. Panda:
Large-Message Nonblocking MPI_Iallgather and MPI_Ibcast Offload via BlueField-2 DPU. 388-393
- Yuanjian Liu, Sheng Di, Kai Zhao, Sian Jin, Cheng Wang, Kyle Chard, Dingwen Tao, Ian T. Foster, Franck Cappello:
Optimizing Multi-Range based Error-Bounded Lossy Compression for Scientific Datasets. 394-399
- Jiwoo Bang, Chungyong Kim, Kesheng Wu, Alex Sim, Suren Byna, Hanul Sung, Hyeonsang Eom:
An In-Depth I/O Pattern Analysis in HPC Systems. 400-405
- Anshuj Garg, Purushottam Kulkarni, Umesh Bellur, Sriram Yenamandra:
FaaSter: Accelerated Functions-as-a-Service with Heterogeneous GPUs. 406-411
- Salman Salloum, Joshua Zhexue Huang:
RSP-Hist: Approximate Histograms for Big Data Exploration on Hadoop Clusters. 412-417
- Shuangsheng Lou, Gagan Agrawal:
A Programming API Implementation for Secure Data Analytics Applications with Homomorphic Encryption on GPUs. 418-423
- Jia Guo, Radu Teodorescu, Gagan Agrawal:
A Fused Inference Design for Pattern-Based Sparse CNN on Edge Devices. 424-429
- Edigley Fraga, Ana Cortés, Tomàs Margalef, Porfidio Hernández:
Cloud-Based Urgent Computing for Forest Fire Spread Prediction under Data Uncertainties. 430-435
- Mostafa Eghbali Zarch, Reece Neff, Michela Becchi:
Exploring Thread Coarsening on FPGA. 436-441
- John Ravi, Tri Nguyen, Huiyang Zhou, Michela Becchi:
PILOT: a Runtime System to Manage Multi-tenant GPU Unified Memory Footprint. 442-447
- S. Chandra Sekhara Rao, Rabia Kamra:
A computational technique for parallel solution of diagonally dominant banded linear systems. 448-453
