


33rd SBAC-PAD 2021: Belo Horizonte, Brazil
- 33rd IEEE International Symposium on Computer Architecture and High Performance Computing, SBAC-PAD 2021, Belo Horizonte, Brazil, October 26-29, 2021. IEEE 2021, ISBN 978-1-6654-4301-2

Session 1: Accelerated Computing
- Hao Zhou, David Troendle, Byunghyun Jang:
DACHash: A Dynamic, Cache-Aware and Concurrent Hash Table on GPUs. 1-10
- Raúl Taranco, José-María Arnau, Antonio González:
A Low-Power Hardware Accelerator for ORB Feature Extraction in Self-Driving Cars. 11-21
- Dominik Ernst, Georg Hager, Matthias Knorr, Gerhard Wellein, Markus Holzer:
Opening the Black Box: Performance Estimation during Code Generation for GPUs. 22-32
- Jude Haris, Perry Gibson, José Cano, Nicolas Bohm Agostini, David R. Kaeli:
SECDA: Efficient Hardware/Software Co-Design of FPGA-based DNN Accelerators for Edge Inference. 33-43

Session 2: Memory Systems
- Catalina Munoz Morales, Bruno C. Honorio, Alexandro Baldassin, Guido Araujo:
Improving Phased Transactional Memory via Commit Throughput and Capacity Estimation. 44-53
- Jonathas Silveira, Lucas Wanner:
Design and evaluation of associative processing kernels. 64-73
- João Vicente Souto, Márcio Castro, Pedro Henrique Penna:
A Task-based Execution Engine for Distributed Operating Systems Tailored to Lightweight Manycores with Limited On-Chip Memory. 74-83

Session 3: Computer Architecture
- Rafael C. F. Sousa, Byungmin Jung, Jaehwa Kwak, Michael Frank, Guido Araujo:
Efficient Tensor Slicing for Multicore NPUs using Memory Burst Modeling. 84-93
- Ehsan Atoofian:
Sparsity-aware Power Gating for Tensor Cores. 94-103
- Vanderson Martins do Rosario, Raphael Zinsly, Sandro Rigo, Edson Borin:
Employing Simulation to Facilitate the Design of Dynamic Binary Translators. 104-113
- Hikaru Takayashiki, Masayuki Sato, Kazuhiko Komatsu, Hiroaki Kobayashi:
Register Flush-free Runahead Execution for Modern Vector Processors. 114-125

Session 4: Scheduling and Distributed Systems
- Anne Benoit, Louis-Claude Canon, Redouane Elghazi, Pierre-Cyrille Héam:
Shelf schedules for independent moldable tasks to minimize the energy consumption. 126-136
- Zeina Houmani, Daniel Balouek-Thomert, Eddy Caron, Manish Parashar:
Enabling microservices management for Deep Learning applications across the Edge-Cloud Continuum. 137-146
- Michael Guilherme Jordan, Guilherme Korol, Mateus Beck Rutzig, Antonio Carlos Schneider Beck:
FAIR: Fully-Adaptive Framework for Improving Resource Provisioning in Collaborative CPU-FPGA Cloud Environments. 147-156
- André Ramos Carneiro, Jean Luca Bez, Carla Osthoff, Lucas Mello Schnorr, Philippe O. A. Navaux:
HPC Data Storage at a Glance: The Santos Dumont Experience. 157-166
- Wilton Jaciel Loch, Guilherme Piêgas Koslovski:
Sparbit: a new logarithmic-cost and data locality-aware MPI Allgather algorithm. 167-176

Session 5: Applications
- Marco Barbone, Andreas Wetscherek, Thomas Yung, Uwe Oelfke, Wayne Luk, Georgi Gaydadjiev:
Efficient Online 4D Magnetic Resonance Imaging. 177-187
- Lucas Reis, Lucas Wanner:
Functional Approximation and Approximate Parallelization with the ACCEPT compiler. 188-197
- Gangyi Zhu, Gagan Agrawal:
Sampling-based Sparse Format Selection on GPUs. 198-208
- Erfan Bank Tavakoli, Michael Riera, Masudul Hassan Quraishi, Fengbo Ren:
FSCHOL: An OpenCL-based HPC Framework for Accelerating Sparse Cholesky Factorization on FPGAs. 209-220
