IPDPS 2023: St. Petersburg, FL, USA - Workshops
- IEEE International Parallel and Distributed Processing Symposium, IPDPS 2023 - Workshops, St. Petersburg, FL, USA, May 15-19, 2023. IEEE 2023, ISBN 979-8-3503-1199-0
- Anne C. Elster, Jan Christian Meyer:
Message from the HCW 2023 Technical Program Committee Co-Chairs. 4
- Peter M. Kogge:
HCW 2023 Keynote Heterogeneity and the Problem that Shall Not Be Named. 5
- Alok V. Kamatar, Ryan D. Friese, Roberto Gioiosa:
A Task Based Approach for Co-Scheduling Ensemble Workloads on Heterogeneous Nodes. 6-15
- Joshua Mack, Serhan Gener, Md Sahil Hassan, H. Umut Suluhan, Ali Akoglu:
CEDR-API: Productive, Performant Programming of Domain-Specific Embedded Systems. 16-25
- Anara Kozhokanova, Bo Wang, Christian Terboven, Matthias S. Müller:
Power-aware Computing with Optane Persistent Memory Modules. 26-31
- Logan T. Ward, J. Gregory Pauloski, Valérie Hayot-Sasson, Ryan Chard, Yadu N. Babuji, Ganesh Sivaraman, Sutanay Choudhury, Kyle Chard, Rajeev Thakur, Ian T. Foster:
Cloud Services Enable Efficient AI-Guided Simulation Workflows across Heterogeneous Resources. 32-41
- Brett Foster, Shubbhi Taneja, Joseph B. Manzano, Kevin Barker:
Evaluating Energy Efficiency of GPUs using Machine Learning Benchmarks. 42-50
- Rui Alves, José Rufino:
Remote Execution of OpenCL and SYCL Applications via rOpenCL. 51-60
- Benjamin Welte, Joseph Zambreno:
An FPGA Implementation of SipHash. 63-70
- Filippo Carloni, Leonardo Panseri, Davide Conficconi, Mattia Sironi, Marco D. Santambrogio:
Enabling Efficient Regular Expression Matching at the Edge through Domain-Specific Architectures. 71-74
- Alberto Zeni, Emanuele Del Sozzo, Beatrice Branchini, Lorenzo Di Tucci, Marco D. Santambrogio:
New Solution For a (Scaff)Old Problem: an FPGA Approach. 75-78
- Can Aknesil, Elena Dubrova, Niklas Lindskog, Håkan Englund:
Is Your FPGA Transmitting Secrets: Covert Antennas from Interconnect. 79-84
- Yuhao Liu, Shubham Rai, Salim Ullah, Akash Kumar:
NetPU-M: a Generic Reconfigurable Neural Network Accelerator Architecture for MLPs. 85-92
- Shaarada D. Yamini, Mirishkar Sai Ganesh, Anil Kumar Vuppala, Suresh Purini:
Hardware Accelerator for Transformer based End-to-End Automatic Speech Recognition System. 93-100
- Seongyoung Kang, Sang-Woo Jun:
Near-Storage Accelerator for Bulk Graph Ingestion. 101-104
- Ikumi Okubo, Keisuke Sugiura, Hiroki Kawakami, Hiroki Matsutani:
A Lightweight Transformer Model using Neural ODE for FPGAs. 105-112
- Mizuki Yasuda, Keisuke Sugiura, Ryuto Kojima, Hiroki Matsutani:
An Edge-Server Partitioning Method for 3D LiDAR SLAM on FPGAs. 113-120
- Jelle Biesmans, Francesco Regazzoni, Nele Mentens:
Application-specific FPGAs: cryptographic agility through customized reconfigurable architectures. 121-124
- Quentin Ducasse, Pascal Cotret, Loïc Lagadec:
JIT Compiler Security through Low-Cost RISC-V Extension. 125-128
- Sergiu Mosanu, Joshua Fixelle, Mohammad Nazmus Sakib, Kevin Skadron, Mircea Stan:
FreezeTime: Towards System Emulation through Architectural Virtualization. 129-136
- Anastasis Togkousidis, Olga Chernomor, Alexandros Stamatakis:
Parallel Inference of Phylogenetic Stands with Gentrius. 139-148
- Neftali Watkinson, Divya Devineni, Victor Joe, Tony Givargis, Alexandru Nicolau, Alexander V. Veidenbaum:
Using Hyperdimensional Computing to Extract Features for the Detection of Type 2 Diabetes. 149-156
- Tazin Rahman, Oieswarya Bhowmik, Ananth Kalyanaraman:
An Efficient Parallel Sketch-based Algorithm for Mapping Long Reads to Contigs. 157-166
- Doru-Thom Popovici, Muaaz Gul Awan, Giulia Guidi, Rob Egan, Steven A. Hofmeyr, Leonid Oliker, Katherine A. Yelick:
Designing Efficient SIMD Kernels for High Performance Sequence Alignment. 167-176
- Pradeep Kumar, Sarah Revillar:
G-Bench: Fair Benchmarking to Support Innovations in Streaming Graph Systems. 179-188
- Michael Eydenberg, Mark Plagge, Siva Rajamanickam:
A Comparison of Spectral and Spatial Graph Convolutional Neural Network Kernels Using GraphSAGE-Sparse. 189-198
- Israt Nisa, Minjie Wang, Da Zheng, Qiang Fu, Ümit V. Çatalyürek, George Karypis:
Optimizing Irregular Dense Operators of Heterogeneous GNN Models on GPU. 199-206
- Benjamin Brock, Scott McMillan, Aydin Buluç, Timothy G. Mattson, José E. Moreira:
C++ and Interoperability Between Libraries: The GraphBLAS C++ Specification. 207-215
- Alberto Scolari, Albert-Jan Yzelman:
Effective implementation of the High Performance Conjugate Gradient benchmark on GraphBLAS. 216-225
- Seunghwa Kang, Chuck Hastings, Joe Eaton, Brad Rees:
cuGraph C++ primitives: vertex/edge-centric building blocks for parallel graph computing. 226-229
- Luca Cappelletti, Tommaso Fontana, Justin T. Reese, David A. Bader:
Billion-scale Detection of Isomorphic Nodes. 230-233
- Afton Geil, Serban D. Porumbescu, John D. Owens:
Maximum Clique Enumeration on the GPU. 234-244
- Alina Lazar, Virginia Niculescu, David P. Bunde:
Peachy Parallel Assignments (EduPar 2023). 248-255
- W. Feng, L. Davis-Wallace:
Parallel Programming with Pictures: Choosing Your Own Adventure. 256-261
- Aaron Jezghani, Jeffrey Young, Will Powell, Ronald Rahaman, J. Eric Coulter:
Future Computing with the Rogues Gallery. 262-269
- Ali Mokhtari, Drake Rawls, Tony Huynh, Jeremiah Green, Mohsen Amini Salehi:
E2C: A Visual Simulator to Reinforce Education of Heterogeneous Computing Systems. 270-277
- Mary Smith, Srishti Srivastava:
Introducing Parallel and Distributed Computing concepts through the use of Flashcards and a Card Game. 278-283
- Masaru Uchida, Hideyuki Kawashima:
Making Lock Manager Concurrent for Deterministic Database. 286-290
- Jie Wu:
Invited Paper: On the Cost-Optimal Parallel Solution of the Majority Problem. 291-294
- Zheming Jin, Jeffrey S. Vetter:
Understanding SYCL Portability for Pseudorandom Number Generation: a Case Study with Gene-Expression Connectivity Mapping. 295-298
- Somchart Fugkeaw:
Implementing An Outsourced Dual-Proxy Signing and Decryption Scheme in Mobile Cloud Computing. 299-307
- Tesshu Hanaka, Hirotaka Ono, Kosuke Sugiyama:
Solving Distance-constrained Labeling Problems for Small Diameter Graphs via TSP*. 308-313
- Koji Nakano, Daisuke Takafuji, Yasuaki Ito, Takashi Yazane, Junko Yano, Shiro Ozaki, Ryota Katsuki, Rie Mori:
Diverse Adaptive Bulk Search: a Framework for Solving QUBO Problems on Multiple GPUs. 314-325
- Xiang Fu, Hao Tang, Huimin Liao, Xin Huang, Wubiao Xu, Shiman Meng, Weiping Zhang, Luanzheng Guo, Kento Sato:
A High-dimensional Algorithm-Based Fault Tolerance Scheme. 326-330
- Niklas Bartelheimer, Zhaobin Zhu, Sarah Neuwirth:
Toward a Modular Workflow for Network Performance Characterization. 331-334
- Henry Zhu, Junyong Zhao, Nik Sultana:
A Domain-Specific Language for Reconfigurable, Distributed Software Architecture. 335-344
- Shiyao Xie, Akinori Miura, Kenji Ono:
Error-bounded Scalable Parallel Tensor Train Decomposition. 345-353
- Benjamin Michalowicz, Kaushik Kandadi Suresh, Bharath Ramesh, Aamir Shafi, Hari Subramoni, Mustafa Abduljabbar, Dhabaleswar K. Panda:
In-Depth Evaluation of a Lower-Level Direct-Verbs API on InfiniBand-based Clusters: Early Experiences. 354-363
- Zheming Jin, Jeffrey S. Vetter:
Understanding Performance Portability of SYCL Kernels: A Case Study with the All-Pairs Distance Calculation in Bioinformatics on GPUs. 366-372
- William F. Godoy, Pedro Valero-Lara, T. Elise Dettling, Christian Trefftz, Ian Jorquera, Thomas Sheehy, Ross G. Miller, Marc González Tallada, Jeffrey S. Vetter, Valentin Churavy:
Evaluating performance and portability of high-level programming models: Julia, Python/Numba, and Kokkos on exascale nodes. 373-382
- Probir Roy, Birhanu Eshete, Pengfei Su:
Designing Secure Performance Metrics for Last Level Cache. 383-392
- Daniel Barry, Heike Jagode, Anthony Danalis, Jack J. Dongarra:
Memory Traffic and Complete Application Profiling with PAPI Multi-Component Measurements. 393-402
- Srinivasan Subramaniyan, Xiaorui Wang:
OptiCPD: Optimization For The Canonical Polyadic Decomposition Algorithm on GPUs. 403-412
- Michael Wilkins, Garrett Weil, Luke Arnold, Nikos Hardavellas, Peter A. Dinda:
Evaluating Functional Memory-Managed Parallel Languages for HPC using the NAS Parallel Benchmarks. 413-422
- Sebastian Kreutzer, Christian Iwainsky, Marta Garcia-Gasulla, Victor Lopez, Christian H. Bischof:
Runtime-Adaptable Selective Performance Instrumentation. 423-432
- Mahdi Abbaszadeh, Tarek S. Abdelrahman, Reza Azimi, Tomasz S. Czajkowski, Maziar Goudarzi:
Efficient Data Streaming for a Tightly-Coupled Coarse-Grained Reconfigurable Array. 435-443
- Rui Rodrigues de Mello Junior, Gabriel Antoine Louis Paillard, Leandro Santiago de Araújo, Pedro C. Diniz, Felipe M. G. França:
GSink - A Runtime for Gamma Programs and its CGRA Mapping Proposal. 444-451
- Boma A. Adhi, Carlos Cortes, Emanuele Del Sozzo, Tomohiro Ueno, Yiyu Tan, Takuya Kojima, Artur Podobas, Kentaro Sano:
Less for More: Reducing Intra-CGRA Connectivity for Higher Performance and Efficiency in HPC. 452-459
- Artur Podobas:
Q2Logic: A Coarse-Grained FPGA Overlay targeting Schrödinger Quantum Circuit Simulations. 460-467
- Omar Ragheb, Rami Beidas, Jason Helge Anderson:
Statically Scheduled vs. Elastic CGRA Architectures: Impact on Mapping Feasibility. 468-475
- Maurizio Palesi, Enrico Russo, Abhijit Das, John Jose:
Wireless enabled Inter-Chiplet Communication in DNN Hardware Accelerators. 477-483
- Christopher Harrison, Henish Balu, Inês Dutra:
Predicting Hard Disk Drive Faults, Failures and Associated Misbehavior's. 484-493
- Peter Love:
Q-CASA Keynote Speaker Quantum Simulation from Quantum Chemistry to Quantum Field Theory. 496
- Eliot Kapit:
Q-CASA Invited Speaker Structured noise in quantum computers: an obstacle or an opportunity? 497
- Itay Hen:
Q-CASA Invited Speaker Simulating Hamiltonian Dynamics with the Off-diagonal Series Expansion. 498
- Robert Loredo, Fahad Saeed:
Q-CASA Invited Speakers Quantum-Centric Supercomputing Strategies for Neuroscience problems: Challenges and Progress. 499
- Robert Basili, Wenyang Qian, Shuo Tang, Austin Castellino, Mary Eshaghian-Wilner, Ashfaq Khokhar, Glenn R. Luecke, James P. Vary:
Q-CASA Invited Speaker Performance Evaluations of Noisy Approximate Quantum Fourier Arithmetic with Signed and Unsigned Integers. 500
- Taoreed Akinola, Xiangfang Li, Richard Wilkins, Pamela Obiomon, Lijun Qian:
Inverse Quantum Fourier Transform Inspired Algorithm for Unsupervised Image Segmentation. 501-508
- Akihiro Hayashi, Austin Adams, Jeffrey Young, Alexander J. McCaskey, Eugene F. Dumitrescu, Vivek Sarkar, Thomas M. Conte:
Enabling Multi-threading in Heterogeneous Quantum-Classical Programming Models. 509-516
- Daniel T. Chen, Ethan H. Hansen, Xinpeng Li, Vinooth Kulkarni, Vipin Chaudhary, Bin Ren, Qiang Guan, Sanmukh Kuppannagari, Ji Liu, Shuai Xu:
Efficient Quantum Circuit Cutting by Neglecting Basis Elements. 517-523
- Maximilian Jakob Heer, Emanuele Del Sozzo, Keisuke Fujii, Kentaro Sano:
Novel Union-Find-based Decoders for Scalable Quantum Error Correction on Systolic Arrays. 524-533
- Ankit Khandelwal, M. Girish Chandra:
Quantum-Enhanced Topological Data Analysis: A Peep from an Implementation Perspective. 534-540
- Wenyang Qian, Robert Basili, Mary Eshaghian-Wilner, Ashfaq Khokhar, Glenn R. Luecke, James P. Vary:
Comparative study on the variations of quantum approximate optimization algorithms to the Traveling Salesman Problem. 541-551
- Max Scheerer, Jonas Klamroth, Simon Garhofer, Florian Knäble, Oliver Denninger:
Experiences in Quantum Software Engineering. 552-559
- Pedro Valero-Lara:
(AsHES) 2023 Keynote Speaker Agnostic Programing: "Less is More". 563
- Arijit Bhattacharjee, Christopher S. Daley, Ali Jannesari:
OpenMP Offload Features and Strategies for High Performance across Architectures and Compilers. 564-573
- Dimitrios Galanopoulos, Panagiotis Mpakos, Petros Anastasiadis, Nectarios Koziris, Georgios I. Goumas:
Invited paper: An Artificial Matrix Generator for Multi-platform SpMV Performance Analysis. 574-577
- Mert Side, Brody Williams, John D. Leidel, Jonathan Woodruff, Simon W. Moore, Yong Chen:
Towards xBGAS on CHERI: Supporting a Secure Global Memory. 578-581
- Ronald M. Caplan, Miko M. Stulajter, Jon A. Linker:
Acceleration of a production Solar MHD code with Fortran standard parallelism: From OpenACC to 'do concurrent'. 582-590
- M. Emin Ozturk, Omid Asudeh, Gerald Sabin, P. Sadayappan, Aravind Sukumaran-Rajam:
A Performance Portability Study Using Tensor Contraction Benchmarks. 591-600
- Md Abdul Motaleb Faysal, Maximilian H. Bremer, Shaikh Arifuzzaman, Doru Popovici, John Shalf, Cy P. Chan:
Fast Community Detection in Graphs with Infomap Method using Accelerated Sparse Accumulation. 601-610
- Amanda Bienz:
Invited Paper: Benchmarking and Optimizing Data Movement on Emerging Heterogeneous Architectures. 611-614
- Amit Samanta, Faraz Ahmed, Lianjie Cao, Ryan Stutsman, Puneet Sharma:
Persistent Memory-Aware Scheduling for Serverless Workloads. 615-621
- Zhaobin Zhu, Niklas Bartelheimer, Sarah Neuwirth:
An Empirical Roofline Model for Extreme-Scale I/O Workload Analysis. 622-627
- Md. Kamal Hossain Chowdhury, Houjun Tang, Jean Luca Bez, Purushotham V. Bangalore, Suren Byna:
Efficient Asynchronous I/O with Request Merging. 628-636
- Sajid Ali, Steven Calvez, Philip H. Carns, Matthieu Dorier, Pengfei Ding, James Kowalkowski, Robert Latham, Andrew Norman, Marc F. Paterno, Robert B. Ross, Saba Sehrish, Shane Snyder, Jérome Soumagne:
HEPnOS: a Specialized Data Service for High Energy Physics Analysis. 637-646
- Sabine Roller, Peter Strazdins, Raphaël Couturier, Neda Ebrahimi Pour, Suzanne Michelle Shontz, Thomas Rauber, Gudula Rünger, Laurence T. Yang:
Message from the PDSEC-22 Workshop Chairs. 649-650
- Richard Angersbach, Sebastian Kuckuk, Harald Köstler:
Generating Coupling Interfaces for Multiphysics Simulations with ExaStencils and waLBerla. 651-661
- Pratik Nayak, Hartwig Anzt:
Utilizing batched solver ideas for efficient solution of non-batched linear systems. 662-665
- Florian Fey, Alexander Gerwing, Sergei Gorlatch:
GPU-Parallelized Simulation of Optical Forces on Nanoparticles in a Fluid Medium. 666-672
- Shinya Hashinoki, Satoshi Ohshima, Takahiro Katagiri, Toru Nagai, Tetsuya Hoshino:
Implementation of Radio Wave Propagation using RT Cores and Consideration of Programming Models. 673-681
- Patrick Diehl, Gregor Daiß, Kevin A. Huck, Dominic Marcello, Sagiv Shiber, Hartmut Kaiser, Dirk Pflüger:
Simulating Stellar Merger using HPX/Kokkos on A64FX on Supercomputer Fugaku. 682-691
- Mitsuo Yokokawa, Yuki Yamane, Kenta Yamaguchi, Takashi Soga, Taiki Matsumoto, Akihiro Musa, Kazuhiko Komatsu, Takashi Ishihara, Hiroaki Kobayashi:
I/O Performance Evaluation of a Memory-Saving DNS Code on SX-Aurora TSUBASA. 692-696
- Luanzheng Guo, Gokcen Kestor:
On Higher-performance Sparse Tensor Transposition. 697-701
- Hiroaki Kobayashi:
iWAPT2023 Keynote Speaker QC & HPC hybrid computing for simulation & data-analysis hybrid applications. 704
- Prasanna Balaprakash:
iWAPT2023 Invited Speaker Optimizing HPC Systems for Scientific Applications: Machine Learning Approaches to Performance Tuning and Anomaly Detection. 705
- Moto Satake, Keichi Takahashi, Yoichi Shimomura, Hiroyuki Takizawa:
Balancing exploitation and exploration in parallel Bayesian optimization under computing resource constraint. 706-713
- Tao Yan, Qingguo Xu, Jiyu Luo, Jingwei Sun, Guangzhong Sun:
Scalable Tracing of MPI Events and Performance Metrics. 714-723
- Jacob O. Tørring, Ben van Werkhoven, Filip Petrovic, Floris-Jan Willemsen, Jiri Filipovic, Anne C. Elster:
Towards a Benchmarking Suite for Kernel Tuners. 724-733
- Christodoulos Stylianou, Michèle Weiland:
Optimizing Sparse Linear Algebra Through Automatic Format Selection and Machine Learning. 734-743
- Stijn Heldens, Ben van Werkhoven:
Kernel Launcher: C++ Library for Optimal-Performance Portable CUDA Applications. 744-753
- Christopher A. Metz, Mehran Goli, Rolf Drechsler:
Fast and Accurate: Machine Learning Techniques for Performance Estimation of CNNs for GPGPUs. 754-760
- Takeya Yamada, Hiroki Matsutani:
A Lightweight Concept Drift Detection Method for On-Device Learning on Resource-Limited Edge Devices. 761-768
- Hugo Hadjur, Doreid Ammar, Laurent Lefèvre:
Services Orchestration at the Edge and in the Cloud for Energy-Aware Precision Beekeeping Systems. 769-776
- Josef Hammer, Hermann Hellwagner:
Distributed On-Demand Deployment for Transparent Access to 5G Edge Computing Services. 777-784
- Prasanna Balaprakash:
Scalable Automated Design and Development of Deep Neural Networks for Scientific and Engineering Applications. 787
- Angela Dalton:
Scaling up Deep Learning: Efficiency in AI. 788
- Supriyo Chakraborty:
On Distributed Training of Foundation Models: Challenges and Observations. 789
- Hadjer Benmeziane, Amine Ziad Ounnoughene, Imane Hamzaoui, Younes Bouhadjar:
Skip Connections in Spiking Neural Networks: An Analysis of Their Effect on Network Training. 790-794
- Tianle Wang, Sudip K. Seal, Ramakrishnan Kannan, Cristina Garcia-Cardona, Thomas Proffen, Shantenu Jha:
A Parallel Machine Learning Workflow for Neutron Scattering Data Analysis. 795-798
- Seungjun Lee, Miri Yu, Daegun Yoon, Sangyoon Oh:
Can hierarchical client clustering mitigate the data heterogeneity effect in federated learning? 799-808
- Haoran Lin, Xinwei Qin, Shuang Qiu, Yi Sun, Zekun Yin, Weiguo Liu:
Ray-based Elastic Distributed Data Parallel Framework with Distributed Data Cache. 809-817
- Antonios Karteris, Georgios Tzanos, Lazaros Papadopoulos, Dimitrios Soudris:
Detection of Cyber Security Threats through Social Media Platforms. 820-823
- Bhashithe Abeysinghe, Rajshekhar Sunderraman:
Inferring stances of silent-participants in Twitter chatter using Label Propagation. 824-831