


ISPASS 2025: Ghent, Belgium
- IEEE International Symposium on Performance Analysis of Systems and Software, ISPASS 2025, Ghent, Belgium, May 11-13, 2025. IEEE 2025, ISBN 979-8-3315-0294-2
- Nebil Ozer, Gregory Kollmer, Ramyad Hadidi, Bahar Asgari:
La Superba: Leveraging a Self-Comparison Method to Understand the Performance Benefits of Sparse Acceleration Optimizations. 1-12 - Yang Yang, Mohammad Sonji, Adwait Jog:
Dissecting Performance Overheads of Confidential Computing on GPU-based Systems. 1-16 - Thomas Rauber, Gudula Rünger:
Evaluation and Comparison of the Energy Efficiency of Several Intel Multicore Processors. 1-3 - Iris Uwizeyimana, Natalie Enright Jerger:
Carbon-Aware Server Replacement. 1-3 - Tanvi Sharma, Indranil Chakraborty, Mustafa Fayez Ali, Kaushik Roy:
Evaluating Compute in Memory Architectures for Matrix Multiplication: A Dataflow-Centric Perspective. 1-3 - Carlos Agulló-Domingo, Óscar Vera-López, Seyda Guzelhan, Lohit Daksha, Aymane El Jerari, Kaustubh Shivdikar, Rashmi S. Agrawal, David R. Kaeli, Ajay Joshi, José L. Abellán:
FIDESlib: A Fully-Fledged Open-Source FHE Library for Efficient CKKS on GPUs. 1-3 - Fareed Qararyah, Mohammad Ali Maleki, Pedro Trancoso:
An Analytical Cost Model for Fast Evaluation of Multiple Compute-Engine CNN Accelerators. 1-13 - Rachid Karami, Sheng-Chun Kao, Hyoukjun Kwon:
Understanding the Performance Horizon of the Latest ML Workloads with NonGEMM Workloads. 1-14 - Jaeyoung Kang, Qirong Xia, Ipoom Jeong, Yongjoo Park, Nam Sung Kim:
Intel® In-Memory Analytics Accelerator: Performance Characterization and Guidelines. 1-13 - Chenji Han, Huai Xu, Guangyao Guo, Yuxuan Wu, Fuxin Zhang:
MeMo: Enhancing Representative Sampling via Mechanistic Micro-Model Signatures. 1-13 - Jamin Seo, Jianming Tong, Tushar Krishna, Hyoukjun Kwon:
Exploring Constrained Dataflow Accelerators for Real-Time Multi-Task Multi-Model ML Workloads. 1-11 - Anirudha Agrawal, Shaizeen Aga, Suchita Pati, Mahzabeen Islam:
ConCCL: Optimizing ML Concurrent Computation and Communication with GPU DMA Engines. 1-11 - Junsoo Kim, Hunjong Lee, Geonwoo Ko, Gyubin Choi, Seri Ham, Seongmin Hong, Joo-Young Kim:
ADOR: A Design Exploration Framework for LLM Serving with Enhanced Latency and Throughput. 15-25 - Zishen Wan, Jiayi Qian, Yuhang Du, Jason Jabbour, Yilun Du, Yang Zhao, Arijit Raychowdhury, Tushar Krishna, Vijay Janapa Reddi:
Generative AI in Embodied Systems: System-Level Analysis of Performance, Efficiency and Scalability. 26-37 - Prabhu Vellaisamy, Thomas Labonte, Sourav Chakraborty, Matt Turner, Samantika Sury, John Paul Shen:
Characterizing and Optimizing LLM Inference Workloads on CPU-GPU Coupled Architectures. 49-61 - Eunsoo Jung, Eunbi Jeong, Gunjae Koo, Yunho Oh, Myung Kuk Yoon:
Hierarchical Traversal Stack Design Using Shared Memory for GPU Ray Tracing. 62-72 - Fangjia Shen, Aaron Barnes, Anusuya Nallathambi, Timothy G. Rogers:
RayFlex: An Open-Source RTL Implementation of the Hardware Ray Tracer Datapath. 73-84 - Varsha Singhania, Shaizeen Aga, Mohamed Assem Ibrahim:
FinGraV: Methodology for Fine-Grain GPU Power Visibility and Insights. 96-107 - Matin Raayai Ardakani, Andrew Nguyen, Ivan Rosales, Daoxuan Xu, Yuwei Sun, Yifan Sun, David Kaeli, Norman Rubin:
Luthier: A Dynamic Binary Instrumentation Framework Targeting AMD GPUs. 137-149 - Kaustubh Manohar Mhatre, Venkata Guru Prashanth Mulleti, Curt John Bansil, Endri Taka, Aman Arora:
Performance Analysis of GEMM Workloads on the AMD Versal Platform. 150-161 - Jaeheon Lee, Juhyung Park, Seonggyun Oh, Jinhyung Koo, Sungjin Lee:
Beyond the Numbers: Measuring Android Performance Through User Perception. 162-173 - Mansi Choudhary, Chris Kjellqvist, Jiaao Ma, Lisa Wu Wills:
COCOSSim: A Cycle-Accurate Simulator for Heterogeneous Systolic Array Architectures. 174-185 - Ritik Raj, Sarbartha Banerjee, Nikhil Chandra, Zishen Wan, Jianming Tong, Ananda Samajdhar, Tushar Krishna:
SCALE-Sim V3: a Modular Cycle-Accurate Systolic Accelerator Simulator for End-To-End System Analysis. 186-200 - Kaifeng Xu, Georgios Tziantzioulis, David Wentzlaff:
Evaluation of MindPalace for Chip Design Tradeoffs on Function-as-a-Service. 201-212 - Steven van der Vlugt, Leon C. Oostrum, Gijs Schoonderbeek, Ben van Werkhoven, Bram Veenboer, Krijn Doekemeijer, John W. Romein:
PowerSensor3: A Fast and Accurate Open Source Power Measurement Tool. 213-226 - Saichand Samudrala, Sushant Kondguli, Paul Gratz:
Benchmarking 3D Gaussian Splatting Rendering. 227-238 - Yongju Lee, Jaewon Kwon, Cheolhwan Kim, Enhyeok Jang, Jiwon Lee, Hyunwuk Lee, Won Woo Ro:
COSMOS: An LLC Contention Slowdown Model for Heterogeneous Multi-Core Systems. 264-275 - Lieven Eeckhout:
Use Equal-Work or Equal-Time Speedup, Not Geomean Speedup. 276-285 - Noushin Azami, Martin Burtscher:
Identifying Important Data Transformations for Synthesizing Effective Lossless Compressors. 286-296 - Chris Kjellqvist, Brendan Peercy, Alvin R. Lebeck, Lisa Wu Wills:
Beethoven: A Heterogeneous Multi-Core Accelerator System Composer. 297-308 - Panteleimonas Chatzimiltis, Georgia Antoniou, Haris Volos, Yiannakis Sazeides:
SAGA: A Surrogate Assisted Genetic Algorithm for Fast CPU Power Virus Generation. 309-319 - Sudhanshu Gupta, Niti Madan, Sooraj Puthoor, Nuwan Jayasena, Sandhya Dwarkadas:
Concurrent PIM and Load/Store Servicing in PIM-Enabled Memory. 320-334 - Rashid Aligholipour, Yuan Yao:
The Fake-Busy and True-Idle Problems of Running Graph Applications on Chiplet-Based Multi-Cores. 347-349 - Alexandra W. Chadwick, Márton Erdos, Utpal Bora, Akshay Bhosale, Bob Lytton, Yuxin Guo, Richard Cooper, Giacomo Gabrielli, Timothy M. Jones:
The Future of Instruction-Level Parallelism (ILP). 350-352 - Seonho Lee, Jihwan Oh, Seokjin Go, Divya Mahajan:
Characterizing Compute-Communication Overlap in GPU-Accelerated Distributed Deep Learning: Performance and Power Implications. 353-355 - Inseong Hwang, Jihoon Jang, Chaewon Park, Hyun Kim:
PIM-BEACON: A Benchmarking and Emulation Framework Supporting Adaptive CONfigurations in DRAM-Based Processing-in-Memory Systems. 356-358 - Abhinaba Chakraborty, Wouter Tavernier, Akis Kourtis, Mario Pickavet, Andreas Oikonomakis, Didier Colle:
Profiling Concurrent Vision Inference Workloads on NVIDIA Jetson. 359-361 - Christin Bose, Cesar Avalos, Junrui Pan, Yechen Liu, Mahmoud Khairy, Clay Hughes, Timothy G. Rogers:
ASLink: Modeling Multi-GPU Execution in Accel-Sim. 362-364 - S. M. Mojahidul Ahsan, Mohammad Nouri, Ramesh Reddy Ganapam, Mohammad Alian, Tamzidul Hoque:
A Flexible and Accurate Circuit-Level Substrate for Future DRAM Design and Analysis. 371-373 - Pingyi Huo, Anusha Devulapally, Hasan Al Maruf, Meena Arunachalam, Mahmut Taylan Kandemir, Vijaykrishnan Narayanan:
TPNM: A CXL Based General Purpose Tiered Process Near Memory Framework. 374-376 - Kewei Yan, Yonghong Yan:
A Real-Time, Auto-Regression Method for in-Situ Feature Extraction in Hydrodynamics Simulations. 377-378 - Aniket Chatterjee, Conor James Green, Mithuna Thottethodi:
Library of Networks: An Online Tool for Design and Analysis of Network Topologies. 379-381 - Martí Torrents, Paul Caheny, Stijn Eyerman, Wim Heirman:
Multi-Core Aware Evaluation of Prefetchers. 382-384 - Martin Troiber, Martin Schulz, Blaise Tine, Hyesoon Kim:
Analysis of the RISC-V Vector Extension for Vulkan Graphics Kernels. 388-389 - Rodrigo Huerta, Antonio González:
GPU Simulation Acceleration via Parallelization. 390-392 - Wenzhe Guo, Joyjit Kundu, Uras Tos, Giuliano Sisto, Cedric Rolin, Lars-Åke Ragnarsson, Timon Evenblij:
Energon: A Sustainability-Driven Modeling Framework for AI Data Centers. 393-395 - Yves Vandriessche, Wim Heirman, Ed Nutting, Jeremy Birch, Judah Daniels, Mae Hood, Pascal Costanza:
Measuring Performance Overheads of Software Memory Management Using Functional-First Simulators. 399-400 - Rahul Tripathy, Sumit K. Mandal:
Interconnect Performance Estimation for ML Accelerators via Lightweight Analytical Model. 401-403
