


IEEE Computer Architecture Letters, Volume 24
Volume 24, Number 1, January - June 2025
- Haseung Bong, Nahyeon Kang, Youngsok Kim, Joonsung Kim, Hanhwi Jang: IntervalSim++: Enhanced Interval Simulation for Unbalanced Processor Designs. 1-4
- Myoungjun Chun, Jae Yong Lee, Inhyuk Choi, Jisung Park, Myungsuk Kim, Jihong Kim: Straw: A Stress-Aware WL-Based Read Reclaim Technique for High-Density NAND Flash-Based SSDs. 5-8
- Chaithanya Krishna Vadlamudi, Bahar Asgari: Electra: Eliminating the Ineffectual Computations on Bitmap Compressed Matrices. 9-12
- Jiwon Lee, Yunjae Lee, Youngeun Kwon, Minsoo Rhu: Characterization and Analysis of the 3D Gaussian Splatting Rendering Pipeline. 13-16
- Sepehr Tabrizchi, Mehrdad Morsali, David Z. Pan, Shaahin Angizi, Arman Roohi: PINSim: A Processing In- and Near-Sensor Simulator to Model Intelligent Vision Sensors. 17-20
- M. Vardhana, Rohan Pinto: High-Performance Winograd Based Accelerator Architecture for Convolutional Neural Network. 21-24
- E. Kritheesh, Biswabandan Panda: SPAM: Streamlined Prefetcher-Aware Multi-Threaded Cache Covert-Channel Attack. 25-28
- Houxiang Ji, Minho Kim, Seonmu Oh, Daehoon Kim, Nam Sung Kim: Cooperative Memory Deduplication With Intel Data Streaming Accelerator. 29-32
- Woohyung Choi, Jinwoo Jeong, Hanhwi Jang, Jeongseob Ahn: GPU-Centric Memory Tiering for LLM Serving With NVIDIA Grace Hopper Superchip. 33-36
- Jongho Baik, Jonghyeon Kim, Chang Hyun Park, Jeongseob Ahn: Accelerating Page Migrations in Operating Systems With Intel DSA. 37-40
- Yunhyeong Jeon, Minwoo Jang, Hwanjun Lee, Yeji Jung, Jin Jung, Jonggeon Lee, Jinin So, Daehoon Kim: RoPIM: A Processing-in-Memory Architecture for Accelerating Rotary Positional Embedding in Transformer Models. 41-44
- Sudhanva Gurumurthi, Mattan Erez: Editorial: A Letter From the Editor-in-Chief of IEEE Computer Architecture Letters. iii-iv
- Hyunwoo Nam, Jay Hwan Lee, Shinhyung Yang, Yeonsoo Kim, Jiun Jeong, Jeong-Geun Kim, Bernd Burgstaller: Comprehensive Design Space Exploration for Graph Neural Network Aggregation on GPUs. 45-48
- Yanghui Ou, Hengrui Zhang, Austin Rovinski, David Wentzlaff, Christopher Batten: Optically Connected Multi-Stack HBM Modules for Large Language Model Training and Inference. 49-52
- Byeori Kim, Changhun Lee, Gwangsun Kim, Eunhyeok Park: Cost-Effective Extension of DRAM-PIM for Group-Wise LLM Quantization. 53-56
- Qirong Xia, Houxiang Ji, Yang Zhou, Nam Sung Kim: Hardware-Accelerated Kernel-Space Memory Compression Using Intel QAT. 57-60
- Pooya Aghanoury, Santosh Ghosh, Nader Sehatbakhsh: Security Helper Chiplets: A New Paradigm for Secure Hardware Monitoring. 61-64
- Yuanmiao Lin, Shansen Fu, Xueming Li, Chaoming Yang, Rongfeng Li, Hongmin Huang, Xianghong Hu, Shuting Cai, Xiaoming Xiong: A DSP-Based Precision-Scalable MAC With Hybrid Dataflow for Arbitrary-Basis-Quantization CNN Accelerator. 65-68
- Farshad Dizani, Azam Ghanbari, Joshua Kalyanapu, Darsh Asher, Samira Mirbagher Ajorpaz: Thor: A Non-Speculative Value Dependent Timing Side Channel Attack Exploiting Intel AMX. 69-72
- Omer Khan: A Data Prefetcher-Based 1000-Core RISC-V Processor for Efficient Processing of Graph Neural Networks. 73-76
- Zhenlong Ma, Ning Kang, Fan Yang, Chongyang Hong, Jing Xu, Guojun Yuan, Peiheng Zhang, Zhan Wang, Ninghui Sun: Toward Scalable RDMA Through Resource Prefetching. 77-80
- Zhengpan Fei, Mingchuan Lyu, Satoshi Kawakami, Koji Inoue: Data-Pattern-Driven LUT for Efficient In-Cache Computing in CNNs Acceleration. 81-84
- Taehun Kim, Yunjae Lee, Juntaek Lim, Minsoo Rhu: A Characterization of Generative Recommendation Models: Study of Hierarchical Sequential Transduction Unit. 85-88
- Pawan Kumar Sanjaya, Christina Giannoula, Ian Colbert, Ihab Amer, Mehdi Saeedi, Gabor Sines, Nandita Vijaykumar: DPWatch: A Framework for Hardware-Based Differential Privacy Guarantees. 89-92
- Amin Mamandipoor, Huy Dinh Tran, Mohammad Alian: SDT: Cutting Datacenter Tax Through Simultaneous Data-Delivery Threads. 93-96
- Chihun Song, Michael Jaemin Kim, Yan Sun, Houxiang Ji, Kyungsan Kim, TaeKyeong Ko, Jung Ho Ahn, Nam Sung Kim: X-PPR: Post Package Repair for CXL Memory. 97-100
- Hossein SeyyedAghaei, Mahmood Naderan-Tahan, Magnus Jahre, Lieven Eeckhout: Memory-Centric MCM-GPU Architecture. 101-104
- Shvetank Prakash, Andrew Cheng, Jason Yik, Arya Tschand, Radhika Ghosal, Ikechukwu Uchendu, Jessica Quaye, Jeffrey Ma, Shreyas Grampurohit, Sofia Giannuzzi, Arnav Balyan, Fin Amin, Aadya Pipersenia, Yash Choudhary, Ankita Nayak, Amir Yazdanbakhsh, Vijay Janapa Reddi: QuArch: A Question-Answering Dataset for AI Agents in Computer Architecture. 105-108
- Heng Cao, Zhipeng Wu, Dejian Li, Peiguang Jing, Sio-Hang Pun, Yu Liu: Accelerating Control Flow on CGRAs via Speculative Iteration Execution. 109-112
- Joshua Kalyanapu, Farshad Dizani, Azam Ghanbari, Darsh Asher, Samira Mirbagher Ajorpaz: Exploiting Intel AMX Power Gating. 113-116
- Liang Yan, Xiaoyang Lu, Xiaoming Chen, Yinhe Han, Xian-He Sun: Pyramid: Accelerating LLM Inference With Cross-Level Processing-in-Memory. 121-124
- Adnan Hasnat, Wim Heirman, Shoaib Akram: Analyzing and Exploiting Memory Hierarchy Parallelism With MLP Stacks. 125-128
- Daniel Puckett, Tyler Tomer, Paul V. Gratz, Jiang Hu, Galen M. Shipman, Jered Dominguez-Trujillo, Kevin Sheridan: Estimating CPI Stacks From Multiplexed Performance Counter Data Using Machine Learning. 129-132
- Shabirahmed Badashasab Jigalur, Daniel J. Mazure, Teresa C. Garcia, Yen-Cheng Kuan: Accelerating Vector Permutation Instruction Execution via Controllable Bitonic Network. 133-136
- Aalaa M. A. Babai, Koji Inoue: Exploring Volatile FPGAs Potential for Accelerating Energy-Harvesting IoT Applications. 137-140
- Daeun Kim, Jinwoo Hwang, Changhun Oh, Jongse Park: MixDiT: Accelerating Image Diffusion Transformer Inference With Mixed-Precision MX Quantization. 141-144
- TaeHoon Kim, Jaechun No: L-DTC: Load-based Dynamic Throughput Control for Guaranteed I/O Performance in Virtualized Environments. 145-148
- Arteen Abrishami, Zhengrong Wang, Tony Nowatzki: Cache and Near-Data Co-Design for Chiplets. 149-152
- Mattia Tibaldi, Christian Pilato: Amethyst: Reducing Data Center Emissions With Dynamic Autotuning and VM Management. 153-156
- Víctor Nicolás-Conesa, J. Rubén Titos Gil, Ricardo Fernández-Pascual, Manuel E. Acacio, Alberto Ros: WoperTM: Got Nacks? Use Them! 157-160
- Rui Liu, Zerun Li, Xiaoyu Zhang, Xiaoming Chen, Yinhe Han, Minghua Tang: In-Memory Computing Accelerator for Iterative Linear Algebra Solvers. 161-164
- Minseok Seo, Jungi Hyun, Seongho Jeong, Xuan Truong Nguyen, Hyuk-Jae Lee, Hyokeun Lee: OASIS: Outlier-Aware KV Cache Clustering for Scaling LLM Inference in CXL Memory Systems. 165-168
- Shunchen Shi, Fan Yang, Zhichun Li, Xueqi Li, Ninghui Sun: Exploring the DIMM PIM Architecture for Accelerating Time Series Analysis. 169-172
- Seoyoung Ko, Hyunjeong Shim, Wanju Doh, Sungmin Yun, Jinin So, Yongsuk Kwon, Sang-Soo Park, Si-Dong Roh, Minyong Yoon, Taeksang Song, Jung Ho Ahn: Cosmos: A CXL-Based Full In-Memory System for Approximate Nearest Neighbor Search. 173-176
- Shubhi Shukla, Abhijeet Singh, Rajdeep Chakraborty, Anirban Chakraborty, Tejas Rathod, Harshal Mumbaikar, Manoj Kumar Munigala, Madhusudhan K. N, Pabitra Mitra, Debdeep Mukhopadhyay: Minimal Counters, Maximum Insight: Simplifying System Performance With HPC Clusters for Optimized Monitoring. 177-180
- Helya Hosseini, Ubaid Bakhtiar, Donghyeon Joo, Bahar Asgari: Segin: Synergistically Enabling Fine-Grained Multi-Tenant and Resource Optimized SpMV. 181-184
- Kyoungho Jeun, Hyeonu Kim, Eojin Lee: Fold-PIM: A Cost-Efficient LPDDR5-Based PIM for On-Device SLMs. 185-188
- Sanjali Yadav, Bahar Asgari: DynaFlow: An ML Framework for Dynamic Dataflow Selection in SpGEMM Accelerators. 189-192
- Hyunkyun Shin, Seongtae Bang, Hyungwon Park, Daehoon Kim: SAFE: Sharing-Aware Prefetching for Efficient GPU Memory Management With Unified Virtual Memory. 117-120
Volume 24, Number 2, July - December 2025
- Jongmin Shin, Seongtae Bang, Gyeongseo Park, Daehoon Kim: pNet-gem5: Full-System Simulation With High-Performance Networking Enabled by Parallel Network Packet Processing. 193-196
- Jeongho Lee, Sangjun Kim, Jae Yong Lee, Jaeyoung Kang, Sungjin Lee, Nam Sung Kim, Jihong Kim: srNAND: A Novel NAND Flash Organization for Enhanced Small Read Throughput in SSDs. 197-200
- Wencheng Zou, Feiyun Zhao, Nan Wu: Stardust: Scalable and Transferable Workload Mapping for Large AI on Multi-Chiplet Systems. 201-204
- Jaime Roelandts, Ajeya Naithani, Lieven Eeckhout: The Architectural Sustainability Indicator. 205-208
- Kwangrae Kim, Ki-Seok Chung: HPN-SpGEMM: Hybrid PIM-NMP for SpGEMM. 209-212
- Junsu Kim, Jaebeom Jeon, Jaeyong Park, Sangun Choi, Minseong Gil, Seokin Hong, Gunjae Koo, Myung Kuk Yoon, Yunho Oh: MOST: Memory Oversubscription-Aware Scheduling for Tensor Migration on GPU Unified Storage. 213-216
- Jumin Kim, Seungmin Baek, Minbok Wi, Hwayong Nam, Michael Jaemin Kim, Sukhan Lee, Kyomin Sohn, Jung Ho Ahn: Per-Row Activation Counting on Real Hardware: Demystifying Performance Overheads. 217-220
- Bhargav Reddy Godala, Sankara Prasad Ramesh, Krishnam Tibrewala, Chrysanthos Pepi, Gino Chacon, Svilen Kanev, Gilles A. Pokam, Alberto Ros, Daniel A. Jiménez, Paul V. Gratz, David I. August: Correct Wrong Path. 221-224
- Ruihao Li, Lizy K. John, Neeraja J. Yadwadkar: Old is Gold: Optimizing Single-Threaded Applications With ExGen-Malloc. 225-228
- Yuhan Liu, Zihang Yang, Fei Tong, Liang Cheng, Ming Ling: Camulator: A Lightweight and Extensible Trace-Driven Cache Simulator for Embedded Multicore SoCs. 229-232
- Bikrant Das Sharma, Houxiang Ji, Ipoom Jeong, Nam Sung Kim: Time Series Machine Learning Models for Precise SSD Access Latency Prediction. 233-236
- Emad Jacob Maroun: On Internally Tagged Instruction Set Architectures. 237-240
- Ling Yang, Libo Huang, Zhenxuan Xiong, Yongwen Wang, Weixia Xu: EgDiff: An Enhanced Global Load Value Predictor. 241-244
- Tatsuya Kubo, Daichi Tokuda, Lei Qu, Ting Cao, Shinya Takamaeda-Yamazaki: PUDTune: Multi-Level Charging for High-Precision Calibration in Processing-Using-DRAM. 245-248
- Sevval Izmirli, Julian Pavon, Iván Vargas Valdivieso, Betül Aydeger, Kerem Yalçinkaya, Adrián Cristal, Oguz Ergin, Osman S. Unsal: Halis: A Hardware-Software Co-Designed Near-Cache Accelerator for Graph Pattern Mining. 249-252
- Pratiksha Mundhe, Yuta Hano, Satoshi Kawakami, Teruo Tanimoto, Masamitsu Tanaka, Koji Inoue, Ilkwon Byun: Approximate SFQ-Based Computing Architecture Modeling With Device-Level Guidelines. 253-256
- Jiaqi Lou, Yu Li, Srikar Vanavasam, Nam Sung Kim: HINT: A Hardware Platform for Intra-Host NIC Traffic and SmartNIC Emulation. 261-264
- Kwanhee Kyung, Sungmin Yun, Jung Ho Ahn: SSD Offloading for LLM Mixture-of-Experts Weights Considered Harmful in Energy Efficiency. 265-268
- Xueyang Liu, Seonjin Na, Euijun Chung, Jiashen Cao, Jing Yang, Hyesoon Kim: Contention-Aware GPU Thread Block Scheduler for Efficient GPU-SSD. 257-260
- Mengting Zhang, Zhichuan Guo, Shining Sun: RoSR: A Novel Selective Retransmission FPGA Architecture for RDMA NICs. 269-272
- Nayana Rajeev, Cathrene Biju, Titu Mary Ignatius, Roy Paily Palathinkal, Rekha K. James: RAESC: A Reconfigurable AES Countermeasure Architecture for RISC-V With Enhanced Power Side-Channel Resilience. 273-276
- Deyuan Guo, MohammadHosein Gholamrezaei, Matthew Hofmann, Ashish Venkat, Zhiru Zhang, Kevin Skadron: PIMsynth: A Unified Compiler Framework for Bit-Serial Processing-in-Memory Architectures. 277-280
- Hangyu Liu, Shouxi Luo, Ke Li, Huanlai Xing, Bo Peng: Checkflow: Low-Overhead Checkpointing for Deep Learning Training. 281-284
- Kyungsoo Kim, Omin Kwon, Yeonhong Park, Jae W. Lee: AiDE: Attention-FFN Disaggregated Execution for Cost-Effective LLM Decoding on CXL-PNM. 285-288
- Minho Kim, Houxiang Ji, Jaeyoung Kang, Hwanjun Lee, Daehoon Kim, Nam Sung Kim: CABANA: Cluster-Aware Query Batching for Accelerating Billion-Scale ANNS With Intel AMX. 289-292
- Lei Wang, Chia-Hang Lee, Maccoy Merrell, Gino Chacon, Daniel A. Jiménez, Paul V. Gratz: R-Max: A Method for Approximating the Benefit of Ideal Prefetching and Replacement Policy. 293-296
- Allen Aboytes, Pankaj Mehra: Improving Performance on Tiered Memory With Semantic Data Placement. 297-300
- Diamantis Patsidis, Georgios Vavouliotis: Context-Aware Set Dueling for Dynamic Policy Arbitration. 301-304
- Sanya Srivastava, Fletch Rydell, Andrés Goens, Vijay Nagarajan, Daniel J. Sorin: Efficient Deadlock Avoidance by Considering Stalling, Message Dependencies, and Topology. 305-308
- Gyeongrok Yang, Jaeha Min, In-Jun Jung, Joo-Young Kim: A Quantitative Analysis of Mamba-2-Based Large Language Model: Study of State Space Duality. 309-312
- Rui Xie, Asad Ul Haq, Yunhua Fang, Linsen Ma, Sanchari Sen, Swagath Venkataramani, Liu Liu, Tong Zhang: Breaking the HBM Bit Cost Barrier: Domain-Specific ECC for AI Inference Infrastructure. 313-316
- Haoyu Wang, Noa Zilberman, Ahmad Atamli, Amro Awad: Revisiting Virtual Memory Support for Confidential Computing Environments. 317-320
- Xinyu Wang, Xiaotian Sun, Wanqian Li, Feng Min, Xiaoyu Zhang, Xinjiang Zhang, Yinhe Han, Xiaoming Chen: Low-Latency PIM Accelerator for Edge LLM Inference. 321-324
- Elham Adibi, Mohammadamin Ajdari, Pouria Arefijamal, Amirsaeed Ahmadi-Tonekaboni, Hossein Asadi: I/O-ETEM: An I/O-Aware Approach for Estimating Execution Time of Machine Learning Workloads. 325-328
- Jiahao Xiang, Lang Li: Thread-Adaptive: High-Throughput Parallel Architectures of SLH-DSA on GPUs. 329-332
- Honghui Liu, Xian Lin, Xin Zheng, Qiancheng Liu, Huaien Gao, Shuting Cai, Xiaoming Xiong: A Partial Tag-Data Decoupled Architecture for Last-Level Cache Optimization. 333-336
- Yunhua Fang, Rui Xie, Asad Ul Haq, Linsen Ma, Kaoutar El Maghraoui, Naigang Wang, Meng Wang, Liu Liu, Tong Zhang: Accelerating LLM Inference via Dynamic KV Cache Placement in Heterogeneous Memory System. 337-340
- Jinyu Liu, Kiwan Maeng: In-Depth Characterization of Machine Learning on an Optimized Multi-Party Computing Library. 341-344
- SeokHyeon Kong, Donghwan Kim, Euiseong Seo, Kiwan Maeng: Characterizing the System Overhead of Discrete Noise Generation for Differential Privacy. 345-348
- Xianghong Hu, Yuanmiao Lin, Xueming Li, Ruidian Zhan, Jie Cao, Dayong Zhu, Shuting Cai, Xin Zheng, Xiaoming Xiong: A Multiple-Aspect Optimal CNN Accelerator in Top1 Accuracy, Performance, and Power Efficiency. 349-352
- Sookyung Choi, Myunghyun Rhee, Euiseok Kim, Kwangsik Shin, Youngpyo Joo, Hoshik Kim: PNM Meets Sparse Attention: Enabling Multi-Million Tokens Inference at Scale. 353-356
- Teresa Zhang: Rethinking In-Memory Hash Table Design for CXL-Based Main Memory Compression. 357-360
- Jaehong Cho, Hyunmin Choi, Jongse Park: LLMServingSim2.0: A Unified Simulator for Heterogeneous Hardware and Serving Techniques in LLM Infrastructure. 361-364
- Myunghyun Rhee, Sookyung Choi, Euiseok Kim, Joonseop Sim, Youngpyo Joo, Hoshik Kim: MoSKA: Mixture of Shared KV Attention for Efficient Long-Sequence LLM Inference. 365-368
- Tianyao Shi, Yanran Wu, Sihang Liu, Yi Ding: Disaggregated Speculative Decoding for Carbon-Efficient LLM Serving. 369-372
- Minki Jeong, Daegun Yoon, Soohong Ahn, Seungyong Lee, Jooyoung Kim, Jinuk Jeon, Joonseop Sim, Youngpyo Joo, Hoshik Kim: StreamDQ: HBM-Integrated On-the-Fly DeQuantization via Memory Load for Large Language Models. 373-376
- Xiaoyu Sun, Haruki Mori, Wei-Chang Zhao, Je-Min Hung, Hidehiro Fujiwara, Brian Crafton, Bo Zhang, Win-San Khwa, Yu-Der Chih, Tsung-Yung Jonathan Chang, Kerem Akarvardar: Enhancing DCIM Efficiency with Multi-Storage-Row Architecture for Edge AI Workloads. 377-380
- Yanhuan Liu, Wenming Li, Kunming Zhang, Tianyu Liu, Xiaochun Ye, Xuejun An: CODA: A Computation-Driven Paradigm for Sparse DNN Acceleration. 381-384
- Minjae Kim, Jiwan Kim, Won Hur, Jiwon Lee, Mingu Jung, Cheolhwan Kim, Ipoom Jeong, Won Woo Ro: REDIT: Redirection-Enabled Memory-Side Directory Architecture for CXL Memory Fabric. 385-388
- Jiaxiang Li, Mark C. Jeffrey, Natalie Enright Jerger: A Performance Model for Disintegrated Manycores. 389-392
- Ertza Warraich, Ali Imran, Annus Zulfiqar, Shay Vargaftik, Sonia Fahmy, Muhammad Shahbaz: Reimagining RDMA Through the Lens of ML. 393-396
