ASPLOS 2024: La Jolla, CA, USA
- Rajiv Gupta, Nael B. Abu-Ghazaleh, Madan Musuvathi, Dan Tsafrir:
Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 3, ASPLOS 2024, La Jolla, CA, USA, 27 April 2024 - 1 May 2024. ACM 2024
- Amin Vahdat:
Societal infrastructure in the age of Artificial General Intelligence. 1
- Emmett Witchel:
Challenges and Opportunities for Systems Using CXL Memory. 2
- Tamar Eilam:
Harnessing the Power of Specialization for Sustainable Computing. 3
- Nafea Bshara:
AWS Trainium: The Journey for Designing and Optimization Full Stack ML Hardware. 4
- Jeffrey Yu, Kartik Prabhu, Yonatan Urman, Robert M. Radway, Eric Han, Priyanka Raina:
8-bit Transformer Inference and Fine-tuning for Edge Accelerators. 5-21
- Samuel Thomas, Kidus Workneh, Jac McCarty, Joseph Izraelevitz, Tamara Lehman, R. Iris Bahar:
A Midsummer Night's Tree: Efficient and High Performance Secure SCM. 22-37
- George Bisbas, Anton Lydike, Emilien Bauer, Nick Brown, Mathieu Fehr, Lawrence Mitchell, Gabriel Rodriguez-Canal, Maurice Jamieson, Paul H. J. Kelly, Michel Steuwer, Tobias Grosser:
A shared compilation stack for distributed-memory parallelism in stencil DSLs. 38-56
- Zhuoran Ji, Zhiyuan Zhang, Jiming Xu, Lei Ju:
Accelerating Multi-Scalar Multiplication for Efficient Zero Knowledge Proofs with Multi-GPU Systems. 57-70
- Xiaoyang Lu, Boyu Long, Xiaoming Chen, Yinhe Han, Xian-He Sun:
ACES: Accelerating Sparse Matrix Multiplication with Adaptive Execution Flow and Concurrency-Aware Cache Optimizations. 71-85
- Zhenbo Sun, Huanqi Cao, Yuanwei Wang, Guanyu Feng, Shengqi Chen, Haojie Wang, Wenguang Chen:
AdaPipe: Optimizing Pipeline Parallelism with Adaptive Recomputation and Partitioning. 86-100
- Sungjun Cho, Beomjun Kim, Hyunuk Cho, Gyeongseob Seo, Onur Mutlu, Myungsuk Kim, Jisung Park:
AERO: Adaptive Erase Operation for Improving Lifetime and Performance of Modern NAND Flash-Based SSDs. 101-118
- Seyed Ali Jokar Jandaghi, Kaveh Mahdaviani, Amirhossein Mirhosseini, Sameh Elnikety, Cristiana Amza, Bianca Schroeder:
AUDIBLE: A Convolution-Based Resource Allocator for Oversubscribing Burstable Virtual Machines. 119-132
- Ruihao Gao, Zhichun Li, Guangming Tan, Xueqi Li:
BeeZip: Towards An Organized and Scalable Architecture for Data Compression. 133-148
- Hao Zhou, Qiukun Han, Heng Shi, Yalin Zhang, Jianguo Yao:
Boost Linear Algebra Computation Performance via Efficient VNNI Utilization. 149-163
- Hamid Farzaneh, João Paulo Cardoso de Lima, Mengyuan Li, Asif Ali Khan, Xiaobo Sharon Hu, Jerónimo Castrillón:
C4CAM: A Compiler for CAM-based In-memory Accelerators. 164-177
- Chang Chen, Xiuhong Li, Qianchao Zhu, Jiangfei Duan, Peng Sun, Xingcheng Zhang, Chao Yang:
Centauri: Enabling Efficient Scheduling for Communication-Computation Overlap in Large Model Training via Communication Partitioning. 178-191
- Zhuangzhuang Zhou, Vaibhav Gogte, Nilay Vaish, Chris Kennelly, Patrick Xia, Svilen Kanev, Tipp Moseley, Christina Delimitrou, Parthasarathy Ranganathan:
Characterizing a Memory Allocator at Warehouse Scale. 192-206
- Pratyush Patel, Esha Choukse, Chaojie Zhang, Íñigo Goiri, Brijesh Warrier, Nithish Mahalingam, Ricardo Bianchini:
Characterizing Power Management Opportunities for LLMs in the Cloud. 207-222
- Hünkar Can Tunç, Ameya Prashant Deshmukh, Berk Çirisci, Constantin Enea, Andreas Pavlogiannis:
CSSTs: A Dynamic Data Structure for Partial Orders in Concurrent Execution Analysis. 223-238
- Dongning Ma, Fan Fred Lin, Alban Desmaison, Joel Coburn, Daniel Moore, Sriram Sankar, Xun Jiao:
Dr. DNA: Combating Silent Data Corruptions in Deep Learning using Distribution of Neuron Activations. 239-252
- Ruibo Fan, Wei Wang, Xiaowen Chu:
DTC-SpMM: Bridging the Gap in Accelerating General Sparse Matrix Multiplication with Tensor Cores. 253-267
- Harrison Williams, Matthew Hicks:
Energy-Adaptive Buffering for Efficient, Responsive, and Persistent Batteryless Systems. 268-282
- Mohannad Ismail, Christopher Jelesnianski, Yeongjin Jang, Changwoo Min, Wenjie Xiong:
Enforcing C/C++ Type and Scope at Runtime for Control-Flow and Data-Flow Integrity. 283-300
- Zhaodong Chen, Andrew Kerr, Richard Cai, Jack Kosaian, Haicheng Wu, Yufei Ding, Yuan Xie:
EVT: Accelerating Deep Learning Training with Epilogue Visitor Tree. 301-316
- Fabian Ritter, Sebastian Hack:
Explainable Port Mapping Inference with Sparse Performance Counters for AMD's Zen Architectures. 317-330
- Chuhao Xu, Yiyu Liu, Zijun Li, Quan Chen, Han Zhao, Deze Zeng, Qian Peng, Xueqi Wu, Haifeng Zhao, Senbo Fu, Minyi Guo:
FaaSMem: Improving Memory Efficiency of Serverless Computing with Memory Pool Architecture. 331-348
- Kai Zhong, Zhenhua Zhu, Guohao Dai, Hongyi Wang, Xinhao Yang, Haoyu Zhang, Jin Si, Qiuli Mao, Shulin Zeng, Ke Hong, Genghan Zhang, Huazhong Yang, Yu Wang:
FEASTA: A Flexible and Efficient Accelerator for Sparse Tensor Algebra in Machine Learning. 349-366
- Yifan Zhao, Hashim Sharif, Vikram S. Adve, Sasa Misailovic:
Felix: Optimizing Tensor Programs with Gradient Descent. 367-381
- Yuhao Liu, Shize Che, Junyu Zhou, Yunong Shi, Gushu Li:
Fermihedral: On the Optimal Compilation for Fermion-to-Qubit Encoding. 382-397
- Ben L. Titzer, Elizabeth Gilbert, Bradley Wei Jie Teo, Yash Anand, Kazuyuki Takayama, Heather Miller:
Flexible Non-intrusive Dynamic Instrumentation for WebAssembly. 398-415
- Yue Guan, Changming Yu, Yangjie Zhou, Jingwen Leng, Chao Li, Minyi Guo:
Fractal: Joint Multi-Level Sparse Pattern Tuning of Accuracy and Performance for DNN Pruning. 416-430
- Guowei Liu, Laiping Zhao, Yiming Li, Zhaolin Duan, Sheng Chen, Yitao Hu, Zhiyuan Su, Wenyu Qu:
FUYAO: DPU-enabled Direct Data Transfer for Serverless Computing. 431-447
- Nick Wanninger, Tommy McMichen, Simone Campanoni, Peter A. Dinda:
Getting a Handle on Unmanaged Memory. 448-463
- Chia-Hao Chang, Jihoon Han, Anand Sivasubramaniam, Vikram Sharma Mailthody, Zaid Qureshi, Wen-Mei Hwu:
GMT: GPU Orchestrated Memory Tiering for the Big Data Era. 464-478
- Walid A. Hanafy, Qianlin Liang, Noman Bashir, Abel Souza, David E. Irwin, Prashant J. Shenoy:
Going Green for Less Green: Optimizing the Cost of Reducing Cloud Carbon Emissions. 479-496
- Junseo Lee, Seokwon Lee, Jungi Lee, Junyong Park, Jaewoong Sim:
GSCore: Efficient Radiance Field Rendering via Architectural Support for 3D Gaussian Splatting. 497-511
- Yichi Zhang, Dibei Chen, Gang Zeng, Jianfeng Zhu, Zhaoshi Li, Longlong Chen, Shaojun Wei, Leibo Liu:
Harp: Leveraging Quasi-Sequential Characteristics to Accelerate Sequence-to-Graph Mapping of Long Reads. 512-527
- Kun Wu, Mert Hidayetoglu, Xiang Song, Sitao Huang, Da Zheng, Israt Nisa, Wen-Mei Hwu:
Hector: An Efficient Programming and Compilation Framework for Implementing Relational Graph Neural Networks in GPU Architectures. 528-544
- Minseok Seo, Xuan Truong Nguyen, Seok Joong Hwang, Yongkee Kwon, Guhyun Kim, Chanwook Park, Ilkon Kim, Jaehan Park, Jeongbin Kim, Woojae Shin, Jongsoon Won, Haerang Choi, Kyuyoung Kim, Daehan Kwon, Chunseok Jeong, Sangheon Lee, Yongseok Choi, Wooseok Byun, Seungcheol Baek, Hyuk-Jae Lee, John Kim:
IANUS: Integrated Accelerator based on NPU-PIM Unified Memory System. 545-560
- Tapti Palit, Pedro Fonseca:
Kaleidoscope: Precise Invariant-Guided Pointer Analysis. 561-576
- Akanksha Jain, Hannah Lin, Carlos Villavieja, Baris Kasikci, Chris Kennelly, Milad Hashemi, Parthasarathy Ranganathan:
Limoncello: Prefetchers for Scale. 577-590
- Julian Oppermann, Brindusa Mihaela Damian-Kosterhon, Florian Meisel, Tammo Mürmann, Eyck Jentzsch, Andreas Koch:
Longnail: High-Level Synthesis of Portable Custom Instruction Set Extensions for RISC-V Processors from Descriptions in the Open-Source CoreDSL Language. 591-606
- Renze Chen, Zijian Ding, Size Zheng, Chengrui Zhang, Jingwen Leng, Xuanzhe Liu, Yun Liang:
MAGIS: Memory Optimization via Coordinated Graph Transformation and Scheduling for DNN. 607-621
- Emil Tsalapatis, Ryan Hancock, Rakeeb Hossain, Ali José Mashtizadeh:
MemSnap μCheckpoints: A Data Single Level Store for Fearless Persistence. 622-638
- Jinsong Mao, Hailun Ding, Juan Zhai, Shiqing Ma:
Merlin: Multi-tier Optimization of eBPF Code for Performance and Compactness. 639-653
- Jiacheng Huang, Yunmo Zhang, Junqiao Qiu, Yu Liang, Rachata Ausavarungnirun, Qingan Li, Chun Jason Xue:
More Apps, Faster Hot-Launch on Mobile Devices via Fore/Background-aware GC-Swap Co-design. 654-670
- Siwei Tan, Debin Xiang, Liqiang Lu, Junlin Lu, Qiuping Jiang, Mingshuai Chen, Jianwei Yin:
MorphQPV: Exploiting Isomorphism in Quantum Programs to Facilitate Confident Verification. 671-688
- Jungwoo Kim, Seonggyun Oh, Jaeha Kung, Yeseong Kim, Sungjin Lee:
NDPipe: Exploiting Near-data Processing for Scalable Inference and Continuous Training in Photo Storage. 689-707
- Rongxin Han, Jingyu Wang, Qi Qi, Haifeng Sun, Chaowei Xu, Zhaoyang Wan, Zirui Zhuang, Yichuan Yu, Jianxin Liao:
NetRen: Service Migration-Driven Network Renascence with Synthesizing Updated Configuration. 708-721
- Guseul Heo, Sangyeop Lee, Jaehong Cho, Hyunmin Choi, Sanghyeon Lee, Hyungkyu Ham, Gwangsun Kim, Divya Mahajan, Jongse Park:
NeuPIMs: NPU-PIM Heterogeneous Acceleration for Batched LLM Inferencing. 722-737
- Hezi Zhang, Jixuan Ruan, Hassan Shapourian, Ramana Rao Kompella, Yufei Ding:
OnePerc: A Randomness-aware Compiler for Photonic Quantum Computing. 738-754
- Muyan Hu, Ashwin Venkatram, Shreyashri Biswas, Balamurugan Marimuthu, Bohan Hou, Gabriele Oliaro, Haojie Wang, Liyan Zheng, Xupeng Miao, Jidong Zhai, Zhihao Jia:
Optimal Kernel Orchestration for Tensor Programs with Korch. 755-769
- Hosein Yavarzadeh, Archit Agarwal, Max Christman, Christina Garman, Daniel Genkin, Andrew Kwong, Daniel Moghimi, Deian Stefan, Kazem Taram, Dean M. Tullsen:
Pathfinder: High-Resolution Control-Flow Attacks Exploiting the Conditional Branch Predictor. 770-784
- Lin Jia, James Patrick Mcmahon, Sumanth Gudaparthi, Shreyas Singh, Rajeev Balasubramonian:
PATHFINDER: Practical Real-Time Learning for Data Prefetching. 785-800
- Haoran Wang, Lei Wang, Haobo Xu, Ying Wang, Yuming Li, Yinhe Han:
PrimePar: Efficient Spatial-temporal Tensor Partitioning for Large Transformer Model Training. 801-817
- Narges Alavisamani, Suhas Vittal, Ramin Ayanzadeh, Poulami Das, Moinuddin K. Qureshi:
Promatch: Extending the Reach of Real-Time Quantum Error Correction with Adaptive Predecoding. 818-833
- Aditya Ranjan, Tirthak Patel, Daniel Silver, Harshitta Gandhi, Devesh Tiwari:
ProxiML: Building Machine Learning Classifiers for Photonic Quantum Computing. 834-849
- Sharjeel Khan, Bodhisatwa Chatterjee, Santosh Pande:
Pythia: Compiler-Guided Defense Against Non-Control Data Attacks. 850-866
- Kevin Laeufer, Brandon Fajardo, Abhik Ahuja, Vighnesh Iyer, Borivoje Nikolic, Koushik Sen:
RTL-Repair: Fast Symbolic Repair of Hardware Design Code. 867-881
- Bowen Zhang, Wei Chen, Peisen Yao, Chengpeng Wang, Wensheng Tang, Charles Zhang:
SIRO: Empowering Version Compatibility in Intermediate Representations via Program Synthesis. 882-899
- Armand Behroozi, Yuxiang Chen, Vlad Fruchter, Lavanya Subramanian, Sriseshan Srikanth, Scott A. Mahlke:
SlimSLAM: An Adaptive Runtime for Visual-Inertial Simultaneous Localization and Mapping. 900-915
- Wei Niu, Md Musfiqur Rahman Sanim, Zhihao Shu, Jiexiong Guan, Xipeng Shen, Miao Yin, Gagan Agrawal, Bin Ren:
SmartMem: Layout Transformation Elimination and Adaptation for Efficient DNN Execution on Mobile. 916-931
- Xupeng Miao, Gabriele Oliaro, Zhihao Zhang, Xinhao Cheng, Zeyu Wang, Zhengxin Zhang, Rae Ying Yee Wong, Alan Zhu, Lijie Yang, Xiaoxiang Shi, Chunan Shi, Zhuoming Chen, Daiyaan Arfeen, Reyna Abhyankar, Zhihao Jia:
SpecInfer: Accelerating Large Language Model Serving with Tree-based Speculative Inference and Verification. 932-949
- Cong Li, Zhe Zhou, Size Zheng, Jiaxi Zhang, Yun Liang, Guangyu Sun:
SpecPIM: Accelerating Speculative Inference on PIM-Enabled System via Architecture-Dataflow Co-Exploration. 950-965
- Neha Prakriya, Yuze Chi, Suhail Basalama, Linghao Song, Jason Cong:
TAPA-CS: Enabling Scalable Accelerator Design on Distributed HBM-FPGAs. 966-980
- Chihun Song, Michael Jaemin Kim, Tianchen Wang, Houxiang Ji, Jinghan Huang, Ipoom Jeong, Jaehyun Park, Hwayong Nam, Minbok Wi, Jung Ho Ahn, Nam Sung Kim:
TAROT: A CXL SmartNIC-Based Defense Against Multi-bit Errors by Row-Hammer Attacks. 981-998
- Heehoon Kim, Junyeol Ryu, Jaejin Lee:
TCCL: Discovering Better Communication Paths for PCIe GPU Clusters. 999-1015
- Phitchaya Mangpo Phothilimthana, Saurabh Kadekodi, Soroush Ghodrati, Selene Moon, Martin Maas:
Thesios: Synthesizing Accurate Counterfactual I/O Traces from I/O Samples. 1016-1032
- Massimo Giordano, Rohan Doshi, Qianyun Lu, Boris Murmann:
TinyForge: A Design Space Exploration to Advance Energy and Silicon Area Trade-offs in tinyML Compute Architectures with Custom Latch Arrays. 1033-1047
- Tianrui Wei, Kevin Laeufer, Katie Lim, Jerry Zhao, Koushik Sen, Jonathan Balkind, Krste Asanovic:
Zoomie: A Software-like Debugging Tool for FPGAs. 1048-1062