6th MLSys 2023: Miami, FL, USA
- Dawn Song, Michael Carbin, Tianqi Chen:
Proceedings of the Sixth Conference on Machine Learning and Systems, MLSys 2023, Miami, FL, USA, June 4-8, 2023. mlsys.org 2023
- Hyoukjun Kwon, Krishnakumar Nair, Jamin Seo, Jason Yik, Debabrata Mohapatra, Dongyuan Zhan, Jinook Song, Peter Capak, Peizhao Zhang, Peter Vajda, Colby R. Banbury, Mark Mazumder, Liangzhen Lai, Ashish Sirasao, Tushar Krishna, Harshit Khaitan, Vikas Chandra, Vijay Janapa Reddi:
XRBench: An Extended Reality (XR) Machine Learning Benchmark Suite for the Metaverse.
- Ewen Wang, Boyi Chen, Mosharaf Chowdhury, Ajay Kannan, Franco Liang:
FLINT: A Platform for Federated Learning Integration.
- Zining Zhang, Bingsheng He, Zhenjie Zhang:
Practical Edge Kernels for Integer-Only Vision Transformers Under Post-training Quantization.
- Joel Lamy-Poirier:
Breadth-First Pipeline Parallelism.
- Daochen Zha, Louis Feng, Liang Luo, Bhargav Bhushanam, Zirui Liu, Yusuo Hu, Jade Nie, Yuzhen Huang, Yuandong Tian, Arun Kejariwal, Xia Hu:
Pre-train and Search: Efficient Embedding Table Sharding with Pre-trained Neural Cost Models.
- Qinbin Li, Zhaomin Wu, Yanzheng Cai, Yuxuan Han, Ching Man Yung, Tianyuan Fu, Bingsheng He:
FedTree: A Federated Learning System For Trees.
- Daniel Snider, Fanny Chevalier, Gennady Pekhimenko:
Hotline Profiler: Automatic Annotation and A Multi-Scale Timeline for Visualizing Time-Use in DNN Training.
- Kevin Kuo, Pratiksha Thaker, Mikhail Khodak, John Nguyen, Daniel Jiang, Ameet Talwalkar, Virginia Smith:
On Noisy Evaluation in Federated Hyperparameter Tuning.
- Tianhang Zheng, Hao Lan, Baochun Li:
Be Careful with PyPI Packages: You May Unconsciously Spread Backdoor Model Weights.
- Yi Hu, Chaoran Zhang, Edward Andert, Harshul Singh, Aviral Shrivastava, James Laudon, Yanqi Zhou, Bob Iannucci, Carlee Joe-Wong:
GiPH: Generalizable Placement Learning for Adaptive Heterogeneous Computing.
- Yijin Li, Jiacheng Zhao, Qianqi Sun, Haohui Mai, Lei Chen, Wanlu Cao, Yanfan Chen, Zhicheng Li, Ying Liu, Xinyuan Zhang, Xiyu Shi, Jie Zhao, Jingling Xue, Huimin Cui, Xiaobing Feng:
SIRIUS: Harvesting Whole-Program Optimization Opportunities for DNNs.
- Borui Wan, Juntao Zhao, Chuan Wu:
Adaptive Message Quantization and Parallelization for Distributed Full-graph GNN Training.
- Shen Li, Pritam Damania, Luca Wehrstedt, Rohan Varma, Omkar Salpekar, Pavel Belevich, Howard Huang, Yanli Zhao, Lucas Hosseini, Wanchao Liang, Hongyi Jia, Shihao Xu, Satendra Gera, Alisson G. Azzolini, Guoqiang Jerry Chen, Zachary DeVito, Chaoyang He, Amir Ziashahabi, Alban Desmaison, Edward Z. Yang, Gregory Chanan, Brian Vaughan, Manoj Krishnan, Joseph S. Spisak, Salman Avestimehr, Soumith Chintala:
PyTorch RPC: Distributed Deep Learning Built on Tensor-Optimized Remote Procedure Calls.
- Hugo Barbalho, Patricia Kovaleski, Beibin Li, Luke Marshall, Marco Molinaro, Abhisek Pan, Eli Cortez, Matheus Leao, Harsh Patwari, Zuzu Tang, Larissa Rozales Gonçalves, David Dion, Thomas Moscibroda, Ishai Menache:
Virtual Machine Allocation with Lifetime Predictions.
- Colby R. Banbury, Vijay Janapa Reddi, Alexander Elium, Shawn Hymel, David Tischler, Daniel Situnayake, Carl Ward, Louis Moreau, Jenny Plunkett, Matthew Kelcey, Mathijs Baaijens, Alessandro Grande, Dmitry Maslov, Arthur Beavis, Jan Jongboom, Jessica Quaye:
Edge Impulse: An MLOps Platform for Tiny Machine Learning.
- Changho Hwang, Wei Cui, Yifan Xiong, Ziyue Yang, Ze Liu, Han Hu, Zilong Wang, Rafael Salas, Jithin Jose, Prabhat Ram, HoYuen Chau, Peng Cheng, Fan Yang, Mao Yang, Yongqiang Xiong:
Tutel: Adaptive Mixture-of-Experts at Scale.
- Trevor Gale, Deepak Narayanan, Cliff Young, Matei Zaharia:
MegaBlocks: Efficient Sparse Training with Mixture-of-Experts.
- Ioannis Lamprou, Zhen Zhang, Javier de Juan, Hang Yang, Yongqiang Lai, Etienne Filhol, Cédric Bastoul:
Safe Optimized Static Memory Allocation for Parallel Deep Learning.
- Guoliang He, Sean Parker, Eiko Yoneki:
X-RLflow: Graph Reinforcement Learning for Neural Network Subgraphs Transformation.
- Vijay Anand Korthikanti, Jared Casper, Sangkug Lym, Lawrence McAfee, Michael Andersch, Mohammad Shoeybi, Bryan Catanzaro:
Reducing Activation Recomputation in Large Transformer Models.
- Yan Wang, Yuhang Li, Ruihao Gong, Aishan Liu, Yanfei Wang, Jian Hu, Yongqiang Yao, Yunchen Zhang, Tianzi Xiao, Fengwei Yu, Xianglong Liu:
SysNoise: Exploring and Benchmarking Training-Deployment System Inconsistency.
- Zhuang Wang, Xinyu Crystal Wu, Zhaozhuo Xu, T. S. Eugene Ng:
Cupcake: A Compression Scheduler for Scalable Communication-Efficient Distributed Training.
- Zhongming Yu, Guohao Dai, Shang Yang, Genghan Zhang, Hengrui Zhang, Feiwen Zhu, June Yang, Jishen Zhao, Yu Wang:
HyperGef: A Framework Enabling Efficient Fusion for Hypergraph Neural Network on GPUs.
- Yifan Zhao, Hashim Sharif, Peter Pao-Huang, Vatsin Shah, Arun Narenthiran Sivakumar, Mateus Valverde Gasparino, Abdulrahman Mahmoud, Nathan Zhao, Sarita V. Adve, Girish Chowdhary, Sasa Misailovic, Vikram S. Adve:
ApproxCaliper: A Programmable Framework for Application-aware Neural Network Optimization.
- Horace He, Shangdi Yu:
Transcending Runtime-Memory Tradeoffs in Checkpointing by being Fusion Aware.
- Ke Hong, Zhongming Yu, Guohao Dai, Xinhao Yang, Yaoxiu Lian, Zehao Liu, Ningyi Xu, Yuhan Dong, Yu Wang:
Exploiting Hardware Utilization and Adaptive Dataflow for Efficient Sparse Convolution in 3D Point Clouds.
- Le Chen, Quazi Ishtiaque Mahmud, Hung Phan, Nesreen K. Ahmed, Ali Jannesari:
Learning to Parallelize with OpenMP by Augmented Heterogeneous AST Representation.
- Michael Kuchnik, Virginia Smith, George Amvrosiadis:
Validating Large Language Models with ReLM.
- Tim Kaler, Alexandros-Stavros Iliopoulos, Philip Murzynowski, Tao B. Schardl, Charles E. Leiserson, Jie Chen:
Communication-Efficient Graph Neural Networks with Probabilistic Neighborhood Expansion Analysis and Caching.
- Yaosheng Fu, Evgeny Bolotin, Aamer Jaleel, Gal Dalal, Shie Mannor, Jacob Subag, Noam Korem, Michael Behar, David W. Nellans:
AutoScratch: ML-Optimized Cache Management for Inference-Oriented GPUs.
- Bin Lin, Ningxin Zheng, Lei Wang, Shijie Cao, Lingxiao Ma, Quanlu Zhang, Yi Zhu, Ting Cao, Jilong Xue, Yuqing Yang, Fan Yang:
Efficient GPU Kernels for N:M-Sparse Weights in Deep Learning.
- Yonghao Zhuang, Lianmin Zheng, Zhuohan Li, Eric P. Xing, Qirong Ho, Joseph Gonzalez, Ion Stoica, Hao Zhang, Hexu Zhao:
On Optimizing the Communication of Model Parallelism.
- Sanket Purandare, Abdul Wasay, Animesh Jain, Stratos Idreos:
μ-TWO: 3× Faster Multi-Model Training with Orchestration and Memory Optimization.
- Payman Behnam, Alexey Tumanov, Tushar Krishna, Pranav Gadikar, Yangyu Chen, Jianming Tong, Yue Pan, Abhimanyu Rajeshkumar Bambhaniya, Alind Khare:
Subgraph Stationary Hardware-Software Inference Co-Design.
- Hongyi Wang, Saurabh Agarwal, Pongsakorn U.-Chupala, Yoshiki Tanaka, Eric P. Xing, Dimitris Papailiopoulos:
Cuttlefish: Low-Rank Model Training without All the Tuning.
- Reiner Pope, Sholto Douglas, Aakanksha Chowdhery, Jacob Devlin, James Bradbury, Jonathan Heek, Kefan Xiao, Shivani Agrawal, Jeff Dean:
Efficiently Scaling Transformer Inference.
- Vitaliy Chiley, Vithursan Thangarasa, Abhay Gupta, Anshul Samar, Joel Hestness, Dennis DeCoste:
RevBiFPN: The Fully Reversible Bidirectional Feature Pyramid Network.
- Vidit Jain, Jatin Prakash, Deepak Saini, Jian Jiao, Ramachandran Ramjee, Manik Varma:
Renee: End-To-End Training of Extreme Classification Models.
- Jaeyeon Won, Changwan Hong, Charith Mendis, Joel S. Emer, Saman P. Amarasinghe:
Unified Convolution Framework: A compiler-based approach to support sparse convolutions.
- Guyue Huang, Yang Bai, Liu Liu, Yuke Wang, Bei Yu, Yufei Ding, Yuan Xie:
ALCOP: Automatic Load-Compute Pipelining in Deep Learning Compiler for AI-GPUs.
- Shiqi He, Qifan Yan, Feijie Wu, Lanjun Wang, Mathias Lécuyer, Ivan Beschastnikh:
GlueFL: Reconciling Client Sampling and Model Masking for Bandwidth Efficient Federated Learning.
- Kazuki Osawa, Shigang Li, Torsten Hoefler:
PipeFisher: Efficient Training of Large Language Models Using Pipelining and Fisher Information Matrices.
- Cheng Tan, Changliu Liu, Zhihao Jia, Tianhao Wei:
Building Verified Neural Networks for Computer Systems with Ouroboros.
- Saurav Muralidharan:
Uniform Sparsity in Deep Neural Networks.
- Mark Zhao, Dhruv Choudhary, Devashish Tyagi, Ajay Somani, Max Kaplan, Sung-Han Lin, Sarunya Pumma, Jongsoo Park, Aarti Basant, Niket Agarwal, Carole-Jean Wu, Christos Kozyrakis:
RecD: Deduplication for End-to-End Deep Learning Recommendation Model Training Infrastructure.
- Younghoon Byun, Seungsik Moon, Baeseong Park, Se Jung Kwon, Dongsoo Lee, Gunho Park, Eunji Yoo, Jung Gyu Min, Youngjoo Lee:
Sparsity-Aware Memory Interface Architecture using Stacked XORNet Compression for Accelerating Pruned-DNN Models.