


Michael W. Mahoney
Person information

- affiliation: University of California, Berkeley, Department of Statistics
- affiliation: Stanford University, Department of Mathematics
2020 – today
- 2023
- [j59]Sen Na, Michal Derezinski, Michael W. Mahoney:
Hessian averaging in stochastic Newton methods achieves superlinear convergence. Math. Program. 201(1): 473-520 (2023) - [j58]Kimon Fountoulakis, Meng Liu, David F. Gleich, Michael W. Mahoney:
Flow-Based Algorithms for Improving Clusters: A Unifying Framework, Software, and Performance. SIAM Rev. 65(1): 59-143 (2023) - [c136]Francesco Quinzan, Rajiv Khanna, Moshik Hershcovitch, Sarel Cohen, Daniel G. Waddington, Tobias Friedrich, Michael W. Mahoney:
Fast Feature Selection with Fairness Constraints. AISTATS 2023: 7800-7823 - [c135]Geoffrey Négiar, Michael W. Mahoney, Aditi S. Krishnapriyan:
Learning differentiable solvers for systems with hard constraints. ICLR 2023 - [c134]T. Konstantin Rusch, Benjamin Paul Chamberlain, Michael W. Mahoney, Michael M. Bronstein, Siddhartha Mishra:
Gradient Gating for Deep Multi-Rate Learning on Graphs. ICLR 2023 - [c133]Derek Hansen, Danielle C. Maddix, Shima Alizadeh, Gaurav Gupta, Michael W. Mahoney:
Learning Physical Models that Can Respect Conservation Laws. ICML 2023: 12469-12510 - [c132]Liam Hodgkinson, Christopher van der Heide, Fred Roosta, Michael W. Mahoney:
Monotonicity and Double Descent in Uncertainty Estimation with Gaussian Processes. ICML 2023: 13085-13117 - [c131]Ilgee Hong, Sen Na, Michael W. Mahoney, Mladen Kolar:
Constrained Optimization via Exact Augmented Lagrangian and Randomized Iterative Sketching. ICML 2023: 13174-13198 - [c130]Yefan Zhou, Yaoqing Yang, Arin Chang, Michael W. Mahoney:
A Three-regime Model of Network Pruning. ICML 2023: 42790-42809 - [c129]Yaoqing Yang, Ryan Theisen, Liam Hodgkinson, Joseph E. Gonzalez, Kannan Ramchandran, Charles H. Martin, Michael W. Mahoney:
Test Accuracy vs. Generalization Gap: Model Selection in NLP without Accessing Training or Testing Data. KDD 2023: 3011-3021 - [i186]Sehoon Kim, Karttikeya Mangalam, Jitendra Malik, Michael W. Mahoney, Amir Gholami, Kurt Keutzer:
Big Little Transformer Decoder. CoRR abs/2302.07863 (2023) - [i185]Derek Hansen, Danielle C. Maddix, Shima Alizadeh, Gaurav Gupta, Michael W. Mahoney:
Learning Physical Models that Can Respect Conservation Laws. CoRR abs/2302.11002 (2023) - [i184]Riley Murray, James Demmel, Michael W. Mahoney, N. Benjamin Erichson, Maksim Melnichenko, Osman Asif Malik, Laura Grigori, Piotr Luszczek, Michal Derezinski, Miles E. Lopes, Tianyu Liang, Hengrui Luo, Jack J. Dongarra:
Randomized Numerical Linear Algebra : A Perspective on the Field With an Eye to Software. CoRR abs/2302.11474 (2023) - [i183]Sehoon Kim, Coleman Hooper, Thanakul Wattanawong, Minwoo Kang, Ruohan Yan, Hasan Genc, Grace Dinh, Qijing Huang, Kurt Keutzer, Michael W. Mahoney, Yakun Sophia Shao, Amir Gholami:
Full Stack Optimization of Transformer Inference: a Survey. CoRR abs/2302.14017 (2023) - [i182]Javier Campos, Zhen Dong, Javier M. Duarte, Amir Gholami, Michael W. Mahoney, Jovan Mitrevski, Nhan Tran:
End-to-end codesign of Hessian-aware quantized neural networks for FPGAs and ASICs. CoRR abs/2304.06745 (2023) - [i181]Ryan Theisen, Hyunsuk Kim, Yaoqing Yang, Liam Hodgkinson, Michael W. Mahoney:
When are ensembles really effective? CoRR abs/2305.12313 (2023) - [i180]Ilgee Hong, Sen Na, Michael W. Mahoney, Mladen Kolar:
Constrained Optimization via Exact Augmented Lagrangian and Randomized Iterative Sketching. CoRR abs/2305.18379 (2023) - [i179]Yefan Zhou, Yaoqing Yang, Arin Chang, Michael W. Mahoney:
A Three-regime Model of Network Pruning. CoRR abs/2305.18383 (2023) - [i178]Shashank Subramanian, Peter Harrington, Kurt Keutzer, Wahid Bhimji, Dmitriy Morozov, Michael W. Mahoney, Amir Gholami:
Towards Foundation Models for Scientific Machine Learning: Characterizing Scaling and Transfer Behavior. CoRR abs/2306.00258 (2023) - [i177]Sehoon Kim, Coleman Hooper, Amir Gholami, Zhen Dong, Xiuyu Li, Sheng Shen, Michael W. Mahoney, Kurt Keutzer:
SqueezeLLM: Dense-and-Sparse Quantization. CoRR abs/2306.07629 (2023) - [i176]Feynman T. Liang, Liam Hodgkinson, Michael W. Mahoney:
A Heavy-Tailed Algebra for Probabilistic Programming. CoRR abs/2306.09262 (2023) - [i175]Pu Ren, N. Benjamin Erichson, Shashank Subramanian, Omer San, Zarija Lukic, Michael W. Mahoney:
SuperBench: A Super-Resolution Benchmark Dataset for Scientific Machine Learning. CoRR abs/2306.14070 (2023) - [i174]Sitan Yang, Malcolm Wolff, Shankar Ramasubramanian, Vincent Quenneville-Bélair, Ronak Metha, Michael W. Mahoney:
GEANN: Scalable Graph Augmentations for Multi-Horizon Time Series Forecasting. CoRR abs/2307.03595 (2023) - [i173]Liam Hodgkinson, Christopher van der Heide, Robert Salomone, Fred Roosta, Michael W. Mahoney:
The Interpolating Information Criterion for Overparameterized Models. CoRR abs/2307.07785 (2023) - [i172]Geoffrey Négiar, Ruijun Ma, O. Nangba Meetei, Mengfei Cao, Michael W. Mahoney:
Probabilistic Forecasting with Coherent Aggregation. CoRR abs/2307.09797 (2023) - [i171]Younghyun Cho, James Weldon Demmel, Michal Derezinski, Haoyun Li, Hengrui Luo, Michael W. Mahoney, Riley J. Murray:
Surrogate-based Autotuning for Randomized Sketching Algorithms in Regression Problems. CoRR abs/2308.15720 (2023)
- 2022
- [j57]Fred Roosta, Yang Liu, Peng Xu, Michael W. Mahoney:
Newton-MR: Inexact Newton Method with minimum residual sub-problem solver. EURO J. Comput. Optim. 10: 100035 (2022) - [j56]Ali Eshragh, Fred Roosta, Asef Nazari, Michael W. Mahoney:
LSAR: Efficient Leverage Score Sampling Algorithm for the Analysis of Big Time Series Data. J. Mach. Learn. Res. 23: 22:1-22:36 (2022) - [j55]Ping Ma, Yongkai Chen, Xinlian Zhang, Xin Xing, Jingyi Ma, Michael W. Mahoney:
Asymptotic Analysis of Sampling Estimators for Randomized Numerical Linear Algebra Algorithms. J. Mach. Learn. Res. 23: 177:1-177:45 (2022) - [c128]Sehoon Kim, Amir Gholami, Zhewei Yao, Nicholas Lee, Patrick Wang, Aniruddha Nrusimha, Bohan Zhai, Tianren Gao, Michael W. Mahoney, Kurt Keutzer:
Integer-Only Zero-Shot Quantization for Efficient Speech Recognition. ICASSP 2022: 4288-4292 - [c127]Majid Jahani, Sergey Rusakov, Zheng Shi, Peter Richtárik, Michael W. Mahoney, Martin Takác:
Doubly Adaptive Scaled Algorithm for Machine Learning Using Second-Order Information. ICLR 2022 - [c126]Soon Hoe Lim, N. Benjamin Erichson, Francisco Utrera, Winnie Xu, Michael W. Mahoney:
Noisy Feature Mixup. ICLR 2022 - [c125]T. Konstantin Rusch, Siddhartha Mishra, N. Benjamin Erichson, Michael W. Mahoney:
Long Expressive Memory for Sequence Modeling. ICLR 2022 - [c124]Liam Hodgkinson, Umut Simsekli, Rajiv Khanna, Michael W. Mahoney:
Generalization Bounds using Lower Tail Exponents in Stochastic Optimizers. ICML 2022: 8774-8795 - [c123]Feynman T. Liang, Michael W. Mahoney, Liam Hodgkinson:
Fat-Tailed Variational Inference with Anisotropic Tail Adaptive Flows. ICML 2022: 13257-13270 - [c122]Xiaoxuan Liu, Lianmin Zheng, Dequan Wang, Yukuo Cen, Weize Chen, Xu Han, Jianfei Chen, Zhiyuan Liu, Jie Tang, Joey Gonzalez, Michael W. Mahoney, Alvin Cheung:
GACT: Activation Compressed Training for Generic Network Architectures. ICML 2022: 14139-14152 - [c121]Da Long, Zheng Wang, Aditi S. Krishnapriyan, Robert M. Kirby, Shandian Zhe, Michael W. Mahoney:
AutoIP: A United Framework to Integrate Physics into Gaussian Processes. ICML 2022: 14210-14222 - [c120]Zhengming Zhang, Ashwinee Panda, Linyue Song, Yaoqing Yang, Michael W. Mahoney, Prateek Mittal, Kannan Ramchandran, Joseph Gonzalez:
Neurotoxin: Durable Backdoors in Federated Learning. ICML 2022: 26429-26446 - [c119]Sehoon Kim, Amir Gholami, Albert E. Shaw, Nicholas Lee, Karttikeya Mangalam, Jitendra Malik, Michael W. Mahoney, Kurt Keutzer:
Squeezeformer: An Efficient Transformer for Automatic Speech Recognition. NeurIPS 2022 - [c118]Woosuk Kwon, Sehoon Kim, Michael W. Mahoney, Joseph Hassoun, Kurt Keutzer, Amir Gholami:
A Fast Post-Training Pruning Framework for Transformers. NeurIPS 2022 - [c117]Shixing Yu, Zhewei Yao, Amir Gholami, Zhen Dong, Sehoon Kim, Michael W. Mahoney, Kurt Keutzer:
Hessian-Aware Pruning and Optimal Neural Implant. WACV 2022: 3665-3676 - [i170]N. Benjamin Erichson, Soon Hoe Lim, Francisco Utrera, Winnie Xu, Ziang Cao, Michael W. Mahoney:
NoisyMix: Boosting Robustness by Combining Data Augmentations, Stability Training, and Noise Injections. CoRR abs/2202.01263 (2022) - [i169]Yaoqing Yang, Ryan Theisen, Liam Hodgkinson, Joseph E. Gonzalez, Kannan Ramchandran, Charles H. Martin, Michael W. Mahoney:
Evaluating natural language processing models with generalization metrics that do not need access to any training or testing data. CoRR abs/2202.02842 (2022) - [i168]Aditi S. Krishnapriyan, Alejandro F. Queiruga, N. Benjamin Erichson, Michael W. Mahoney:
Learning continuous models for continuous physics. CoRR abs/2202.08494 (2022) - [i167]Da Long, Zheng Wang, Aditi S. Krishnapriyan, Robert M. Kirby, Shandian Zhe, Michael W. Mahoney:
AutoIP: A United Framework to Integrate Physics into Gaussian Processes. CoRR abs/2202.12316 (2022) - [i166]Francesco Quinzan, Rajiv Khanna, Moshik Hershcovitch, Sarel Cohen, Daniel G. Waddington, Tobias Friedrich, Michael W. Mahoney:
Fast Feature Selection with Fairness Constraints. CoRR abs/2202.13718 (2022) - [i165]Sen Na, Michal Derezinski, Michael W. Mahoney:
Hessian Averaging in Stochastic Newton Methods Achieves Superlinear Convergence. CoRR abs/2204.09266 (2022) - [i164]Woosuk Kwon, Sehoon Kim, Michael W. Mahoney, Joseph Hassoun, Kurt Keutzer, Amir Gholami:
A Fast Post-Training Pruning Framework for Transformers. CoRR abs/2204.09656 (2022) - [i163]Sarah E. Chasins, Alvin Cheung, Natacha Crooks, Ali Ghodsi, Ken Goldberg, Joseph E. Gonzalez, Joseph M. Hellerstein, Michael I. Jordan, Anthony D. Joseph, Michael W. Mahoney, Aditya G. Parameswaran, David A. Patterson, Raluca Ada Popa, Koushik Sen, Scott Shenker, Dawn Song, Ion Stoica:
The Sky Above The Clouds. CoRR abs/2205.07147 (2022) - [i162]Feynman T. Liang, Liam Hodgkinson, Michael W. Mahoney:
Fat-Tailed Variational Inference with Anisotropic Tail Adaptive Flows. CoRR abs/2205.07918 (2022) - [i161]Sen Na, Michael W. Mahoney:
Asymptotic Convergence Rate and Statistical Inference for Stochastic Sequential Quadratic Programming. CoRR abs/2205.13687 (2022) - [i160]Sehoon Kim, Amir Gholami, Albert E. Shaw, Nicholas Lee, Karttikeya Mangalam, Jitendra Malik, Michael W. Mahoney, Kurt Keutzer:
Squeezeformer: An Efficient Transformer for Automatic Speech Recognition. CoRR abs/2206.00888 (2022) - [i159]Zhengming Zhang, Ashwinee Panda, Linyue Song, Yaoqing Yang, Michael W. Mahoney, Joseph E. Gonzalez, Kannan Ramchandran, Prateek Mittal:
Neurotoxin: Durable Backdoors in Federated Learning. CoRR abs/2206.10341 (2022) - [i158]Xiaoxuan Liu, Lianmin Zheng, Dequan Wang, Yukuo Cen, Weize Chen, Xu Han, Jianfei Chen, Zhiyuan Liu, Jie Tang, Joey Gonzalez, Michael W. Mahoney, Alvin Cheung:
GACT: Activation Compressed Training for General Architectures. CoRR abs/2206.11357 (2022) - [i157]Shashank Subramanian, Robert M. Kirby, Michael W. Mahoney, Amir Gholami:
Adaptive Self-supervision Algorithms for Physics-informed Neural Networks. CoRR abs/2207.04084 (2022) - [i156]Geoffrey Négiar, Michael W. Mahoney, Aditi S. Krishnapriyan:
Learning differentiable solvers for systems with hard constraints. CoRR abs/2207.08675 (2022) - [i155]T. Konstantin Rusch, Benjamin Paul Chamberlain, Michael W. Mahoney, Michael M. Bronstein, Siddhartha Mishra:
Gradient Gating for Deep Multi-Rate Learning on Graphs. CoRR abs/2210.00513 (2022) - [i154]Liam Hodgkinson, Christopher van der Heide, Fred Roosta, Michael W. Mahoney:
Monotonicity and Double Descent in Uncertainty Estimation with Gaussian Processes. CoRR abs/2210.07612 (2022) - [i153]N. Benjamin Erichson, Soon Hoe Lim, Michael W. Mahoney:
Gated Recurrent Neural Networks with Weighted Time-Delay Feedback. CoRR abs/2212.00228 (2022)
- 2021
- [j54]Wooseok Ha, Kimon Fountoulakis, Michael W. Mahoney:
Statistical guarantees for local graph clustering. J. Mach. Learn. Res. 22: 148:1-148:54 (2021) - [j53]Charles H. Martin, Michael W. Mahoney:
Implicit Self-Regularization in Deep Neural Networks: Evidence from Random Matrix Theory and Implications for Learning. J. Mach. Learn. Res. 22: 165:1-165:73 (2021) - [j52]Keith D. Levin, Fred Roosta, Minh Tang, Michael W. Mahoney, Carey E. Priebe:
Limit theorems for out-of-sample extensions of the adjacency and Laplacian spectral embeddings. J. Mach. Learn. Res. 22: 194:1-194:59 (2021) - [j51]Swapnil Das, James Demmel, Kimon Fountoulakis, Laura Grigori, Michael W. Mahoney, Shenghao Yang:
Parallel and Communication Avoiding Least Angle Regression. SIAM J. Sci. Comput. 43(2): C154-C176 (2021) - [c116]Zhewei Yao, Amir Gholami, Sheng Shen, Mustafa Mustafa, Kurt Keutzer, Michael W. Mahoney:
ADAHESSIAN: An Adaptive Second Order Optimizer for Machine Learning. AAAI 2021: 10665-10673 - [c115]Ryan Theisen, Jason M. Klusowski, Michael W. Mahoney:
Good Classifiers are Abundant in the Interpolating Regime. AISTATS 2021: 3376-3384 - [c114]Zhengming Zhang, Yaoqing Yang, Zhewei Yao, Yujun Yan, Joseph E. Gonzalez, Kannan Ramchandran, Michael W. Mahoney:
Improving Semi-supervised Federated Learning by Reducing the Gradient Diversity of Models. IEEE BigData 2021: 1214-1225 - [c113]Michal Derezinski, Zhenyu Liao, Edgar Dobriban, Michael W. Mahoney:
Sparse sketches with small inversion bias. COLT 2021: 1467-1510 - [c112]Sheng Shen, Zhewei Yao, Douwe Kiela, Kurt Keutzer, Michael W. Mahoney:
What's Hidden in a One-layer Randomly Weighted Transformer? EMNLP (1) 2021: 2914-2921 - [c111]N. Benjamin Erichson, Omri Azencot, Alejandro F. Queiruga, Liam Hodgkinson, Michael W. Mahoney:
Lipschitz Recurrent Neural Networks. ICLR 2021 - [c110]Zhenyu Liao, Romain Couillet, Michael W. Mahoney:
Sparse Quantized Spectral Clustering. ICLR 2021 - [c109]Francisco Utrera, Evan Kravitz, N. Benjamin Erichson, Rajiv Khanna, Michael W. Mahoney:
Adversarially-Trained Deep Nets Transfer Better: Illustration on Image Classification. ICLR 2021 - [c108]Jianfei Chen, Lianmin Zheng, Zhewei Yao, Dequan Wang, Ion Stoica, Michael W. Mahoney, Joseph Gonzalez:
ActNN: Reducing Training Memory Footprint via 2-Bit Activation Compressed Training. ICML 2021: 1803-1813 - [c107]Liam Hodgkinson, Michael W. Mahoney:
Multiplicative Noise and Heavy Tails in Stochastic Optimization. ICML 2021: 4262-4274 - [c106]Sehoon Kim, Amir Gholami, Zhewei Yao, Michael W. Mahoney, Kurt Keutzer:
I-BERT: Integer-only BERT Quantization. ICML 2021: 5506-5518 - [c105]Zhewei Yao, Zhen Dong, Zhangcheng Zheng, Amir Gholami, Jiali Yu, Eric Tan, Leyuan Wang, Qijing Huang, Yida Wang, Michael W. Mahoney, Kurt Keutzer:
HAWQ-V3: Dyadic Neural Network Quantization. ICML 2021: 11875-11886 - [c104]Michal Derezinski, Rajiv Khanna, Michael W. Mahoney:
Improved Guarantees and a Multiple-descent Curve for Column Subset Selection and the Nystrom Method (Extended Abstract). IJCAI 2021: 4765-4769 - [c103]Vipul Gupta, Dhruv Choudhary, Ping Tak Peter Tang, Xiaohan Wei, Xing Wang, Yuzhen Huang, Arun Kejariwal, Kannan Ramchandran, Michael W. Mahoney:
Training Recommender Systems at Scale: Communication-Efficient Model and Data Parallelism. KDD 2021: 2928-2936 - [c102]Michal Derezinski, Jonathan Lacotte, Mert Pilanci, Michael W. Mahoney:
Newton-LESS: Sparsification without Trade-offs for the Sketched Newton Update. NeurIPS 2021: 2835-2847 - [c101]Soon Hoe Lim, N. Benjamin Erichson, Liam Hodgkinson, Michael W. Mahoney:
Noisy Recurrent Neural Networks. NeurIPS 2021: 5124-5137 - [c100]Yaoqing Yang, Liam Hodgkinson, Ryan Theisen, Joe Zou, Joseph E. Gonzalez, Kannan Ramchandran, Michael W. Mahoney:
Taxonomizing local versus global structure in neural network loss landscapes. NeurIPS 2021: 18722-18733 - [c99]Zhenyu Liao, Michael W. Mahoney:
Hessian Eigenspectra of More Realistic Nonlinear Models. NeurIPS 2021: 20104-20117 - [c98]Alejandro F. Queiruga, N. Benjamin Erichson, Liam Hodgkinson, Michael W. Mahoney:
Stateful ODE-Nets using Basis Function Expansions. NeurIPS 2021: 21770-21781 - [c97]Aditi S. Krishnapriyan, Amir Gholami, Shandian Zhe, Robert M. Kirby, Michael W. Mahoney:
Characterizing possible failure modes in physics-informed neural networks. NeurIPS 2021: 26548-26560 - [c96]N. Benjamin Erichson, Dane Taylor, Qixuan Wu, Michael W. Mahoney:
Noise-Response Analysis of Deep Neural Networks Quantifies Robustness and Fingerprints Structural Malware. SDM 2021: 100-108 - [c95]Vipul Gupta, Avishek Ghosh, Michal Derezinski, Rajiv Khanna, Kannan Ramchandran, Michael W. Mahoney:
LocalNewton: Reducing communication rounds for distributed learning. UAI 2021: 632-642 - [c94]Liam Hodgkinson, Christopher van der Heide, Fred Roosta, Michael W. Mahoney:
Stochastic continuous normalizing flows: training SDEs as ODEs. UAI 2021: 1130-1140 - [c93]Rajiv Khanna, Liam Hodgkinson, Michael W. Mahoney:
Geometric rates of convergence for kernel-based sampling algorithms. UAI 2021: 2156-2164 - [e2]Jonghyun Lee, Eric F. Darve, Peter K. Kitanidis, Michael W. Mahoney, Anuj Karpatne, Matthew W. Farthing, Tyler J. Hesser:
Proceedings of the AAAI 2021 Spring Symposium on Combining Artificial Intelligence and Machine Learning with Physical Sciences, Stanford, CA, USA, March 22nd - to - 24th, 2021. CEUR Workshop Proceedings 2964, CEUR-WS.org 2021 [contents] - [i152]Sehoon Kim, Amir Gholami, Zhewei Yao, Michael W. Mahoney, Kurt Keutzer:
I-BERT: Integer-only BERT Quantization. CoRR abs/2101.01321 (2021) - [i151]Shixing Yu, Zhewei Yao, Amir Gholami, Zhen Dong, Michael W. Mahoney, Kurt Keutzer:
Hessian-Aware Pruning and Optimal Neural Implant. CoRR abs/2101.08940 (2021) - [i150]Soon Hoe Lim, N. Benjamin Erichson, Liam Hodgkinson, Michael W. Mahoney:
Noisy Recurrent Neural Networks. CoRR abs/2102.04877 (2021) - [i149]Omri Azencot, N. Benjamin Erichson, Mirela Ben-Chen, Michael W. Mahoney:
A Differential Geometry Perspective on Orthogonal Recurrent Models. CoRR abs/2102.09589 (2021) - [i148]Zhenyu Liao, Michael W. Mahoney:
Hessian Eigenspectra of More Realistic Nonlinear Models. CoRR abs/2103.01519 (2021) - [i147]Amir Gholami, Sehoon Kim, Zhen Dong, Zhewei Yao, Michael W. Mahoney, Kurt Keutzer:
A Survey of Quantization Methods for Efficient Neural Network Inference. CoRR abs/2103.13630 (2021) - [i146]Sehoon Kim, Amir Gholami, Zhewei Yao, Aniruddha Nrusimha, Bohan Zhai, Tianren Gao, Michael W. Mahoney, Kurt Keutzer:
Q-ASR: Integer-only Zero-shot Quantization for Efficient Speech Recognition. CoRR abs/2103.16827 (2021) - [i145]Jianfei Chen, Lianmin Zheng, Zhewei Yao, Dequan Wang, Ion Stoica, Michael W. Mahoney, Joseph E. Gonzalez:
ActNN: Reducing Training Memory Footprint via 2-Bit Activation Compressed Training. CoRR abs/2104.14129 (2021) - [i144]Vipul Gupta, Avishek Ghosh, Michal Derezinski, Rajiv Khanna, Kannan Ramchandran, Michael W. Mahoney:
LocalNewton: Reducing Communication Bottleneck for Distributed Learning. CoRR abs/2105.07320 (2021) - [i143]Zhewei Yao, Linjian Ma, Sheng Shen, Kurt Keutzer, Michael W. Mahoney:
MLPruning: A Multilevel Structured Pruning Framework for Transformer-based Models. CoRR abs/2105.14636 (2021) - [i142]Charles H. Martin, Michael W. Mahoney:
Post-mortem on a deep learning contest: a Simpson's paradox and the complementary roles of scale metrics versus shape metrics. CoRR abs/2106.00734 (2021) - [i141]Alejandro F. Queiruga, N. Benjamin Erichson, Liam Hodgkinson, Michael W. Mahoney:
Compressing Deep ODE-Nets using Basis Function Expansions. CoRR abs/2106.10820 (2021) - [i140]Michal Derezinski, Jonathan Lacotte, Mert Pilanci, Michael W. Mahoney:
Newton-LESS: Sparsification without Trade-offs for the Sketched Newton Update. CoRR abs/2107.07480 (2021) - [i139]Yaoqing Yang, Liam Hodgkinson, Ryan Theisen, Joe Zou, Joseph E. Gonzalez, Kannan Ramchandran, Michael W. Mahoney:
Taxonomizing local versus global structure in neural network loss landscapes. CoRR abs/2107.11228 (2021) - [i138]Liam Hodgkinson, Umut Simsekli, Rajiv Khanna, Michael W. Mahoney:
Generalization Properties of Stochastic Optimizers via Trajectory Analysis. CoRR abs/2108.00781 (2021) - [i137]Aditi S. Krishnapriyan, Amir Gholami, Shandian Zhe, Robert M. Kirby, Michael W. Mahoney:
Characterizing possible failure modes in physics-informed neural networks. CoRR abs/2109.01050 (2021) - [i136]Sheng Shen, Zhewei Yao, Douwe Kiela, Kurt Keutzer, Michael W. Mahoney:
What's Hidden in a One-layer Randomly Weighted Transformer? CoRR abs/2109.03939 (2021) - [i135]Majid Jahani, Sergey Rusakov, Zheng Shi, Peter Richtárik, Michael W. Mahoney, Martin Takác:
Doubly Adaptive Scaled Algorithm for Machine Learning Using Second-Order Information. CoRR abs/2109.05198 (2021) - [i134]Soon Hoe Lim, N. Benjamin Erichson, Francisco Utrera, Winnie Xu, Michael W. Mahoney:
Noisy Feature Mixup. CoRR abs/2110.02180 (2021) - [i133]T. Konstantin Rusch, Siddhartha Mishra, N. Benjamin Erichson, Michael W. Mahoney:
Long Expressive Memory for Sequence Modeling. CoRR abs/2110.04744 (2021) - [i132]Luca Pion-Tonachini, Kristofer E. Bouchard, Héctor García Martín, Sean Peisert, W. Bradley Holtz, Anil Aswani, Dipankar Dwivedi, Haruko M. Wainwright, Ghanshyam Pilania, Benjamin Nachman, Babetta L. Marrone, Nicola Falco, Prabhat, Daniel B. Arnold, Alejandro Wolf-Yadlin, Sarah Powers, Sharlee Climer, Quinn Jackson, Ty Carlson, Michael Sohn, Petrus H. Zwart, Neeraj Kumar, Amy Justice, Claire J. Tomlin, Daniel A. Jacobson, Gos Micklem, Georgios V. Gkoutos, Peter J. Bickel, Jean-Baptiste Cazier, Juliane Müller, Bobbie-Jo Webb-Robertson, Rick Stevens, Mark Anderson, Kenneth Kreutz-Delgado, Michael W. Mahoney, James B. Brown:
Learning from learning machines: a new generation of AI technology to meet the needs of science. CoRR abs/2111.13786 (2021)
- 2020
- [j50]Peng Xu, Fred Roosta, Michael W. Mahoney:
Newton-type methods for non-convex optimization under inexact Hessian information. Math. Program. 184(1): 35-70 (2020) - [c92]Linjian Ma, Gabe Montague, Jiayu Ye, Zhewei Yao, Amir Gholami, Kurt Keutzer, Michael W. Mahoney:
Inefficiency of K-FAC for Large Batch Size Training. AAAI 2020: 5053-5060 - [c91]Sheng Shen, Zhen Dong, Jiayu Ye, Linjian Ma, Zhewei Yao, Amir Gholami, Michael W. Mahoney, Kurt Keutzer:
Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT. AAAI 2020: 8815-8821 - [c90]Ping Ma, Xinlian Zhang, Xin Xing, Jingyi Ma, Michael W. Mahoney:
Asymptotic Analysis of Sampling Estimators for Randomized Numerical Linear Algebra Algorithms. AISTATS 2020: 1026-1035 - [c89]Wooseok Ha, Kimon Fountoulakis, Michael W. Mahoney:
Statistical guarantees for local graph clustering. AISTATS 2020: 2687-2697 - [c88]Michal Derezinski, Feynman T. Liang, Michael W. Mahoney:
Bayesian experimental design using regularized determinantal point processes. AISTATS 2020: 3197-3207 - [c87]Vipul Gupta, Swanand Kadhe, Thomas A. Courtade, Michael W. Mahoney, Kannan Ramchandran:
OverSketched Newton: Fast Convex Optimization for Serverless Systems. IEEE BigData 2020: 288-297 - [c86]