Himabindu Lakkaraju (Hima Lakkaraju)
2020 – today
- 2024
- [j3]Satyapriya Krishna, Tessa Han, Alex Gu, Steven Wu, Shahin Jabbari, Himabindu Lakkaraju:
The Disagreement Problem in Explainable Machine Learning: A Practitioner's Perspective. Trans. Mach. Learn. Res. 2024 (2024)
- [c57]Sree Harsha Tanneru, Chirag Agarwal, Himabindu Lakkaraju:
Quantifying Uncertainty in Natural Language Explanations of Large Language Models. AISTATS 2024: 1072-1080
- [c56]Alex Oesterling, Jiaqi Ma, Flávio P. Calmon, Himabindu Lakkaraju:
Fair Machine Unlearning: Data Removal while Mitigating Disparities. AISTATS 2024: 3736-3744
- [c55]Satyapriya Krishna, Chirag Agarwal, Himabindu Lakkaraju:
Understanding the Effects of Iterative Prompting on Truthfulness. ICML 2024
- [c54]Martin Pawelczyk, Seth Neel, Himabindu Lakkaraju:
In-Context Unlearning: Language Models as Few-Shot Unlearners. ICML 2024
- [c53]Himabindu Lakkaraju, Qiaozhu Mei, Chenhao Tan, Jie Tang, Yutong Xie:
The First Workshop on AI Behavioral Science. KDD 2024: 6724-6725
- [c52]Yanchen Liu, Srishti Gautam, Jiaqi Ma, Himabindu Lakkaraju:
Confronting LLMs with Traditional ML: Rethinking the Fairness of Large Language Models in Tabular Classifications. NAACL-HLT 2024: 3603-3620
- [c51]Hanlin Zhang, Yifan Zhang, Yaodong Yu, Dhruv Madeka, Dean P. Foster, Eric P. Xing, Himabindu Lakkaraju, Sham M. Kakade:
A Study on the Calibration of In-context Learning. NAACL-HLT 2024: 6118-6136
- [i75]Chirag Agarwal, Sree Harsha Tanneru, Himabindu Lakkaraju:
Faithfulness vs. Plausibility: On the (Un)Reliability of Explanations from Large Language Models. CoRR abs/2402.04614 (2024)
- [i74]Satyapriya Krishna, Chirag Agarwal, Himabindu Lakkaraju:
Understanding the Effects of Iterative Prompting on Truthfulness. CoRR abs/2402.06625 (2024)
- [i73]Usha Bhalla, Alex Oesterling, Suraj Srinivas, Flávio P. Calmon, Himabindu Lakkaraju:
Interpreting CLIP with Sparse Linear Concept Embeddings (SpLiCE). CoRR abs/2402.10376 (2024)
- [i72]Haiyan Zhao, Fan Yang, Himabindu Lakkaraju, Mengnan Du:
Opening the Black Box of Large Language Models: Two Views on Holistic Interpretability. CoRR abs/2402.10688 (2024)
- [i71]Zhenting Qi, Hanlin Zhang, Eric P. Xing, Sham M. Kakade, Himabindu Lakkaraju:
Follow My Instruction and Spill the Beans: Scalable Data Extraction from Retrieval-Augmented Generation Systems. CoRR abs/2402.17840 (2024)
- [i70]Tessa Han, Aounon Kumar, Chirag Agarwal, Himabindu Lakkaraju:
Towards Safe and Aligned Large Language Models for Medicine. CoRR abs/2403.03744 (2024)
- [i69]Jiaqi Ma, Vivian Lai, Yiming Zhang, Chacha Chen, Paul Hamilton, Davor Ljubenkov, Himabindu Lakkaraju, Chenhao Tan:
OpenHEXAI: An Open-Source Framework for Human-Centered Evaluation of Explainable Machine Learning. CoRR abs/2403.05565 (2024)
- [i68]Elita A. Lobo, Harvineet Singh, Marek Petrik, Cynthia Rudin, Himabindu Lakkaraju:
Data Poisoning Attacks on Off-Policy Policy Evaluation Methods. CoRR abs/2404.04714 (2024)
- [i67]Aounon Kumar, Himabindu Lakkaraju:
Manipulating Large Language Models to Increase Product Visibility. CoRR abs/2404.07981 (2024)
- [i66]Aaron J. Li, Satyapriya Krishna, Himabindu Lakkaraju:
More RLHF, More Trust? On The Impact of Human Preference Alignment On Language Model Trustworthiness. CoRR abs/2404.18870 (2024)
- [i65]Andreas Madsen, Himabindu Lakkaraju, Siva Reddy, Sarath Chandar:
Interpretability Needs a New Paradigm. CoRR abs/2405.05386 (2024)
- [i64]Sree Harsha Tanneru, Dan Ley, Chirag Agarwal, Himabindu Lakkaraju:
On the Hardness of Faithful Chain-of-Thought Reasoning in Large Language Models. CoRR abs/2406.10625 (2024)
- [i63]Alex Oesterling, Usha Bhalla, Suresh Venkatasubramanian, Himabindu Lakkaraju:
Operationalizing the Blueprint for an AI Bill of Rights: Recommendations for Practitioners, Researchers, and Policy Makers. CoRR abs/2407.08689 (2024)
- [i62]Charumathi Badrinath, Usha Bhalla, Alex Oesterling, Suraj Srinivas, Himabindu Lakkaraju:
All Roads Lead to Rome? Exploring Representational Similarities Between Latent Spaces of Generative Image Models. CoRR abs/2407.13449 (2024)
- [i61]Catherine Huang, Martin Pawelczyk, Himabindu Lakkaraju:
Explaining the Model, Protecting Your Data: Revealing and Mitigating the Data Privacy Risks of Post-Hoc Model Explanations via Membership Inference. CoRR abs/2407.17663 (2024)
- [i60]Kaivalya Rawal, Himabindu Lakkaraju:
Learning Recourse Costs from Pairwise Feature Comparisons. CoRR abs/2409.13940 (2024)
- [i59]Zhenting Qi, Hongyin Luo, Xuliang Huang, Zhuokai Zhao, Yibo Jiang, Xiangjun Fan, Himabindu Lakkaraju, James R. Glass:
Quantifying Generalization Complexity for Large Language Models. CoRR abs/2410.01769 (2024)
- [i58]Dan Ley, Suraj Srinivas, Shichang Zhang, Gili Rusak, Himabindu Lakkaraju:
Generalized Group Data Attribution. CoRR abs/2410.09940 (2024)
- 2023
- [j2]Dylan Slack, Satyapriya Krishna, Himabindu Lakkaraju, Sameer Singh:
Explaining machine learning models with interactive natural language conversations using TalkToModel. Nat. Mac. Intell. 5(8): 873-883 (2023)
- [j1]Sean McGrath, Parth Mehta, Alexandra Zytek, Isaac Lage, Himabindu Lakkaraju:
When Does Uncertainty Matter?: Understanding the Impact of Predictive Uncertainty in ML Assisted Decision Making. Trans. Mach. Learn. Res. 2023 (2023)
- [c50]Martin Pawelczyk, Himabindu Lakkaraju, Seth Neel:
On the Privacy Risks of Algorithmic Recourse. AISTATS 2023: 9680-9696
- [c49]Martin Pawelczyk, Teresa Datta, Johannes van den Heuvel, Gjergji Kasneci, Himabindu Lakkaraju:
Probabilistically Robust Recourse: Navigating the Trade-offs between Costs and Robustness in Algorithmic Recourse. ICLR 2023
- [c48]Ruijiang Gao, Himabindu Lakkaraju:
On the Impact of Algorithmic Recourse on Social Segregation. ICML 2023: 10727-10743
- [c47]Satyapriya Krishna, Jiaqi Ma, Himabindu Lakkaraju:
Towards Bridging the Gaps between the Right to Explanation and the Right to be Forgotten. ICML 2023: 17808-17826
- [c46]Krishnaram Kenthapadi, Himabindu Lakkaraju, Nazneen Rajani:
Generative AI meets Responsible AI: Practical Challenges and Opportunities. KDD 2023: 5805-5806
- [c45]Xuhong Li, Mengnan Du, Jiamin Chen, Yekun Chai, Himabindu Lakkaraju, Haoyi Xiong:
M4: A Unified XAI Benchmark for Faithfulness Evaluation of Feature Attribution Methods across Metrics, Modalities and Models. NeurIPS 2023
- [c44]Usha Bhalla, Suraj Srinivas, Himabindu Lakkaraju:
Discriminative Feature Attributions: Bridging Post Hoc Explainability and Inherent Interpretability. NeurIPS 2023
- [c43]Satyapriya Krishna, Jiaqi Ma, Dylan Slack, Asma Ghandeharioun, Sameer Singh, Himabindu Lakkaraju:
Post Hoc Explanations of Language Models Can Improve Language Models. NeurIPS 2023
- [c42]Suraj Srinivas, Sebastian Bordt, Himabindu Lakkaraju:
Which Models have Perceptually-Aligned Gradients? An Explanation via Off-Manifold Robustness. NeurIPS 2023
- [c41]Anna P. Meyer, Dan Ley, Suraj Srinivas, Himabindu Lakkaraju:
On Minimizing the Impact of Dataset Shifts on Actionable Explanations. UAI 2023: 1434-1444
- [c40]Valeria Fionda, Olaf Hartig, Reyhaneh Abdolazimi, Sihem Amer-Yahia, Hongzhi Chen, Xiao Chen, Peng Cui, Jeffrey Dalton, Xin Luna Dong, Lisette Espín-Noboa, Wenqi Fan, Manuela Fritz, Quan Gan, Jingtong Gao, Xiaojie Guo, Torsten Hahmann, Jiawei Han, Soyeon Caren Han, Estevam Hruschka, Liang Hu, Jiaxin Huang, Utkarshani Jaimini, Olivier Jeunen, Yushan Jiang, Fariba Karimi, George Karypis, Krishnaram Kenthapadi, Himabindu Lakkaraju, Hady W. Lauw, Thai Le, Trung-Hoang Le, Dongwon Lee, Geon Lee, Liat Levontin, Cheng-Te Li, Haoyang Li, Ying Li, Jay Chiehen Liao, Qidong Liu, Usha Lokala, Ben London, Siqu Long, Hande Küçük-McGinty, Yu Meng, Seungwhan Moon, Usman Naseem, Pradeep Natarajan, Behrooz Omidvar-Tehrani, Zijie Pan, Devesh Parekh, Jian Pei, Tiago Peixoto, Steven Pemberton, Josiah Poon, Filip Radlinski, Federico Rossetto, Kaushik Roy, Aghiles Salah, Mehrnoosh Sameki, Amit P. Sheth, Cogan Shimizu, Kijung Shin, Dongjin Song, Julia Stoyanovich, Dacheng Tao, Johanne Trippas, Quoc Truong, Yu-Che Tsai, Adaku Uchendu, Bram van den Akker, Lin Wang, Minjie Wang, Shoujin Wang, Xin Wang, Ingmar Weber, Henry Weld, Lingfei Wu, Da Xu, Yifan Ethan Xu, Shuyuan Xu, Bo Yang, Ke Yang, Elad Yom-Tov, Jaemin Yoo, Zhou Yu, Reza Zafarani, Hamed Zamani, Meike Zehlike, Qi Zhang, Xikun Zhang, Yongfeng Zhang, Yu Zhang, Zheng Zhang, Liang Zhao, Xiangyu Zhao, Wenwu Zhu:
Tutorials at The Web Conference 2023. WWW (Companion Volume) 2023: 648-658
- [i57]Satyapriya Krishna, Jiaqi Ma, Himabindu Lakkaraju:
Towards Bridging the Gaps between the Right to Explanation and the Right to be Forgotten. CoRR abs/2302.04288 (2023)
- [i56]Satyapriya Krishna, Jiaqi Ma, Dylan Slack, Asma Ghandeharioun, Sameer Singh, Himabindu Lakkaraju:
Post Hoc Explanations of Language Models Can Improve Language Models. CoRR abs/2305.11426 (2023)
- [i55]Suraj Srinivas, Sebastian Bordt, Hima Lakkaraju:
Which Models have Perceptually-Aligned Gradients? An Explanation via Off-Manifold Robustness. CoRR abs/2305.19101 (2023)
- [i54]Alexander Lin, Lucas Monteiro Paes, Sree Harsha Tanneru, Suraj Srinivas, Himabindu Lakkaraju:
Word-Level Explanations for Analyzing Bias in Text-to-Image Models. CoRR abs/2306.05500 (2023)
- [i53]Dan Ley, Leonard Tang, Matthew Nazari, Hongjin Lin, Suraj Srinivas, Himabindu Lakkaraju:
Consistent Explanations in the Face of Model Indeterminacy via Ensembling. CoRR abs/2306.06193 (2023)
- [i52]Anna P. Meyer, Dan Ley, Suraj Srinivas, Himabindu Lakkaraju:
On Minimizing the Impact of Dataset Shifts on Actionable Explanations. CoRR abs/2306.06716 (2023)
- [i51]Skyler Wu, Eric Meng Shen, Charumathi Badrinath, Jiaqi Ma, Himabindu Lakkaraju:
Analyzing Chain-of-Thought Prompting in Large Language Models via Gradient-based Feature Attributions. CoRR abs/2307.13339 (2023)
- [i50]Tessa Han, Suraj Srinivas, Himabindu Lakkaraju:
Efficient Estimation of the Local Robustness of Machine Learning Models. CoRR abs/2307.13885 (2023)
- [i49]Alex Oesterling, Jiaqi Ma, Flávio P. Calmon, Hima Lakkaraju:
Fair Machine Unlearning: Data Removal while Mitigating Disparities. CoRR abs/2307.14754 (2023)
- [i48]Usha Bhalla, Suraj Srinivas, Himabindu Lakkaraju:
Verifiable Feature Attributions: A Bridge between Post Hoc Explainability and Inherent Interpretability. CoRR abs/2307.15007 (2023)
- [i47]Catherine Huang, Chelse Swoopes, Christina Xiao, Jiaqi Ma, Himabindu Lakkaraju:
Accurate, Explainable, and Private Models: Providing Recourse While Minimizing Training Data Leakage. CoRR abs/2308.04341 (2023)
- [i46]Aounon Kumar, Chirag Agarwal, Suraj Srinivas, Soheil Feizi, Hima Lakkaraju:
Certifying LLM Safety against Adversarial Prompting. CoRR abs/2309.02705 (2023)
- [i45]Satyapriya Krishna, Chirag Agarwal, Himabindu Lakkaraju:
On the Trade-offs between Adversarial Robustness and Actionable Explanations. CoRR abs/2309.16452 (2023)
- [i44]Nicholas Kroeger, Dan Ley, Satyapriya Krishna, Chirag Agarwal, Himabindu Lakkaraju:
Are Large Language Models Post Hoc Explainers? CoRR abs/2310.05797 (2023)
- [i43]Martin Pawelczyk, Seth Neel, Himabindu Lakkaraju:
In-Context Unlearning: Language Models as Few Shot Unlearners. CoRR abs/2310.07579 (2023)
- [i42]Yanchen Liu, Srishti Gautam, Jiaqi Ma, Himabindu Lakkaraju:
Investigating the Fairness of Large Language Models for Predictions on Tabular Data. CoRR abs/2310.14607 (2023)
- [i41]Sree Harsha Tanneru, Chirag Agarwal, Himabindu Lakkaraju:
Quantifying Uncertainty in Natural Language Explanations of Large Language Models. CoRR abs/2311.03533 (2023)
- [i40]Hanlin Zhang, Yifan Zhang, Yaodong Yu, Dhruv Madeka, Dean P. Foster, Eric P. Xing, Himabindu Lakkaraju, Sham M. Kakade:
A Study on the Calibration of In-context Learning. CoRR abs/2312.04021 (2023)
- [i39]Tessa Han, Yasha Ektefaie, Maha Farhat, Marinka Zitnik, Himabindu Lakkaraju:
Is Ignorance Bliss? The Role of Post Hoc Explanation Faithfulness and Alignment in Model Trust in Laypeople and Domain Experts. CoRR abs/2312.05690 (2023)
- 2022
- [c39]Jessica Dai, Sohini Upadhyay, Ulrich Aïvodji, Stephen H. Bach, Himabindu Lakkaraju:
Fairness via Explanation Quality: Evaluating Disparities in the Quality of Post hoc Explanations. AIES 2022: 203-214
- [c38]Harvineet Singh, Shalmali Joshi, Finale Doshi-Velez, Himabindu Lakkaraju:
Towards Robust Off-Policy Evaluation via Human Inputs. AIES 2022: 686-699
- [c37]Martin Pawelczyk, Chirag Agarwal, Shalmali Joshi, Sohini Upadhyay, Himabindu Lakkaraju:
Exploring Counterfactual Explanations Through the Lens of Adversarial Examples: A Theoretical and Empirical Analysis. AISTATS 2022: 4574-4594
- [c36]Chirag Agarwal, Marinka Zitnik, Himabindu Lakkaraju:
Probing GNN Explainers: A Rigorous Theoretical and Empirical Analysis of GNN Explanation Methods. AISTATS 2022: 8969-8996
- [c35]Murtuza N. Shergadwala, Himabindu Lakkaraju, Krishnaram Kenthapadi:
A Human-Centric Perspective on Model Monitoring. HCOMP 2022: 173-183
- [c34]Krishnaram Kenthapadi, Himabindu Lakkaraju, Pradeep Natarajan, Mehrnoosh Sameki:
Model Monitoring in Practice: Lessons Learned and Open Challenges. KDD 2022: 4800-4801
- [c33]Chirag Agarwal, Satyapriya Krishna, Eshika Saxena, Martin Pawelczyk, Nari Johnson, Isha Puri, Marinka Zitnik, Himabindu Lakkaraju:
OpenXAI: Towards a Transparent Evaluation of Model Explanations. NeurIPS 2022
- [c32]Tessa Han, Suraj Srinivas, Himabindu Lakkaraju:
Which Explanation Should I Choose? A Function Approximation Perspective to Characterizing Post Hoc Explanations. NeurIPS 2022
- [c31]Suraj Srinivas, Kyle Matoba, Himabindu Lakkaraju, François Fleuret:
Efficient Training of Low-Curvature Neural Networks. NeurIPS 2022
- [c30]Elita A. Lobo, Harvineet Singh, Marek Petrik, Cynthia Rudin, Himabindu Lakkaraju:
Data poisoning attacks on off-policy policy evaluation methods. UAI 2022: 1264-1274
- [i38]Satyapriya Krishna, Tessa Han, Alex Gu, Javin Pombra, Shahin Jabbari, Steven Wu, Himabindu Lakkaraju:
The Disagreement Problem in Explainable Machine Learning: A Practitioner's Perspective. CoRR abs/2202.01602 (2022)
- [i37]Himabindu Lakkaraju, Dylan Slack, Yuxin Chen, Chenhao Tan, Sameer Singh:
Rethinking Explainability as a Dialogue: A Practitioner's Perspective. CoRR abs/2202.01875 (2022)
- [i36]Martin Pawelczyk, Teresa Datta, Johannes van den Heuvel, Gjergji Kasneci, Himabindu Lakkaraju:
Algorithmic Recourse in the Face of Noisy Human Responses. CoRR abs/2203.06768 (2022)
- [i35]Chirag Agarwal, Nari Johnson, Martin Pawelczyk, Satyapriya Krishna, Eshika Saxena, Marinka Zitnik, Himabindu Lakkaraju:
Rethinking Stability for Attribution-based Explanations. CoRR abs/2203.06877 (2022)
- [i34]Jessica Dai, Sohini Upadhyay, Ulrich Aïvodji, Stephen H. Bach, Himabindu Lakkaraju:
Fairness via Explanation Quality: Evaluating Disparities in the Quality of Post hoc Explanations. CoRR abs/2205.07277 (2022)
- [i33]Tessa Han, Suraj Srinivas, Himabindu Lakkaraju:
Which Explanation Should I Choose? A Function Approximation Perspective to Characterizing Post hoc Explanations. CoRR abs/2206.01254 (2022)
- [i32]Murtuza N. Shergadwala, Himabindu Lakkaraju, Krishnaram Kenthapadi:
A Human-Centric Take on Model Monitoring. CoRR abs/2206.02868 (2022)
- [i31]Suraj Srinivas, Kyle Matoba, Himabindu Lakkaraju, François Fleuret:
Flatten the Curve: Efficiently Training Low-Curvature Neural Networks. CoRR abs/2206.07144 (2022)
- [i30]Chirag Agarwal, Eshika Saxena, Satyapriya Krishna, Martin Pawelczyk, Nari Johnson, Isha Puri, Marinka Zitnik, Himabindu Lakkaraju:
OpenXAI: Towards a Transparent Evaluation of Model Explanations. CoRR abs/2206.11104 (2022)
- [i29]Dylan Slack, Satyapriya Krishna, Himabindu Lakkaraju, Sameer Singh:
TalkToModel: Understanding Machine Learning Models With Open Ended Dialogues. CoRR abs/2207.04154 (2022)
- [i28]Chirag Agarwal, Owen Queen, Himabindu Lakkaraju, Marinka Zitnik:
Evaluating Explainability for Graph Neural Networks. CoRR abs/2208.09339 (2022)
- [i27]Harvineet Singh, Shalmali Joshi, Finale Doshi-Velez, Himabindu Lakkaraju:
Towards Robust Off-Policy Evaluation via Human Inputs. CoRR abs/2209.08682 (2022)
- [i26]Martin Pawelczyk, Himabindu Lakkaraju, Seth Neel:
On the Privacy Risks of Algorithmic Recourse. CoRR abs/2211.05427 (2022)
- 2021
- [c29]Aida Rahmattalabi, Shahin Jabbari, Himabindu Lakkaraju, Phebe Vayanos, Max Izenberg, Ryan Brown, Eric Rice, Milind Tambe:
Fair Influence Maximization: a Welfare Optimization Approach. AAAI 2021: 11630-11638
- [c28]Tom Sühr, Sophie Hilgard, Himabindu Lakkaraju:
Does Fair Ranking Improve Minority Outcomes? Understanding the Interplay of Human and Algorithmic Biases in Online Hiring. AIES 2021: 989-999
- [c27]Himabindu Lakkaraju:
Towards Reliable and Practicable Algorithmic Recourse. CIKM 2021: 4
- [c26]Sushant Agarwal, Shahin Jabbari, Chirag Agarwal, Sohini Upadhyay, Steven Wu, Himabindu Lakkaraju:
Towards the Unification and Robustness of Perturbation and Gradient Based Explanations. ICML 2021: 110-119
- [c25]Dylan Slack, Anna Hilgard, Himabindu Lakkaraju, Sameer Singh:
Counterfactual Explanations Can Be Manipulated. NeurIPS 2021: 62-75
- [c24]Dylan Slack, Anna Hilgard, Sameer Singh, Himabindu Lakkaraju:
Reliable Post hoc Explanations: Modeling Uncertainty in Explainability. NeurIPS 2021: 9391-9404
- [c23]Sohini Upadhyay, Shalmali Joshi, Himabindu Lakkaraju:
Towards Robust and Reliable Algorithmic Recourse. NeurIPS 2021: 16926-16937
- [c22]Alexis Ross, Himabindu Lakkaraju, Osbert Bastani:
Learning Models for Actionable Recourse. NeurIPS 2021: 18734-18746
- [c21]Chirag Agarwal, Himabindu Lakkaraju, Marinka Zitnik:
Towards a unified framework for fair and stable graph representation learning. UAI 2021: 2114-2124
- [i25]Sushant Agarwal, Shahin Jabbari, Chirag Agarwal, Sohini Upadhyay, Zhiwei Steven Wu, Himabindu Lakkaraju:
Towards the Unification and Robustness of Perturbation and Gradient Based Explanations. CoRR abs/2102.10618 (2021)
- [i24]Chirag Agarwal, Himabindu Lakkaraju, Marinka Zitnik:
Towards a Unified Framework for Fair and Stable Graph Representation Learning. CoRR abs/2102.13186 (2021)
- [i23]Sohini Upadhyay, Shalmali Joshi, Himabindu Lakkaraju:
Towards Robust and Reliable Algorithmic Recourse. CoRR abs/2102.13620 (2021)
- [i22]Harvineet Singh, Shalmali Joshi, Finale Doshi-Velez, Himabindu Lakkaraju:
Learning Under Adversarial and Interventional Shifts. CoRR abs/2103.15933 (2021)
- [i21]Dylan Slack, Sophie Hilgard, Himabindu Lakkaraju, Sameer Singh:
Counterfactual Explanations Can Be Manipulated. CoRR abs/2106.02666 (2021)
- [i20]Chirag Agarwal, Marinka Zitnik, Himabindu Lakkaraju:
Towards a Rigorous Theoretical Analysis and Evaluation of GNN Explanations. CoRR abs/2106.09078 (2021)
- [i19]Martin Pawelczyk, Shalmali Joshi, Chirag Agarwal, Sohini Upadhyay, Himabindu Lakkaraju:
On the Connections between Counterfactual Explanations and Adversarial Examples. CoRR abs/2106.09992 (2021)
- [i18]Dylan Slack, Sophie Hilgard, Sameer Singh, Himabindu Lakkaraju:
Feature Attributions and Counterfactual Explanations Can Be Manipulated. CoRR abs/2106.12563 (2021)
- [i17]Jessica Dai, Sohini Upadhyay, Stephen H. Bach, Himabindu Lakkaraju:
What will it take to generate fairness-preserving explanations? CoRR abs/2106.13346 (2021)
- 2020
- [c20]Himabindu Lakkaraju, Osbert Bastani:
"How do I fool you?": Manipulating User Trust via Misleading Black Box Explanations. AIES 2020: 79-85
- [c19]Dylan Slack, Sophie Hilgard, Emily Jia, Sameer Singh, Himabindu Lakkaraju:
Fooling LIME and SHAP: Adversarial Attacks on Post hoc Explanation Methods. AIES 2020: 180-186
- [c18]Himabindu Lakkaraju, Nino Arsov, Osbert Bastani:
Robust and Stable Black Box Explanations. ICML 2020: 5628-5638
- [c17]Kaivalya Rawal, Himabindu Lakkaraju:
Beyond Individualized Recourse: Interpretable and Interactive Summaries of Actionable Recourses. NeurIPS 2020
- [c16]Wanqian Yang, Lars Lorch, Moritz A. Graule, Himabindu Lakkaraju, Finale Doshi-Velez:
Incorporating Interpretable Output Constraints in Bayesian Neural Networks. NeurIPS 2020
- [i16]Aida Rahmattalabi, Shahin Jabbari, Himabindu Lakkaraju, Phebe Vayanos, Eric Rice, Milind Tambe:
Fair Influence Maximization: A Welfare Optimization Approach. CoRR abs/2006.07906 (2020)
- [i15]Dylan Slack, Sophie Hilgard, Sameer Singh, Himabindu Lakkaraju:
How Much Should I Trust You? Modeling Uncertainty of Black Box Explanations. CoRR abs/2008.05030 (2020)
- [i14]Kaivalya Rawal, Himabindu Lakkaraju:
Interpretable and Interactive Summaries of Actionable Recourses. CoRR abs/2009.07165 (2020)
- [i13]Wanqian Yang, Lars Lorch, Moritz A. Graule, Himabindu Lakkaraju, Finale Doshi-Velez:
Incorporating Interpretable Output Constraints in Bayesian Neural Networks. CoRR abs/2010.10969 (2020)
- [i12]Alexis Ross, Himabindu Lakkaraju, Osbert Bastani:
Ensuring Actionable Recourse via Adversarial Training. CoRR abs/2011.06146 (2020)
- [i11]Sean McGrath, Parth Mehta, Alexandra Zytek, Isaac Lage, Himabindu Lakkaraju:
When Does Uncertainty Matter?: Understanding the Impact of Predictive Uncertainty in ML Assisted Decision Making. CoRR abs/2011.06167 (2020)
- [i10]Himabindu Lakkaraju, Nino Arsov, Osbert Bastani:
Robust and Stable Black Box Explanations. CoRR abs/2011.06169 (2020)
- [i9]Tom Sühr, Sophie Hilgard, Himabindu Lakkaraju:
Does Fair Ranking Improve Minority Outcomes? Understanding the Interplay of Human and Algorithmic Biases in Online Hiring. CoRR abs/2012.00423 (2020)
- [i8]Kaivalya Rawal, Ece Kamar, Himabindu Lakkaraju:
Can I Still Trust You?: Understanding the Impact of Distribution Shifts on Algorithmic Recourses. CoRR abs/2012.11788 (2020)
2010 – 2019
- 2019
- [c15]Himabindu Lakkaraju, Ece Kamar, Rich Caruana, Jure Leskovec:
Faithful and Customizable Explanations of Black Box Models. AIES 2019: 131-138
- [i7]Dylan Slack, Sophie Hilgard, Emily Jia, Sameer Singh, Himabindu Lakkaraju:
How can we fool LIME and SHAP? Adversarial Attacks on Post hoc Explanation Methods. CoRR abs/1911.02508 (2019)