Noah A. Smith
Person information
- affiliation: University of Washington, Seattle, WA, USA
- affiliation: Allen Institute for AI, Seattle, WA, USA
- affiliation: Carnegie Mellon University, Pittsburgh, PA, USA
2020 – today
- 2024
- [j34] Alexis Palmer, Noah A. Smith, Arthur Spirling: Using proprietary language models in academic research requires explicit justification. Nat. Comput. Sci. 4(1): 2-3 (2024)
- [j33] Judit Ács, Endre Hamerlik, Roy Schwartz, Noah A. Smith, András Kornai: Morphosyntactic probing of multilingual BERT models. Nat. Lang. Eng. 30(4): 753-792 (2024)
- [c289] Kai Nylund, Suchin Gururangan, Noah A. Smith: Time is Encoded in the Weights of Finetuned Language Models. ACL (1) 2024: 2571-2587
- [c288] Bowen Zhao, Zander Brumbaugh, Yizhong Wang, Hannaneh Hajishirzi, Noah A. Smith: Set the Clock: Temporal Alignment of Pretrained Language Models. ACL (Findings) 2024: 15015-15040
- [c287] Luca Soldaini, Rodney Kinney, Akshita Bhagia, Dustin Schwenk, David Atkinson, Russell Authur, Ben Bogin, Khyathi Raghavi Chandu, Jennifer Dumas, Yanai Elazar, Valentin Hofmann, Ananya Harsh Jha, Sachin Kumar, Li Lucy, Xinxi Lyu, Nathan Lambert, Ian Magnusson, Jacob Morrison, Niklas Muennighoff, Aakanksha Naik, Crystal Nam, Matthew E. Peters, Abhilasha Ravichander, Kyle Richardson, Zejiang Shen, Emma Strubell, Nishant Subramani, Oyvind Tafjord, Pete Walsh, Luke Zettlemoyer, Noah A. Smith, Hannaneh Hajishirzi, Iz Beltagy, Dirk Groeneveld, Jesse Dodge, Kyle Lo: Dolma: an Open Corpus of Three Trillion Tokens for Language Model Pretraining Research. ACL (1) 2024: 15725-15788
- [c286] Dirk Groeneveld, Iz Beltagy, Evan Pete Walsh, Akshita Bhagia, Rodney Kinney, Oyvind Tafjord, Ananya Harsh Jha, Hamish Ivison, Ian Magnusson, Yizhong Wang, Shane Arora, David Atkinson, Russell Authur, Khyathi Raghavi Chandu, Arman Cohan, Jennifer Dumas, Yanai Elazar, Yuling Gu, Jack Hessel, Tushar Khot, William Merrill, Jacob Morrison, Niklas Muennighoff, Aakanksha Naik, Crystal Nam, Matthew E. Peters, Valentina Pyatkin, Abhilasha Ravichander, Dustin Schwenk, Saurabh Shah, Will Smith, Emma Strubell, Nishant Subramani, Mitchell Wortsman, Pradeep Dasigi, Nathan Lambert, Kyle Richardson, Luke Zettlemoyer, Jesse Dodge, Kyle Lo, Luca Soldaini, Noah A. Smith, Hannaneh Hajishirzi: OLMo: Accelerating the Science of Language Models. ACL (1) 2024: 15789-15809
- [c285] Tal August, Kyle Lo, Noah A. Smith, Katharina Reinecke: Know Your Audience: The benefits and pitfalls of generating plain language summaries beyond the "general" audience. CHI 2024: 14:1-14:26
- [c284] Yanai Elazar, Jiayao Zhang, David Wadden, Bo Zhang, Noah A. Smith: Estimating the Causal Effect of Early ArXiving on Paper Acceptance. CLeaR 2024: 913-933
- [c283] Jungo Kasai, Keisuke Sakaguchi, Ronan Le Bras, Dragomir Radev, Yejin Choi, Noah A. Smith: A Call for Clarity in Beam Search: How It Works and When It Stops. LREC/COLING 2024: 77-90
- [c282] Xingyu Fu, Yushi Hu, Bangzheng Li, Yu Feng, Haoyu Wang, Xudong Lin, Dan Roth, Noah A. Smith, Wei-Chiu Ma, Ranjay Krishna: BLINK: Multimodal Large Language Models Can See but Not Perceive. ECCV (23) 2024: 148-166
- [c281] Yanai Elazar, Bhargavi Paranjape, Hao Peng, Sarah Wiegreffe, Khyathi Raghavi Chandu, Vivek Srikumar, Sameer Singh, Noah A. Smith: Measuring and Improving Attentiveness to Partial Inputs with Counterfactuals. EMNLP (Findings) 2024: 3603-3623
- [c280] Orevaoghene Ahia, Anuoluwapo Aremu, Diana Abagyan, Hila Gonen, David Ifeoluwa Adelani, Daud Abolade, Noah A. Smith, Yulia Tsvetkov: Voices Unheard: NLP Resources and Models for Yorùbá Regional Dialects. EMNLP 2024: 4392-4409
- [c279] Terra Blevins, Tomasz Limisiewicz, Suchin Gururangan, Margaret Li, Hila Gonen, Noah A. Smith, Luke Zettlemoyer: Breaking the Curse of Multilinguality with Cross-lingual Expert Language Models. EMNLP 2024: 10822-10837
- [c278] William Merrill, Noah A. Smith, Yanai Elazar: Evaluating n-Gram Novelty of Language Models Using Rusty-DAWG. EMNLP 2024: 14459-14473
- [c277] Jacob Morrison, Noah A. Smith, Hannaneh Hajishirzi, Pang Wei Koh, Jesse Dodge, Pradeep Dasigi: Merge to Learn: Efficiently Adding Skills to Language Models with Model Merging. EMNLP (Findings) 2024: 15604-15621
- [c276] Yanai Elazar, Akshita Bhagia, Ian Magnusson, Abhilasha Ravichander, Dustin Schwenk, Alane Suhr, Evan Pete Walsh, Dirk Groeneveld, Luca Soldaini, Sameer Singh, Hannaneh Hajishirzi, Noah A. Smith, Jesse Dodge: What's In My Big Data? ICLR 2024
- [c275] Sewon Min, Suchin Gururangan, Eric Wallace, Weijia Shi, Hannaneh Hajishirzi, Noah A. Smith, Luke Zettlemoyer: SILO Language Models: Isolating Legal Risk In a Nonparametric Datastore. ICLR 2024
- [c274] Weijia Shi, Sewon Min, Maria Lomeli, Chunting Zhou, Margaret Li, Xi Victoria Lin, Noah A. Smith, Luke Zettlemoyer, Wen-tau Yih, Mike Lewis: In-Context Pretraining: Language Modeling Beyond Document Boundaries. ICLR 2024
- [c273] Muru Zhang, Ofir Press, William Merrill, Alisa Liu, Noah A. Smith: How Language Model Hallucinations Can Snowball. ICML 2024
- [i213] Alisa Liu, Xiaochuang Han, Yizhong Wang, Yulia Tsvetkov, Yejin Choi, Noah A. Smith: Tuning Language Models by Proxy. CoRR abs/2401.08565 (2024)
- [i212] Terra Blevins, Tomasz Limisiewicz, Suchin Gururangan, Margaret Li, Hila Gonen, Noah A. Smith, Luke Zettlemoyer: Breaking the Curse of Multilinguality with Cross-lingual Expert Language Models. CoRR abs/2401.10440 (2024)
- [i211] Luca Soldaini, Rodney Kinney, Akshita Bhagia, Dustin Schwenk, David Atkinson, Russell Authur, Ben Bogin, Khyathi Raghavi Chandu, Jennifer Dumas, Yanai Elazar, Valentin Hofmann, Ananya Harsh Jha, Sachin Kumar, Li Lucy, Xinxi Lyu, Nathan Lambert, Ian Magnusson, Jacob Morrison, Niklas Muennighoff, Aakanksha Naik, Crystal Nam, Matthew E. Peters, Abhilasha Ravichander, Kyle Richardson, Zejiang Shen, Emma Strubell, Nishant Subramani, Oyvind Tafjord, Pete Walsh, Luke Zettlemoyer, Noah A. Smith, Hannaneh Hajishirzi, Iz Beltagy, Dirk Groeneveld, Jesse Dodge, Kyle Lo: Dolma: an Open Corpus of Three Trillion Tokens for Language Model Pretraining Research. CoRR abs/2402.00159 (2024)
- [i210] Dirk Groeneveld, Iz Beltagy, Pete Walsh, Akshita Bhagia, Rodney Kinney, Oyvind Tafjord, Ananya Harsh Jha, Hamish Ivison, Ian Magnusson, Yizhong Wang, Shane Arora, David Atkinson, Russell Authur, Khyathi Raghavi Chandu, Arman Cohan, Jennifer Dumas, Yanai Elazar, Yuling Gu, Jack Hessel, Tushar Khot, William Merrill, Jacob Morrison, Niklas Muennighoff, Aakanksha Naik, Crystal Nam, Matthew E. Peters, Valentina Pyatkin, Abhilasha Ravichander, Dustin Schwenk, Saurabh Shah, Will Smith, Emma Strubell, Nishant Subramani, Mitchell Wortsman, Pradeep Dasigi, Nathan Lambert, Kyle Richardson, Luke Zettlemoyer, Jesse Dodge, Kyle Lo, Luca Soldaini, Noah A. Smith, Hannaneh Hajishirzi: OLMo: Accelerating the Science of Language Models. CoRR abs/2402.00838 (2024)
- [i209] Bowen Zhao, Zander Brumbaugh, Yizhong Wang, Hannaneh Hajishirzi, Noah A. Smith: Set the Clock: Temporal Alignment of Pretrained Language Models. CoRR abs/2402.16797 (2024)
- [i208] Tal August, Kyle Lo, Noah A. Smith, Katharina Reinecke: Know Your Audience: The benefits and pitfalls of generating plain language summaries beyond the "general" audience. CoRR abs/2403.04979 (2024)
- [i207] Rahul Nadkarni, Yizhong Wang, Noah A. Smith: Third-Party Language Model Performance Prediction from Instruction. CoRR abs/2403.12413 (2024)
- [i206] Bo-Ru Lu, Nikita Haduong, Chien-Yu Lin, Hao Cheng, Noah A. Smith, Mari Ostendorf: Encode Once and Decode in Parallel: Efficient Transformer Decoding. CoRR abs/2403.13112 (2024)
- [i205] Nathan Lambert, Valentina Pyatkin, Jacob Morrison, LJ Miranda, Bill Yuchen Lin, Khyathi Raghavi Chandu, Nouha Dziri, Sachin Kumar, Tom Zick, Yejin Choi, Noah A. Smith, Hannaneh Hajishirzi: RewardBench: Evaluating Reward Models for Language Modeling. CoRR abs/2403.13787 (2024)
- [i204] Margaret Y. Li, Alisa Liu, Zhaofeng Wu, Noah A. Smith: A Taxonomy of Ambiguity Types for NLP. CoRR abs/2403.14072 (2024)
- [i203] Xingyu Fu, Yushi Hu, Bangzheng Li, Yu Feng, Haoyu Wang, Xudong Lin, Dan Roth, Noah A. Smith, Wei-Chiu Ma, Ranjay Krishna: BLINK: Multimodal Large Language Models Can See but Not Perceive. CoRR abs/2404.12390 (2024)
- [i202] Kabir Ahuja, Vidhisha Balachandran, Madhur Panwar, Tianxing He, Noah A. Smith, Navin Goyal, Yulia Tsvetkov: Learning Syntax Without Planting Trees: Understanding When and Why Transformers Generalize Hierarchically. CoRR abs/2404.16367 (2024)
- [i201] Ilia Kuznetsov, Osama Mohammed Afzal, Koen Dercksen, Nils Dycke, Alexander Goldberg, Tom Hope, Dirk Hovy, Jonathan K. Kummerfeld, Anne Lauscher, Kevin Leyton-Brown, Sheng Lu, Mausam, Margot Mieskes, Aurélie Névéol, Danish Pruthi, Lizhen Qu, Roy Schwartz, Noah A. Smith, Thamar Solorio, Jingyan Wang, Xiaodan Zhu, Anna Rogers, Nihar B. Shah, Iryna Gurevych: What Can Natural Language Processing Do for Peer Review? CoRR abs/2405.06563 (2024)
- [i200] Hamish Ivison, Yizhong Wang, Jiacheng Liu, Zeqiu Wu, Valentina Pyatkin, Nathan Lambert, Noah A. Smith, Yejin Choi, Hannaneh Hajishirzi: Unpacking DPO and PPO: Disentangling Best Practices for Learning from Preference Feedback. CoRR abs/2406.09279 (2024)
- [i199] Yushi Hu, Weijia Shi, Xingyu Fu, Dan Roth, Mari Ostendorf, Luke Zettlemoyer, Noah A. Smith, Ranjay Krishna: Visual Sketchpad: Sketching as a Visual Chain of Thought for Multimodal Language Models. CoRR abs/2406.09403 (2024)
- [i198] William Merrill, Noah A. Smith, Yanai Elazar: Evaluating n-Gram Novelty of Language Models Using Rusty-DAWG. CoRR abs/2406.13069 (2024)
- [i197] Boyi Wei, Weijia Shi, Yangsibo Huang, Noah A. Smith, Chiyuan Zhang, Luke Zettlemoyer, Kai Li, Peter Henderson: Evaluating Copyright Takedown Methods for Language Models. CoRR abs/2406.18664 (2024)
- [i196] Ruizhe Shi, Yifang Chen, Yushi Hu, Alisa Liu, Hannaneh Hajishirzi, Noah A. Smith, Simon S. Du: Decoding-Time Language Model Alignment with Multiple Objectives. CoRR abs/2406.18853 (2024)
- [i195] Orevaoghene Ahia, Anuoluwapo Aremu, Diana Abagyan, Hila Gonen, David Ifeoluwa Adelani, Daud Abolade, Noah A. Smith, Yulia Tsvetkov: Voices Unheard: NLP Resources and Models for Yorùbá Regional Dialects. CoRR abs/2406.19564 (2024)
- [i194] Weijia Shi, Jaechan Lee, Yangsibo Huang, Sadhika Malladi, Jieyu Zhao, Ari Holtzman, Daogao Liu, Luke Zettlemoyer, Noah A. Smith, Chiyuan Zhang: MUSE: Machine Unlearning Six-Way Evaluation for Language Models. CoRR abs/2407.06460 (2024)
- [i193] Orevaoghene Ahia, Sachin Kumar, Hila Gonen, Valentin Hoffman, Tomasz Limisiewicz, Yulia Tsvetkov, Noah A. Smith: MAGNET: Improving the Multilingual Fairness of Language Models with Adaptive Gradient-Based Tokenization. CoRR abs/2407.08818 (2024)
- [i192] Faeze Brahman, Sachin Kumar, Vidhisha Balachandran, Pradeep Dasigi, Valentina Pyatkin, Abhilasha Ravichander, Sarah Wiegreffe, Nouha Dziri, Khyathi Raghavi Chandu, Jack Hessel, Yulia Tsvetkov, Noah A. Smith, Yejin Choi, Hannaneh Hajishirzi: The Art of Saying No: Contextual Noncompliance in Language Models. CoRR abs/2407.12043 (2024)
- [i191] Jonathan Hayase, Alisa Liu, Yejin Choi, Sewoong Oh, Noah A. Smith: Data Mixture Inference: What do BPE Tokenizers Reveal about their Training Data? CoRR abs/2407.16607 (2024)
- [i190] Hila Gonen, Terra Blevins, Alisa Liu, Luke Zettlemoyer, Noah A. Smith: Does Liking Yellow Imply Driving a School Bus? Semantic Leakage in Language Models. CoRR abs/2408.06518 (2024)
- [i189] Nikita Haduong, Irene Wang, Bo-Ru Lu, Prithviraj Ammanabrolu, Noah A. Smith: CPS-TaskForge: Generating Collaborative Problem Solving Environments for Diverse Communication Tasks. CoRR abs/2408.08853 (2024)
- [i188] Nikita Haduong, Alice Gao, Noah A. Smith: Risks and NLP Design: A Case Study on Procedural Document QA. CoRR abs/2408.11860 (2024)
- [i187] Guang Yang, Muru Zhang, Lin Qiu, Yanming Wan, Noah A. Smith: Toward a More Complete OMR Solution. CoRR abs/2409.00316 (2024)
- [i186] Niklas Muennighoff, Luca Soldaini, Dirk Groeneveld, Kyle Lo, Jacob Morrison, Sewon Min, Weijia Shi, Pete Walsh, Oyvind Tafjord, Nathan Lambert, Yuling Gu, Shane Arora, Akshita Bhagia, Dustin Schwenk, David Wadden, Alexander Wettig, Binyuan Hui, Tim Dettmers, Douwe Kiela, Ali Farhadi, Noah A. Smith, Pang Wei Koh, Amanpreet Singh, Hannaneh Hajishirzi: OLMoE: Open Mixture-of-Experts Language Models. CoRR abs/2409.02060 (2024)
- [i185] Matt Deitke, Christopher Clark, Sangho Lee, Rohun Tripathi, Yue Yang, Jae Sung Park, Mohammadreza Salehi, Niklas Muennighoff, Kyle Lo, Luca Soldaini, Jiasen Lu, Taira Anderson, Erin Bransom, Kiana Ehsani, Huong Ngo, Yen-Sung Chen, Ajay Patel, Mark Yatskar, Chris Callison-Burch, Andrew Head, Rose Hendrix, Favyen Bastani, Eli VanderBilt, Nathan Lambert, Yvonne Chou, Arnavi Chheda, Jenna Sparks, Sam Skjonsberg, Michael Schmitz, Aaron Sarnat, Byron Bischoff, Pete Walsh, Chris Newell, Piper Wolters, Tanmay Gupta, Kuo-Hao Zeng, Jon Borchardt, Dirk Groeneveld, Jen Dumas, Crystal Nam, Sophie Lebrecht, Caitlin Wittlif, Carissa Schoenick, Oscar Michel, Ranjay Krishna, Luca Weihs, Noah A. Smith, Hannaneh Hajishirzi, Ross B. Girshick, Ali Farhadi, Aniruddha Kembhavi: Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Multimodal Models. CoRR abs/2409.17146 (2024)
- [i184] Jacob Morrison, Noah A. Smith, Hannaneh Hajishirzi, Pang Wei Koh, Jesse Dodge, Pradeep Dasigi: Merge to Learn: Efficiently Adding Skills to Language Models with Model Merging. CoRR abs/2410.12937 (2024)
- [i183] Sachin Kumar, Chan Young Park, Yulia Tsvetkov, Noah A. Smith, Hannaneh Hajishirzi: ComPO: Community Preferences for Language Model Personalization. CoRR abs/2410.16027 (2024)
- [i182] Nikita Haduong, Noah A. Smith: Raising the Stakes: Performance Pressure Improves AI-Assisted Decision Making. CoRR abs/2410.16560 (2024)
- [i181] Lester James V. Miranda, Yizhong Wang, Yanai Elazar, Sachin Kumar, Valentina Pyatkin, Faeze Brahman, Noah A. Smith, Hannaneh Hajishirzi, Pradeep Dasigi: Hybrid Preferences: Learning to Route Instances for Human vs. AI Feedback. CoRR abs/2410.19133 (2024)
- [i180] Nathan Lambert, Jacob Morrison, Valentina Pyatkin, Shengyi Huang, Hamish Ivison, Faeze Brahman, Lester James V. Miranda, Alisa Liu, Nouha Dziri, Shane Lyu, Yuling Gu, Saumya Malik, Victoria Graf, Jena D. Hwang, Jiangjiang Yang, Ronan Le Bras, Oyvind Tafjord, Chris Wilhelm, Luca Soldaini, Noah A. Smith, Yizhong Wang, Pradeep Dasigi, Hannaneh Hajishirzi: TÜLU 3: Pushing Frontiers in Open Language Model Post-Training. CoRR abs/2411.15124 (2024)
- [i179] Akshita Bhagia, Jiacheng Liu, Alexander Wettig, David Heineman, Oyvind Tafjord, Ananya Harsh Jha, Luca Soldaini, Noah A. Smith, Dirk Groeneveld, Pang Wei Koh, Jesse Dodge, Hannaneh Hajishirzi: Establishing Task Scaling Laws via Compute-Efficient Model Ladders. CoRR abs/2412.04403 (2024)
- 2023
- [j32] Zhaofeng Wu, William Merrill, Hao Peng, Iz Beltagy, Noah A. Smith: Transparency Helps Reveal When Language Models Learn Meaning. Trans. Assoc. Comput. Linguistics 11: 617-634 (2023)
- [c272] Hongjin Su, Weijia Shi, Jungo Kasai, Yizhong Wang, Yushi Hu, Mari Ostendorf, Wen-tau Yih, Noah A. Smith, Luke Zettlemoyer, Tao Yu: One Embedder, Any Task: Instruction-Finetuned Text Embeddings. ACL (Findings) 2023: 1102-1121
- [c271] Nikita Haduong, Alice Gao, Noah A. Smith: Risks and NLP Design: A Case Study on Procedural Document QA. ACL (Findings) 2023: 1248-1269
- [c270] Wenya Wang, Vivek Srikumar, Hannaneh Hajishirzi, Noah A. Smith: Elaboration-Generating Commonsense Question Answering at Scale. ACL (1) 2023: 1619-1635
- [c269] Haoxin Li, Phillip Keung, Daniel Cheng, Jungo Kasai, Noah A. Smith: NarrowBERT: Accelerating Masked Language Model Pretraining and Inference. ACL (2) 2023: 1723-1730
- [c268] Sofia Serrano, Jesse Dodge, Noah A. Smith: Stubborn Lexical Bias in Data and Models. ACL (Findings) 2023: 8131-8146
- [c267] Hamish Ivison, Noah A. Smith, Hannaneh Hajishirzi, Pradeep Dasigi: Data-Efficient Finetuning Using Cross-Task Nearest Neighbors. ACL (Findings) 2023: 9036-9061
- [c266] Ian Magnusson, Noah A. Smith, Jesse Dodge: Reproducibility in NLP: What Have We Learned from the Checklist? ACL (Findings) 2023: 12789-12811
- [c265] Yizhong Wang, Yeganeh Kordi, Swaroop Mishra, Alisa Liu, Noah A. Smith, Daniel Khashabi, Hannaneh Hajishirzi: Self-Instruct: Aligning Language Models with Self-Generated Instructions. ACL (1) 2023: 13484-13508
- [c264] Alisa Liu, Zhaofeng Wu, Julian Michael, Alane Suhr, Peter West, Alexander Koller, Swabha Swayamdipta, Noah A. Smith, Yejin Choi: We're Afraid Language Models Aren't Modeling Ambiguity. EMNLP 2023: 790-807
- [c263] Jiacheng Liu, Wenya Wang, Dianzhuo Wang, Noah A. Smith, Yejin Choi, Hannaneh Hajishirzi: Vera: A General-Purpose Plausibility Estimation Model for Commonsense Statements. EMNLP 2023: 1264-1287
- [c262] Jaechan Lee, Alisa Liu, Orevaoghene Ahia, Hila Gonen, Noah A. Smith: That was the last straw, we need more: Are Translation Systems Sensitive to Disambiguating Context? EMNLP (Findings) 2023: 4555-4569
- [c261] Ofir Press, Muru Zhang, Sewon Min, Ludwig Schmidt, Noah A. Smith, Mike Lewis: Measuring and Narrowing the Compositionality Gap in Language Models. EMNLP (Findings) 2023: 5687-5711
- [c260] Orevaoghene Ahia, Sachin Kumar, Hila Gonen, Jungo Kasai, David R. Mortensen, Noah A. Smith, Yulia Tsvetkov: Do All Languages Cost the Same? Tokenization in the Era of Commercial Language Models. EMNLP 2023: 9904-9923
- [c259] Hila Gonen, Srini Iyer, Terra Blevins, Noah A. Smith, Luke Zettlemoyer: Demystifying Prompts in Language Models via Perplexity Estimation. EMNLP (Findings) 2023: 10136-10148
- [c258] Yushi Hu, Hang Hua, Zhengyuan Yang, Weijia Shi, Noah A. Smith, Jiebo Luo: PromptCap: Prompt-Guided Image Captioning for VQA with GPT-3. ICCV 2023: 2951-2963
- [c257] Yushi Hu, Benlin Liu, Jungo Kasai, Yizhong Wang, Mari Ostendorf, Ranjay Krishna, Noah A. Smith: TIFA: Accurate and Interpretable Text-to-Image Faithfulness Evaluation with Question Answering. ICCV 2023: 20349-20360
- [c256] Zhoujun Cheng, Tianbao Xie, Peng Shi, Chengzu Li, Rahul Nadkarni, Yushi Hu, Caiming Xiong, Dragomir Radev, Mari Ostendorf, Luke Zettlemoyer, Noah A. Smith, Tao Yu: Binding Language Models in Symbolic Languages. ICLR 2023
- [c255] Hongjin Su, Jungo Kasai, Chen Henry Wu, Weijia Shi, Tianlu Wang, Jiayi Xin, Rui Zhang, Mari Ostendorf, Luke Zettlemoyer, Noah A. Smith, Tao Yu: Selective Annotation Makes Language Models Better Few-Shot Learners. ICLR 2023
- [c254] Jungo Kasai, Keisuke Sakaguchi, Yoichi Takahashi, Ronan Le Bras, Akari Asai, Xinyan Yu, Dragomir Radev, Noah A. Smith, Yejin Choi, Kentaro Inui: RealTime QA: What's the Answer Right Now? NeurIPS 2023
- [c253] Yizhong Wang, Hamish Ivison, Pradeep Dasigi, Jack Hessel, Tushar Khot, Khyathi Raghavi Chandu, David Wadden, Kelsey MacMillan, Noah A. Smith, Iz Beltagy, Hannaneh Hajishirzi: How Far Can Camels Go? Exploring the State of Instruction Tuning on Open Resources. NeurIPS 2023
- [c252] Zeqiu Wu, Yushi Hu, Weijia Shi, Nouha Dziri, Alane Suhr, Prithviraj Ammanabrolu, Noah A. Smith, Mari Ostendorf, Hannaneh Hajishirzi: Fine-Grained Human Feedback Gives Better Rewards for Language Model Training. NeurIPS 2023
- [c251] Orevaoghene Ahia, Hila Gonen, Vidhisha Balachandran, Yulia Tsvetkov, Noah A. Smith: LEXPLAIN: Improving Model Explanations via Lexicon Supervision. *SEM@ACL 2023: 207-216
- [i178] Haoxin Li, Phillip Keung, Daniel Cheng, Jungo Kasai, Noah A. Smith: NarrowBERT: Accelerating Masked Language Model Pretraining and Inference. CoRR abs/2301.04761 (2023)
- [i177] Yushi Hu, Benlin Liu, Jungo Kasai, Yizhong Wang, Mari Ostendorf, Ranjay Krishna, Noah A. Smith: TIFA: Accurate and Interpretable Text-to-Image Faithfulness Evaluation with Question Answering. CoRR abs/2303.11897 (2023)
- [i176] Suchin Gururangan, Margaret Li, Mike Lewis, Weijia Shi, Tim Althoff, Noah A. Smith, Luke Zettlemoyer: Scaling Expert Language Models with Unsupervised Domain Discovery. CoRR abs/2303.14177 (2023)
- [i175] Alisa Liu, Zhaofeng Wu, Julian Michael, Alane Suhr, Peter West, Alexander Koller, Swabha Swayamdipta, Noah A. Smith, Yejin Choi: We're Afraid Language Models Aren't Modeling Ambiguity. CoRR abs/2304.14399 (2023)
- [i174] Jiacheng Liu, Wenya Wang, Dianzhuo Wang, Noah A. Smith, Yejin Choi, Hannaneh Hajishirzi: Vera: A General-Purpose Plausibility Estimation Model for Commonsense Statements. CoRR abs/2305.03695 (2023)
- [i173] Muru Zhang, Ofir Press, William Merrill, Alisa Liu, Noah A. Smith: How Language Model Hallucinations Can Snowball. CoRR abs/2305.13534 (2023)
- [i172] Orevaoghene Ahia, Sachin Kumar, Hila Gonen, Jungo Kasai, David R. Mortensen, Noah A. Smith, Yulia Tsvetkov: Do All Languages Cost the Same? Tokenization in the Era of Commercial Language Models. CoRR abs/2305.13707 (2023)
- [i171] Zeqiu Wu, Yushi Hu, Weijia Shi, Nouha Dziri, Alane Suhr, Prithviraj Ammanabrolu, Noah A. Smith, Mari Ostendorf, Hannaneh Hajishirzi: Fine-Grained Human Feedback Gives Better Rewards for Language Model Training. CoRR abs/2306.01693 (2023)
- [i170] Sofia Serrano, Jesse Dodge, Noah A. Smith: Stubborn Lexical Bias in Data and Models. CoRR abs/2306.02190 (2023)
- [i169] Yizhong Wang, Hamish Ivison, Pradeep Dasigi, Jack Hessel, Tushar Khot, Khyathi Raghavi Chandu, David Wadden, Kelsey MacMillan, Noah A. Smith, Iz Beltagy, Hannaneh Hajishirzi: How Far Can Camels Go? Exploring the State of Instruction Tuning on Open Resources. CoRR abs/2306.04751 (2023)
- [i168] Judit Ács, Endre Hamerlik, Roy Schwartz, Noah A. Smith, András Kornai: Morphosyntactic probing of multilingual BERT models. CoRR abs/2306.06205 (2023)
- [i167] Ian Magnusson, Noah A. Smith, Jesse Dodge: Reproducibility in NLP: What Have We Learned from the Checklist? CoRR abs/2306.09562 (2023)
- [i166] Yanai Elazar, Jiayao Zhang, David Wadden, Bo Zhang, Noah A. Smith: Estimating the Causal Effect of Early ArXiving on Paper Acceptance. CoRR abs/2306.13891 (2023)
- [i165] Bo-Ru Lu, Nikita Haduong, Chia-Hsuan Lee, Zeqiu Wu, Hao Cheng, Paul Koester, Jean Utke, Tao Yu, Noah A. Smith, Mari Ostendorf: DIALGEN: Collaborative Human-LM Generated Dialogues for Improved Understanding of Human-Human Conversations. CoRR abs/2307.07047 (2023)
- [i164] Hao Peng, Qingqing Cao, Jesse Dodge, Matthew E. Peters, Jared Fernandez, Tom Sherborne, Kyle Lo, Sam Skjonsberg, Emma Strubell, Darrell Plessas, Iz Beltagy, Evan Pete Walsh, Noah A. Smith, Hannaneh Hajishirzi: Efficiency Pentathlon: A Standardized Arena for Efficiency Evaluation. CoRR abs/2307.09701 (2023)
- [i163] Sewon Min, Suchin Gururangan, Eric Wallace, Hannaneh Hajishirzi, Noah A. Smith, Luke Zettlemoyer: SILO Language Models: Isolating Legal Risk In a Nonparametric Datastore. CoRR abs/2308.04430 (2023)
- [i162] Weijia Shi, Sewon Min, Maria Lomeli, Chunting Zhou, Margaret Li, Xi Victoria Lin, Noah A. Smith, Luke Zettlemoyer, Scott Yih, Mike Lewis: In-Context Pretraining: Language Modeling Beyond Document Boundaries. CoRR abs/2310.10638 (2023)
- [i161] Jaechan Lee, Alisa Liu, Orevaoghene Ahia, Hila Gonen, Noah A. Smith: That was the last straw, we need more: Are Translation Systems Sensitive to Disambiguating Context? CoRR abs/2310.14610 (2023)
- [i160] Yanai Elazar, Akshita Bhagia, Ian Magnusson, Abhilasha Ravichander, Dustin Schwenk, Alane Suhr, Pete Walsh, Dirk Groeneveld, Luca Soldaini, Sameer Singh, Hanna Hajishirzi, Noah A. Smith, Jesse Dodge: What's In My Big Data? CoRR abs/2310.20707 (2023)
- [i159] Haoxin Li, Phillip Keung, Daniel Cheng, Jungo Kasai, Noah A. Smith: ACID: Abstractive, Content-Based IDs for Document Retrieval with Language Models. CoRR abs/2311.08593 (2023)
- [i158] Yanai Elazar, Bhargavi Paranjape, Hao Peng, Sarah Wiegreffe, Khyathi Chandu Raghavi, Vivek Srikumar, Sameer Singh, Noah A. Smith: Measuring and Improving Attentiveness to Partial Inputs with Counterfactuals. CoRR abs/2311.09605 (2023)
- [i157] Hamish Ivison, Yizhong Wang, Valentina Pyatkin, Nathan Lambert, Matthew E. Peters, Pradeep Dasigi, Joel Jang, David Wadden, Noah A. Smith, Iz Beltagy, Hannaneh Hajishirzi: Camels in a Changing Climate: Enhancing LM Adaptation with Tulu 2. CoRR abs/2311.10702 (2023)
- [i156] Sofia Serrano, Zander Brumbaugh, Noah A. Smith: Language Models: A Guide for the Perplexed. CoRR abs/2311.17301 (2023)
- [i155] Ian Magnusson, Akshita Bhagia, Valentin Hofmann, Luca Soldaini, Ananya Harsh Jha, Oyvind Tafjord, Dustin Schwenk, Evan Pete Walsh, Yanai Elazar, Kyle Lo, Dirk Groeneveld, Iz Beltagy, Hannaneh Hajishirzi, Noah A. Smith, Kyle Richardson, Jesse Dodge: Paloma: A Benchmark for Evaluating Language Model Fit. CoRR abs/2312.10523 (2023)
- [i154] Kai Nylund, Suchin Gururangan, Noah A. Smith: Time is Encoded in the Weights of Finetuned Language Models. CoRR abs/2312.13401 (2023)
- 2022
- [j31] William Merrill, Ashish Sabharwal, Noah A. Smith: Saturated Transformers are Constant-Depth Threshold Circuits. Trans. Assoc. Comput. Linguistics 10: 843-856 (2022)
- [c250] Yao Dou, Maxwell Forbes, Rik Koncel-Kedziorski, Noah A. Smith, Yejin Choi: Is GPT-3 Text Indistinguishable from Human Text? Scarecrow: A Framework for Scrutinizing Machine Text. ACL (1) 2022