


NAACL-HLT 2025: Albuquerque, New Mexico, USA - Volume 1: Long Papers
- Luis Chiruzzo, Alan Ritter, Lu Wang:
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL 2025 - Volume 1: Long Papers, Albuquerque, New Mexico, USA, April 29 - May 4, 2025. Association for Computational Linguistics 2025, ISBN 979-8-89176-189-6
- Arkadiy Saakyan, Shreyas Kulkarni, Tuhin Chakrabarty, Smaranda Muresan:
Understanding Figurative Meaning through Explainable Visual Entailment. 1-23
- Nicole Meister, Carlos Guestrin, Tatsunori Hashimoto:
Benchmarking Distributional Alignment of Large Language Models. 24-49
- Zeyuan Liu, Ziyu Huan, Xiyao Wang, Jiafei Lyu, Jian Tao, Xiu Li, Furong Huang, Huazhe Xu:
World Models with Hints of Large Language Models for Goal Achieving. 50-72
- Xinglin Wang, Peiwen Yuan, Shaoxiong Feng, Yiwei Li, Boyuan Pan, Heda Wang, Yao Hu, Kan Li:
CogLM: Tracking Cognitive Development of Large Language Models. 73-87
- Minh Duc Chu, Zihao He, Rebecca Dorn, Kristina Lerman:
Improving and Assessing the Fidelity of Large Language Models Alignment to Online Communities. 88-111
- Xueyang Feng, Bo Lan, Quanyu Dai, Lei Wang, Jiakai Tang, Xu Chen, Zhenhua Dong, Ji-Rong Wen:
Improving Retrospective Language Agents via Joint Policy Gradient Optimization. 112-141
- Xiangyan Liu, Bo Lan, Zhiyuan Hu, Yang Liu, Zhicheng Zhang, Fei Wang, Michael Qizhe Shieh, Wenmeng Zhou:
CodexGraph: Bridging Large Language Models and Code Repositories via Code Graph Databases. 142-160
- Feifan Song, Yuxuan Fan, Xin Zhang, Peiyi Wang, Houfeng Wang:
Instantly Learning Preference Alignment via In-context DPO. 161-178
- Han Zhang, Yuheng Ma, Hanfang Yang:
ALTER: Augmentation for Large-Table-Based Reasoning. 179-198
- Yiping Jin, Leo Wanner, Aneesh Moideen Koya:
What the #?*!: Disentangling Hate Across Target Identities. 199-221
- Matthieu Futeral, Andrea Agostinelli, Marco Tagliasacchi, Neil Zeghidour, Eugene Kharitonov:
MAD Speech: Measures of Acoustic Diversity of Speech. 222-235
- Artem Snegirev, Maria Tikhonova, Anna Maksimova, Alena Fenogenova, Aleksandr Abramov:
The Russian-focused embedders' exploration: ruMTEB benchmark and Russian embedding model design. 236-254
- Mingwen Dong, Nischal Ashok Kumar, Yiqun Hu, Anuj Chauhan, Chung-Wei Hang, Shuaichen Chang, Lin Pan, Wuwei Lan, Henghui Zhu, Jiarong Jiang, Patrick Ng, Zhiguo Wang:
PRACTIQ: A Practical Conversational Text-to-SQL dataset with Ambiguous and Unanswerable Queries. 255-273
- Nandan Thakur, Suleman Kazi, Ge Luo, Jimmy Lin, Amin Ahmad:
MIRAGE-Bench: Automatic Multilingual Benchmark Arena for Retrieval-Augmented Generation Systems. 274-298
- Do Xuan Long, Ngoc-Hai Nguyen, Tiviatis Sim, Hieu Dao, Shafiq Joty, Kenji Kawaguchi, Nancy F. Chen, Min-Yen Kan:
LLMs Are Biased Towards Output Formats! Systematically Evaluating and Mitigating Output Format Bias of LLMs. 299-330
- Xiaofeng Wu, Karl Stratos, Wei Xu:
The Impact of Visual Information in Chinese Characters: Evaluating Large Models' Ability to Recognize and Utilize Radicals. 331-350
- Soumya Suvra Ghosal, Soumyabrata Pal, Koyel Mukherjee, Dinesh Manocha:
PromptRefine: Enhancing Few-Shot Performance on Low-Resource Indic Languages with Example Selection from related Example Banks. 351-365
- Tingchen Fu, Yupeng Hou, Julian J. McAuley, Rui Yan:
Unlocking Decoding-time Controllability: Gradient-Free Multi-Objective Alignment with Contrastive Prompts. 366-384
- Garrett Tanzer:
Fingerspelling within Sign Language Translation. 385-464
- Nishant Balepur, Alexa F. Siu, Nedim Lipka, Franck Dernoncourt, Tong Sun, Jordan Lee Boyd-Graber, Puneet Mathur:
MoDS: Moderating a Mixture of Document Speakers to Summarize Debatable Queries in Document Collections. 465-491
- Guanlin Li, Yuki Arase, Noël Crespi:
Aligning Sentence Simplification with ESL Learner's Proficiency for Language Acquisition. 492-507
- Tim Baumgärtner, Ted Briscoe, Iryna Gurevych:
PeerQA: A Scientific Question Answering Dataset from Peer Reviews. 508-544
- Yilong Xu, Jinhua Gao, Xiaoming Yu, Baolong Bi, Huawei Shen, Xueqi Cheng:
ALiiCE: Evaluating Positional Fine-grained Citation Generation. 545-561
- Alberto Sánchez Pérez, Alaa Boukhary, Paolo Papotti, Luis Castejón Lozano, Adam Elwood:
An LLM-Based Approach for Insight Generation in Data Analysis. 562-582
- Tao Zhang, Yige Wang, ZhuHangyu ZhuHangyu, Li Xin, Chen Xiang, Tian Hua Zhou, Jin Ma:
WebQuality: A Large-scale Multi-modal Web Page Quality Assessment Dataset with Multiple Scoring Dimensions. 583-596
- Chaoyun Zhang, Liqun Li, Shilin He, Xu Zhang, Bo Qiao, Si Qin, Minghua Ma, Yu Kang, Qingwei Lin, Saravan Rajmohan, Dongmei Zhang, Qi Zhang:
UFO: A UI-Focused Agent for Windows OS Interaction. 597-622
- Yoo Yeon Sung, Maharshi Gor, Eve Fleisig, Ishani Mondal, Jordan Lee Boyd-Graber:
Is your benchmark truly adversarial? AdvScore: Evaluating Human-Grounded Adversarialness. 623-642
- Liwen Sun, James (Jialun) Zhao, Wenjing Han, Chenyan Xiong:
Fact-Aware Multimodal Retrieval Augmentation for Accurate Medical Radiology Report Generation. 643-655
- Nitay Calderon, Roi Reichart:
On Behalf of the Stakeholders: Trends in NLP Model Interpretability in the Era of LLMs. 656-693
- Ruohong Zhang, Liangke Gui, Zhiqing Sun, Yihao Feng, Keyang Xu, Yuanhan Zhang, Di Fu, Chunyuan Li, Alexander G. Hauptmann, Yonatan Bisk, Yiming Yang:
Direct Preference Optimization of Video Large Multimodal Models from Language Model Reward. 694-717
- James Seale Smith, Chi-Heng Lin, Shikhar Tuli, Haris Jeelani, Shangqian Gao, Yilin Shen, Hongxia Jin, Yen-Chang Hsu:
FlexiGPT: Pruning and Extending Large Language Models with Low-Rank Weight Sharing. 718-730
- Yuqicheng Zhu, Nico Potyka, Jiarong Pan, Bo Xiong, Yunjie He, Evgeny Kharlamov, Steffen Staab:
Conformalized Answer Set Prediction for Knowledge Graph Embedding. 731-750
- Xingran Zhou, Kun Yang, Changtao Miao, Bingyu Hu, Zhuoer Xu, Shiwen Cui, Changhua Meng, Dan Hong:
Parameter-free and Accessible Prompt Learning to Enhance Adversarial Robustness for Pre-trained Vision-Language Models. 751-761
- Alan Ramponi, Agnese Daffara, Sara Tonelli:
Fine-grained Fallacy Detection with Human Label Variation. 762-784
- Hila Gonen, Terra Blevins, Alisa Liu, Luke Zettlemoyer, Noah A. Smith:
Does Liking Yellow Imply Driving a School Bus? Semantic Leakage in Language Models. 785-798
- Ruihan Yang, Jiangjie Chen, Yikai Zhang, Siyu Yuan, Aili Chen, Kyle Richardson, Yanghua Xiao, Deqing Yang:
SELFGOAL: Your Language Agents Already Know How to Achieve High-level Goals. 799-819
- Jonas Golde, Patrick Haller, Max Ploner, Fabio Barth, Nicolaas Paul Jedema, Alan Akbik:
Familarity: Better Evaluation of Zero-Shot Named Entity Recognition by Quantifying Label Shifts in Synthetic Training Data. 820-834
- Hwanjun Song, Taewon Yun, Yuho Lee, Jihwan Oh, Gihun Lee, Jason Cai, Hang Su:
Learning to Summarize from LLM-generated Feedback. 835-857
- Ankush Agarwal, Chaitanya Devaguptapu, Ganesh S:
Hybrid Graphs for Table-and-Text based Question Answering using LLMs. 858-875
- Ying Nie, Binwei Yan, Tianyu Guo, Hao Liu, Haoyu Wang, Wei He, Binfan Zheng, Weihao Wang, Qiang Li, Weijian Sun, Yunhe Wang, Dacheng Tao:
CFinBench: A Comprehensive Chinese Financial Benchmark for Large Language Models. 876-891
- Xiaopeng Yu, Wanpeng Zhang, Zongqing Lu:
LLM-Based Explicit Models of Opponents for Multi-Agent Games. 892-911
- Yan Yang, Zeguan Xiao, Xin Lu, Hongru Wang, Xuetao Wei, Hailiang Huang, Guanhua Chen, Yun Chen:
SeqAR: Jailbreak LLMs with Sequential Auto-Generated Characters. 912-931
- Shota Onohara, Atsuyuki Miyai, Yuki Imajuku, Kazuki Egashira, Jeonghun Baek, Xiang Yue, Graham Neubig, Kiyoharu Aizawa:
JMMMU: A Japanese Massive Multi-discipline Multimodal Understanding Benchmark for Culture-aware Evaluation. 932-950
- Siyu Yuan, Kaitao Song, Jiangjie Chen, Xu Tan, Yongliang Shen, Kan Ren, Dongsheng Li, Deqing Yang:
EASYTOOL: Enhancing LLM-based Agents with Concise Tool Instruction. 951-972
- Paloma Piot, Javier Parapar:
Decoding Hate: Exploring Language Models' Reactions to Hate Speech. 973-990
- Ziqiao Ma, Zekun Wang, Joyce Chai:
Babysit A Language Model From Scratch: Interactive Language Learning by Trials and Demonstrations. 991-1010
- Langlin Huang, Mengyu Bu, Yang Feng:
MoCE: Adaptive Mixture of Contextualization Experts for Byte-based Neural Machine Translation. 1011-1028
- Rajkumar Pujari, Dan Goldwasser:
LLM-Human Pipeline for Cultural Grounding of Conversations. 1029-1048
- Vy Vo, Lizhen Qu, Tao Feng, Yuncheng Hua, Xiaoxi Kang, Songhai Fan, Tim Dwyer, Lay-Ki Soon, Gholamreza Haffari:
ACCESS : A Benchmark for Abstract Causal Event Discovery and Reasoning. 1049-1074
- Bryan Chen Zhengyu Tan, Roy Ka-Wei Lee:
Unmasking Implicit Bias: Evaluating Persona-Prompted LLM Responses in Power-Disparate Social Scenarios. 1075-1108
- Quang Duc Nguyen, Tung Nguyen, Duc Anh Nguyen, Linh Ngo Van, Sang Dinh, Thien Huu Nguyen:
GloCOM: A Short Text Neural Topic Model via Global Clustering Context. 1109-1124
- Shahar Katz, Lior Wolf:
Reversed Attention: On The Gradient Descent Of Attention Layers In GPT. 1125-1152
- Ziqi Jin, Wei Lu:
Self-Harmonized Chain of Thought. 1153-1174
- Liyan Wang, Haotong Wang, Yves Lepage:
AnaScore: Understanding Semantic Parallelism in Proportional Analogies. 1175-1188
- Kelvin Han, Claire Gardent:
Generating Complex Question Decompositions in the Face of Distribution Shifts. 1189-1211
- Yeonjun In, Sungchul Kim, Ryan A. Rossi, Md. Mehrab Tanjim, Tong Yu, Ritwik Sinha, Chanyoung Park:
Diversify-verify-adapt: Efficient and Robust Retrieval-Augmented Ambiguous Question Answering. 1212-1233
- Kaushal Kumar Maurya, KV Aditya Srivatsa, Kseniia Petukhova, Ekaterina Kochmar:
Unifying AI Tutor Evaluation: An Evaluation Taxonomy for Pedagogical Ability Assessment of LLM-Powered AI Tutors. 1234-1251
- Kuniaki Saito, Chen-Yu Lee, Kihyuk Sohn, Yoshitaka Ushiku:
Where is the answer? An empirical study of positional bias for parametric knowledge extraction in language model. 1252-1269
- Mete Ismayilzada, Defne Circi, Jonne Sälevä, Hale Sirin, Abdullatif Köksal, Bhuwan Dhingra, Antoine Bosselut, Duygu Ataman, Lonneke van der Plas:
Evaluating Morphological Compositional Generalization in Large Language Models. 1270-1305
- Bichen Wang, Yuzhe Zi, Yixin Sun, Yanyan Zhao, Bing Qin:
Balancing Forget Quality and Model Utility: A Reverse KL-Divergence Knowledge Distillation Approach for Better Unlearning in LLMs. 1306-1321
- Jie Feng, Yuwei Du, Jie Zhao, Yong Li:
AgentMove: A Large Language Model based Agentic Framework for Zero-shot Next Location Prediction. 1322-1338
- Vivian G. Li:
Embedding derived animacy rankings offer insights into the sources of grammatical animacy. 1339-1351
- Qianyue Wang, Jinwu Hu, Zhengping Li, Yufeng Wang, Daiyuan Li, Yu Hu, Mingkui Tan:
Generating Long-form Story Using Dynamic Hierarchical Outlining with Memory-Enhancement. 1352-1391
- Haonan Chen, Liang Wang, Nan Yang, Yutao Zhu, Ziliang Zhao, Furu Wei, Zhicheng Dou:
Little Giants: Synthesizing High-Quality Embedding Data at Scale. 1392-1411
- Zehong Wang, Sidney Liu, Zheyuan Zhang, Tianyi Ma, Chuxu Zhang, Yanfang Ye:
Can LLMs Convert Graphs to Text-Attributed Graphs? 1412-1432
- Haoran Liao, Shaohua Hu, Zhihao Zhu, Hao He, Yaohui Jin:
Forest for the Trees: Overarching Prompting Evokes High-Level Reasoning in Large Language Models. 1433-1453
- Samuel J. Bell, Mariano Coria Meglioli, Megan Richards, Eduardo Sánchez, Christophe Ropers, Skyler Wang, Adina Williams, Levent Sagun, Marta R. Costa-jussà:
On the Role of Speech Data in Reducing Toxicity Detection Bias. 1454-1468
- Andrea Seveso, Daniele Potertì, Edoardo Federici, Mario Mezzanzanica, Fabio Mercorio:
ITALIC: An Italian Culture-Aware Natural Language Benchmark. 1469-1478
- Donghao Huang, Thanh-Son Nguyen, Fiona Liausvia, Zhaoxia Wang:
RAP: A Metric for Balancing Repetition and Performance in Open-Source Large Language Models. 1479-1496
- Xiyang Liu, Chunming Hu, Richong Zhang, Junfan Chen, Baowen Xu:
Improving Data Annotation for Low-Resource Relation Extraction with Logical Rule-Augmented Collaborative Language Models. 1497-1510
- Yara Shamshoum, Nitzan Hodos, Yuval Sieradzki, Assaf Schuster:
CompAct: Compressed Activations for Memory-Efficient LLM Training. 1511-1524
- Peng Hu, Sizhe Liu, Changjiang Gao, Xin Huang, Xue Han, Junlan Feng, Chao Deng, Shujian Huang:
Large Language Models Are Cross-Lingual Knowledge-Free Reasoners. 1525-1542
- Federico Errica, Davide Sanvito, Giuseppe Siracusano, Roberto Bifulco:
What Did I Do Wrong? Quantifying LLMs' Sensitivity and Consistency to Prompt Engineering. 1543-1558
- Danyang Liu, Fanjie Kong, Xiaohang Sun, Dhruva Patil, Avijit Vajpayee, Zhu Liu, Vimal Bhat, Najmeh Sadoughi:
Detect, Disambiguate, and Translate: On-Demand Visual Reasoning for Multimodal Machine Translation with Large Vision-Language Models. 1559-1570
- Xinhao Xu, Hui Chen, Mengyao Lyu, Sicheng Zhao, Yizhe Xiong, Zijia Lin, Jungong Han, Guiguang Ding:
Mitigating Hallucinations in Multi-modal Large Language Models via Image Token Attention-Guided Decoding. 1571-1590
- Zixuan Yi, Iadh Ounis:
A Multi-modal Large Language Model with Graph-of-Thought for Effective Recommendation. 1591-1606
- Nadav Borenstein, Arnav Arora, Lucie-Aimée Kaffee, Isabelle Augenstein:
Investigating Human Values in Online Communities. 1607-1627
- Tianyu Liu, Jirui Qi, Paul He, Arianna Bisazza, Mrinmaya Sachan, Ryan Cotterell:
Pointwise Mutual Information as a Performance Gauge for Retrieval-Augmented Generation. 1628-1647
- Shaopeng Tang, Lin Li, Xiaohui Tao, Leqi Zhong, Qing Xie:
MATO: A Model-Agnostic Training Optimization for Aspect Sentiment Triplet Extraction. 1648-1662
- Tong Zhu, Daize Dong, Xiaoye Qu, Jiacheng Ruan, Wenliang Chen, Yu Cheng:
Dynamic Data Mixing Maximizes Instruction Tuning for Mixture-of-Experts. 1663-1677
- Chenwei Wan, Matthieu Labeau, Chloé Clavel:
EmoDynamiX: Emotional Support Dialogue Strategy Prediction by Modelling MiXed Emotions and Discourse Dynamics. 1678-1695
- Jianxin Liang, Xiaojun Meng, Huishuai Zhang, Yueqian Wang, Jiansheng Wei, Dongyan Zhao:
ReasVQA: Advancing VideoQA with Imperfect Reasoning Process. 1696-1709
- Haoyuan Wu, Haisheng Zheng, Zhuolun He, Bei Yu:
Divergent Thoughts toward One Goal: LLM-based Multi-Agent Collaboration System for Electronic Design Automation. 1710-1721
- Yingxue Fu:
A Survey of QUD Models for Discourse Processing. 1722-1732
- Zhichao Shi, Shaoling Jing, Yi Cheng, Hao Zhang, Yuanzhuo Wang, Jie Zhang, Huawei Shen, Xueqi Cheng:
SafetyQuizzer: Timely and Dynamic Evaluation on the Safety of LLMs. 1733-1747
- Haoran Li, Wei Fan, Yulin Chen, Cheng Jiayang, Tianshu Chu, Xuebing Zhou, Peizhao Hu, Yangqiu Song:
Privacy Checklist: Privacy Violation Detection Grounding on Contextual Integrity Theory. 1748-1766
- Ziyao Xu, Houfeng Wang:
Investigating the (De)Composition Capabilities of Large Language Models in Natural-to-Formal Language Conversion. 1767-1783
- Honglin Mu, Han He, Yuxin Zhou, Yunlong Feng, Yang Xu, Libo Qin, Xiaoming Shi, Zeming Liu, Xudong Han, Qi Shi, Qingfu Zhu, Wanxiang Che:
Stealthy Jailbreak Attacks on Large Language Models via Benign Data Mirroring. 1784-1799
- Lingxiao Luo, Bingda Tang, Xuanzhong Chen, Rong Han, Ting Chen:
VividMed: Vision Language Model with Versatile Visual Grounding for Medicine. 1800-1821
- Kezhou Chen, Shuo Wang, Huixia Ben, Shengeng Tang, Yanbin Hao:
Mixture of Multimodal Adapters for Sentiment Analysis. 1822-1833
- Elisabeth Kirsten, Ivan Habernal, Vedant Nanda, Muhammad Bilal Zafar:
The Impact of Inference Acceleration on Bias of LLMs. 1834-1853
- Shamsuddeen Hassan Muhammad, Idris Abdulmumin, Abinew Ali Ayele, David Ifeoluwa Adelani, Ibrahim Said Ahmad, Saminu Mohammad Aliyu, Paul Röttger, Abigail Oppong, Andiswa Bukula, Chiamaka Ijeoma Chukwuneke, Ebrahim Chekol Jibril, Elyas Abdi Ismail, Esubalew Alemneh, Hagos Tesfahun Gebremichael, Lukman Jibril Aliyu, Meriem Beloucif, Oumaima Hourrane, Rooweither Mabuya, Salomey Osei, Samuel Rutunda, Tadesse Destaw Belay, Tadesse Kebede Guge, Tesfa Tegegne Asfaw, Lilian Diana Awuor Wanzare, Nelson Odhiambo Onyango, Seid Muhie Yimam, Nedjma Ousidhoum:
AfriHate: A Multilingual Collection of Hate Speech and Abusive Language Datasets for African Languages. 1854-1871
- Jian Xie, Kexun Zhang, Jiangjie Chen, Siyu Yuan, Kai Zhang, Yikai Zhang, Lei Li, Yanghua Xiao:
Revealing the Barriers of Language Agents in Planning. 1872-1888
- Hideo Kobayashi, Wuwei Lan, Peng Shi, Shuaichen Chang, Jiang Guo, Henghui Zhu, Zhiguo Wang, Patrick Ng:
You Only Read Once (YORO): Learning to Internalize Database Knowledge for Text-to-SQL. 1889-1901
- Zhen Yang, Ping Jian, Chengzhi Li:
Option Symbol Matters: Investigating and Mitigating Multiple-Choice Option Symbol Bias of Large Language Models. 1902-1917
- Xinyu Tang, Xiaolei Wang, Xin Zhao, Ji-Rong Wen:
DAWN-ICL: Strategic Planning of Problem-solving Trajectories for Zero-Shot In-Context Learning. 1918-1934
- Yao Xu, Shizhu He, Jiabei Chen, ZengXiangrong ZengXiangrong, Bingning Wang, Guang Liu, Jun Zhao, Kang Liu:
LLaSA: Large Language and Structured Data Assistant. 1935-1946
- Fu-An Chao, Berlin Chen:
Towards Efficient and Multifaceted Computer-assisted Pronunciation Training Leveraging Hierarchical Selective State Space Model and Decoupled Cross-entropy Loss. 1947-1961
- Abhilasha Ravichander, Jillian Fisher, Taylor Sorensen, Ximing Lu, Maria Antoniak, Bill Yuchen Lin, Niloofar Mireshghallah, Chandra Bhagavatula, Yejin Choi:
Information-Guided Identification of Training Data Imprint in (Proprietary) Large Language Models. 1962-1978
- Rena Gao, Xuetong Wu, Carsten Roever, Jing Wu, Long Lv, Jingxuan Wu, Jey Han Lau:
An Interpretable and Crosslingual Method for Evaluating Second-Language Dialogues. 1979-2008
- Rupeng Zhang, Haowei Wang, Junjie Wang, Mingyang Li, Yuekai Huang, Dandan Wang, Qing Wang:
From Allies to Adversaries: Manipulating LLM Tool-Calling through Adversarial Injection. 2009-2028
- Jonathan Tonglet, Gabriel Thiem, Iryna Gurevych:
COVE: COntext and VEracity prediction for out-of-context images. 2029-2049
- Yang Zhong, Diane J. Litman:
Discourse-Driven Evaluation: Unveiling Factual Inconsistency in Long Document Summarization. 2050-2073
- Soumadeep Saha, Sutanoya Chakraborty, Saptarshi Saha, Utpal Garain:
Language Models are Crossword Solvers. 2074-2090
- Ming-Bin Chen, Lea Frermann, Jey Han Lau:
WHoW: A Cross-domain Approach for Analysing Conversation Moderation. 2091-2126
- Joan Nwatu, Oana Ignat, Rada Mihalcea:
Uplifting Lower-Income Data: Strategies for Socioeconomic Perspective Shifts in Large Multi-modal Models. 2127-2144
- Satya Krishna Gorti, Ilan Gofman, Zhaoyan Liu, Jiapeng Wu, Noël Vouitsis, Guangwei Yu, Jesse C. Cresswell, Rasa Hosseinzadeh:
MSc-SQL: Multi-Sample Critiquing Small Language Models For Text-To-SQL Translation. 2145-2160
- Jiang Li, Xiangdong Su, Guanglai Gao:
Mitigating Heterogeneity among Factor Tensors via Lie Group Manifolds for Tensor Decomposition Based Temporal Knowledge Graph Embedding. 2161-2172
- Lindia Tjuatja, Graham Neubig, Tal Linzen, Sophie Hao:
What Goes Into a LM Acceptability Judgment? Rethinking the Impact of Frequency and Length. 2173-2186
- Tianze Luo, Xingchen Miao, Wenbo Duan:
WaveFM: A High-Fidelity and Efficient Vocoder Based on Flow Matching. 2187-2198
- Mingqi Gao, Xinyu Hu, Li Lin, Xiaojun Wan:
Analyzing and Evaluating Correlation Measures in NLG Meta-Evaluation. 2199-2222
- Xingwei Tan, Yuxiang Zhou, Gabriele Pergola, Yulan He:
Cascading Large Language Models for Salient Event Graph Generation. 2223-2245
- Artem Vazhentsev, Lyudmila Rvanova, Ivan Lazichny, Alexander Panchenko, Maxim Panov, Timothy Baldwin, Artem Shelmanov:
Token-Level Density-Based Uncertainty Quantification Methods for Eliciting Truthfulness of Large Language Models. 2246-2262
- Kenza Benkirane, Jackie Kay, María Pérez-Ortiz:
How Can We Diagnose and Treat Bias in Large Language Models for Clinical Decision-Making? 2263-2288
- Xiaofeng Zhang, Yihao Quan, Chen Shen, Xiaosong Yuan, Shaotian Yan, Liang Xie, Wenxiao Wang, Chaochen Gu, Hao Tang, Jieping Ye:
From Redundancy to Relevance: Information Flow in LVLMs Across Reasoning Tasks. 2289-2299
- Lekang Jiang, Pascal A Scherz, Stefan Goetz:
Patent-CR: A Dataset for Patent Claim Revision. 2300-2314
- Yuhang Zhou, Giannis Karamanolakis, Victor Soto, Anna Rumshisky, Mayank Kulkarni, Furong Huang, Wei Ai, Jianhua Lu:
MergeME: Model Merging Techniques for Homogeneous and Heterogeneous MoEs. 2315-2328
- Sangmitra Madhusudan, Robert Morabito, Skye Reid, Nikta Gohari Sadr, Ali Emami:
Fine-Tuned LLMs are "Time Capsules" for Tracking Societal Bias Through Books. 2329-2358
- Xiaoni Duan, Zhuoyan Li, Chien-Ju Ho, Ming Yin:
Exploring the Cost-Effectiveness of Perspective Taking in Crowdsourcing Subjective Assessment: A Case Study of Toxicity Detection. 2359-2372
- Abhinav Rao, Akhila Yerukola, Vishwa Shah, Katharina Reinecke, Maarten Sap:
NormAd: A Framework for Measuring the Cultural Adaptability of Large Language Models. 2373-2403
- Tianqi Liu, Zhen Qin, Junru Wu, Jiaming Shen, Misha Khalman, Rishabh Joshi, Yao Zhao, Mohammad Saleh, Simon Baumgartner, Jialu Liu, Peter J. Liu, Xuanhui Wang:
LiPO: Listwise Preference Optimization through Learning-to-Rank. 2404-2420
- Maximilian Spliethöver, Tim Knebler, Fabian Fumagalli, Maximilian Muschalik, Barbara Hammer, Eyke Hüllermeier, Henning Wachsmuth:
Adaptive Prompting: Ad-hoc Prompt Composition for Social Bias Detection. 2421-2449
- Anh Duc Le, Nam Le Hai, Thanh Xuan Nguyen, Linh Ngo Van, Nguyen Thi Ngoc Diep, Sang Dinh, Thien Huu Nguyen:
Enhancing Discriminative Representation in Similar Relation Clusters for Few-Shot Continual Relation Extraction. 2450-2467
- Jinu Lee, Wonseok Hwang:
SymBa: Symbolic Backward Chaining for Structured Natural Language Reasoning. 2468-2484
- Zhongwei Wan, Hui Shen, Xin Wang, Che Liu, Zheda Mai, Mi Zhang:
MEDA: Dynamic KV Cache Allocation for Efficient Multimodal Long-Context Inference. 2485-2497
- Ada Defne Tur, Gaurav Kamath, Siva Reddy:
Language Models Largely Exhibit Human-like Constituent Ordering Preferences. 2498-2521
- Sindhu Padakandla, Sadbhavana Babar, Rathod Darshan D, Manohar Kaul:
SafeQuant: LLM Safety Analysis via Quantized Gradient Inspection. 2522-2536
- Yirong Zeng, Xiao Ding, Bibo Cai, Ting Liu, Bing Qin:
Exploring Large Language Models for Effective Rumor Detection on Social Media. 2537-2552
- Ryan A. Cook, John P. Lalor, Ahmed Abbasi:
No Simple Answer to Data Complexity: An Examination of Instance-Level Complexity Metrics for Classification Tasks. 2553-2573
- Neha Srikanth, Rachel Rudinger:
NLI under the Microscope: What Atomic Hypothesis Decomposition Reveals. 2574-2589
- Thibaud Leteno, Irina Proskurina, Antoine Gourru, Julien Velcin, Charlotte Laclau, Guillaume Metzler, Christophe Gravier:
HISTOIRESMORALES: A French Dataset for Assessing Moral Alignment. 2590-2612
- Kwanghee Choi, Eunjung Yeo, Kalvin Chang, Shinji Watanabe, David R. Mortensen:
Leveraging Allophony in Self-Supervised Speech Models for Atypical Pronunciation Assessment. 2613-2628
- Hanwen Du, Bo Peng, Xia Ning:
SAPIENT: Mastering Multi-turn Conversational Recommendation with Strategic Planning and Monte Carlo Tree Search. 2629-2648
- Kayla Schroeder, Zach Wood-Doughty:
Reliability of Topic Modeling. 2649-2662
- Shuai Liu, Jonathan May:
Style Transfer with Multi-iteration Preference Optimization. 2663-2681
- Chenlong Zhang, Tong Zhou, Pengfei Cao, Zhuoran Jin, Yubo Chen, Kang Liu, Jun Zhao:
DTELS: Towards Dynamic Granularity of Timeline Summarization. 2682-2703
- Yichuan Li, Xinyang Zhang, Chenwei Zhang, Mao Li, Tianyi Liu, Pei Chen, Yifan Gao, Kyumin Lee, Kaize Ding, Zhengyang Wang, Zhihan Zhang, Jingbo Shang, Xian Li, Trishul Chilimbi:
ALERT: An LLM-powered Benchmark for Automatic Evaluation of Recommendation Explanations. 2704-2719
- Yasir Khan, Xinlei Wu, Sangpil Youm, Justin Ho, Aryaan Shaikh, Jairo Garciga, Rohan Sharma, Bonnie J. Dorr:
DETQUS: Decomposition-Enhanced Transformers for QUery-focused Summarization. 2720-2731
- David Ifeoluwa Adelani, Jessica Ojo, Israel Abebe Azime, Jian Yun Zhuang, Jesujoba Oluwadara Alabi, Xuanli He, Millicent Ochieng, Sara Hooker, Andiswa Bukula, En-Shiun Annie Lee, Chiamaka Ijeoma Chukwuneke, Happy Buzaaba, Blessing K. Sibanda, Godson Koffi Kalipe, Jonathan Mukiibi, Salomon Kabongo Kabenamualu, Foutse Yuehgoh, Mmasibidi Setaka, Lolwethu Ndolela, Nkiruka Odu, Rooweither Mabuya, Salomey Osei, Shamsuddeen Hassan Muhammad, Sokhar Samb, Tadesse Kebede Guge, Tombekai Vangoni Sherman, Pontus Stenetorp:
IrokoBench: A New Benchmark for African Languages in the Age of Large Language Models. 2732-2757
- Arturo Oncevay, Charese Smiley, Xiaomo Liu:
The Impact of Domain-Specific Terminology on Machine Translation for Finance in European Languages. 2758-2775
- Yining Lu, Dixuan Wang, Tianjian Li, Dongwei Jiang, Sanjeev Khudanpur, Meng Jiang, Daniel Khashabi:
Benchmarking Language Model Creativity: A Case Study on Code Generation. 2776-2794
- Xinyu Wang, Wenbo Zhang, Sai Dileep Koneru, Hangzhi Guo, Bonam Mingole, S. Shyam Sundar, Sarah Rajtmajer, Amulya Yadav:
Have LLMs Reopened the Pandora's Box of AI-Generated Fake News? 2795-2811
- Chonghe Jiang, Bao Nguyen, Anthony Man-Cho So, Viet Anh Nguyen:
Probe-Free Low-Rank Activation Intervention. 2812-2824
- Zhiheng Lyu, Kevin Yang, Lingpeng Kong, Dan Klein:
FactTrack: Time-Aware World State Tracking in Story Outlines. 2825-2848
- Julius Cheng, Maike Züfle, Vilém Zouhar, Andreas Vlachos:
A Bayesian Optimization Approach to Machine Translation Reranking. 2849-2862
- Pouya Pezeshkpour, Estevam Hruschka:
Multi-Conditional Ranking with Large Language Models. 2863-2883
- Peng Lu, Ivan Kobyzev, Mehdi Rezagholizadeh, Boxing Chen, Philippe Langlais:
ReGLA: Refining Gated Linear Attention. 2884-2898
- Kshitish Ghate, Isaac Slaughter, Kyra Wilson, Mona T. Diab, Aylin Caliskan:
Intrinsic Bias is Predicted by Pretraining Data and Correlates with Downstream Performance in Vision-Language Encoders. 2899-2915
- Eduardo Treviño, Hugo Contant, James Ngai, Graham Neubig, Zora Zhiruo Wang:
Benchmarking Failures in Tool-Augmented Language Models. 2916-2934
- Reza Averly, Xia Ning:
Entity Decomposition with Filtering: A Zero-Shot Clinical Named Entity Recognition Framework. 2935-2951
- Shenglai Zeng, Jiankun Zhang, Bingheng Li, Yuping Lin, Tianqi Zheng, Dante Everaert, Hanqing Lu, Hui Liu, Yue Xing, Monica Xiao Cheng, Jiliang Tang:
Towards Knowledge Checking in Retrieval-augmented Generation: A Representation Perspective. 2952-2969
- Longju Bai, Angana Borah, Oana Ignat, Rada Mihalcea:
The Power of Many: Multi-Agent Multimodal Models for Cultural Image Captioning. 2970-2993
- Tsz Kin Lam, Marco Gaido, Sara Papi, Luisa Bentivogli, Barry Haddow:
Prepending or Cross-Attention for Speech-to-Text? An Empirical Comparison. 2994-3006
- Delvin Ce Zhang, Dongwon Lee:
CORRECT: Context- and Reference-Augmented Reasoning and Prompting for Fact-Checking. 3007-3019
- Michael A. Lepori, Michael Curtis Mozer, Asma Ghandeharioun:
Racing Thoughts: Explaining Contextualization Errors in Large Language Models. 3020-3036
- Yimu Wang, Shuai Yuan, Bo Xue, Xiangru Jian, Wei Pang, Mushi Wang, Ning Yu:
DREAM: Improving Video-Text Retrieval Through Relevance-Based Augmentation Using Large Foundation Models. 3037-3056
- Zhikun Xu, Ming Shen, Jacob Dineen, Zhaonan Li, Xiao Ye, Shijie Lu, Aswin RRV, Chitta Baral, Ben Zhou:
ToW: Thoughts of Words Improve Reasoning in Large Language Models. 3057-3075
- Bairu Hou, Yang Zhang, Jacob Andreas, Shiyu Chang:
A Probabilistic Framework for LLM Hallucination Detection via Belief Tree Propagation. 3076-3099
- Qinchan Li, Sophie Hao:
ERAS: Evaluating the Robustness of Chinese NLP Models to Morphological Garden Path Errors. 3100-3111
- Valentina Pyatkin, Bonnie Webber, Ido Dagan, Reut Tsarfaty:
Superlatives in Context: Modeling the Implicit Semantics of Superlatives. 3112-3126
- Arash Gholami Davoodi, Seyed Pouyan Mousavi Davoudi, Pouya Pezeshkpour:
LLMs Are Not Intelligent Thinkers: Introducing Mathematical Topic Tree Benchmark for Comprehensive Evaluation of LLMs. 3127-3140
- Yong Cao, Haijiang Liu, Arnav Arora, Isabelle Augenstein, Paul Röttger, Daniel Hershcovich:
Specializing Large Language Models to Simulate Survey Response Distributions for Global Populations. 3141-3154
- Dan Friedman, Abhishek Panigrahi, Danqi Chen:
Representing Rule-based Chatbots with Transformers. 3155-3180
- Michael Hanna, Aaron Mueller:
Incremental Sentence Processing Mechanisms in Autoregressive Transformer Language Models. 3181-3203
- William Hogan, Jingbo Shang:
Entangled Relations: Leveraging NLI and Meta-analysis to Enhance Biomedical Relation Extraction. 3204-3220
- Hengyi Wang, Haizhou Shi, Shiwei Tan, Weiyi Qin, Wenyuan Wang, Tunyu Zhang, Akshay Nambi, Tanuja Ganu, Hao Wang:
Multimodal Needle in a Haystack: Benchmarking Long-Context Capability of Multimodal Large Language Models. 3221-3241
- Genta Indra Winata, Frederikus Hudi, Patrick Amadeus Irawan, David Anugraha, Rifki Afina Putri, Yutong Wang, Adam Nohejl, Ubaidillah Ariq Prathama, Nedjma Ousidhoum, Afifa Amriani, Anar Rzayev, Anirban Das, Ashmari Pramodya, Aulia Adila, Bryan Wilie, Candy Olivia Mawalim, Cheng Ching Lam, Daud Abolade, Emmanuele Chersoni, Enrico Santus, Fariz Ikhwantri, Garry Kuwanto, Hanyang Zhao, Haryo Akbarianto Wibowo, Holy Lovenia, Jan Christian Blaise Cruz, Jan Wira Gotama Putra, Junho Myung, Lucky Susanto, Maria Angelica Riera Machin, Marina Zhukova, Michael Anugraha, Muhammad Farid Adilazuarda, Natasha Christabelle Santosa, Peerat Limkonchotiwat, Raj Dabre, Rio Alexander Audino, Samuel Cahyawijaya, Shi-Xiong Zhang, Stephanie Yulia Salim, Yi Zhou, Yinxuan Gui, David Ifeoluwa Adelani, En-Shiun Annie Lee, Shogo Okada, Ayu Purwarianti, Alham Fikri Aji, Taro Watanabe, Derry Tanti Wijaya, Alice Oh, Chong-Wah Ngo:
WorldCuisines: A Massive-Scale Benchmark for Multilingual and Multicultural Visual Question Answering on Global Cuisines. 3242-3264
- Runjin Chen, Gabriel J. Perin, Xuxi Chen, Xilun Chen, Yan Han, Nina S. T. Hirata, Junyuan Hong, Bhavya Kailkhura:
Extracting and Understanding the Superficial Knowledge in Alignment. 3265-3280
- Junzhi Chen, Juhao Liang, Benyou Wang:
Smurfs: Multi-Agent System using Context-Efficient DFSDT for Tool Planning. 3281-3298
- Nan Xu, Fei Wang, Sheng Zhang, Hoifung Poon, Muhao Chen:
From Introspection to Best Practices: Principled Analysis of Demonstrations in Multimodal In-Context Learning. 3299-3324
- Tianjian Li, Haoran Xu, Weiting Tan, Kenton Murray, Daniel Khashabi:
Upsample or Upweight? Balanced Training on Heavily Imbalanced Datasets. 3325-3343
- Nan Xu, Xuezhe Ma:
LLM The Genius Paradox: A Linguistic and Math Expert's Struggle with Simple Word-based Counting Problems. 3344-3370
- Siyan Li, Vethavikashini Chithrra Raghuram, Omar Khattab, Julia Hirschberg, Zhou Yu:
PAPILLON: Privacy Preservation from Internet-based and Local Language Model Ensembles. 3371-3390
- Hayley Ross, Ameya Sunil Mahabaleshwarkar, Yoshi Suhara:
When2Call: When (not) to Call Tools. 3391-3409
- Zilu Tang, Rajen Chatterjee, Sarthak Garg:
Mitigating Hallucinated Translations in Large Language Models with Hallucination-focused Preference Optimization. 3410-3433
- Yilun Hao, Yongchao Chen, Yang Zhang, Chuchu Fan:
Large Language Models Can Solve Real-World Planning Rigorously with Formal Verification Tools. 3434-3483
- So Young Lee, Russell Scheinberg, Amber Shore, Ameeta Agrawal:
Who Relies More on World Knowledge and Bias for Syntactic Ambiguity Resolution: Humans or LLMs? 3484-3498
- Jin Zhao, Jingxuan Tu, Bingyang Ye, Xinrui Hu, Nianwen Xue, James Pustejovsky:
Beyond Benchmarks: Building a Richer Cross-Document Event Coreference Dataset with Decontextualization. 3499-3513
- Kristina Gligoric, Tijana Zrnic, Cinoo Lee, Emmanuel J. Candès, Dan Jurafsky:
Can Unconfident LLM Annotations Be Used for Confident Conclusions? 3514-3533
- Junyi Ye, Ankan Dash, Wenpeng Yin, Guiling Wang:
Beyond End-to-End VLMs: Leveraging Intermediate Text Representations for Superior Flowchart Understanding. 3534-3548
- Robert Pugh, Cheyenne Wing, María Ximena Juárez Huerta, Angeles Márquez Hernandez, Francis M. Tyers:
Ihquin tlahtouah in Tetelahtzincocah: An annotated, multi-purpose audio and text corpus of Western Sierra Puebla Nahuatl. 3549-3562
- Hanjie Chen, Zhouxiang Fang, Yash Singla, Mark Dredze:
Benchmarking Large Language Models on Answering and Explaining Challenging Medical Questions. 3563-3599
- Katie Kang, Eric Wallace, Claire J. Tomlin, Aviral Kumar, Sergey Levine:
Unfamiliar Finetuning Examples Control How Language Models Hallucinate. 3600-3612
- Guangya Wan, Yuqi Wu, Jie Chen, Sheng Li:
Reasoning Aware Self-Consistency: Leveraging Reasoning Paths for Efficient LLM Sampling. 3613-3635
- Ghazal Khalighinejad, Sharon Scott, Ollie Liu, Kelly L. Anderson, Rickard Stureborg, Aman Tyagi, Bhuwan Dhingra:
MatViX: Multimodal Information Extraction from Visually Rich Articles. 3636-3655
- Bowen Jiang, Yangxinyu Xie, Xiaomeng Wang, Yuan Yuan, Zhuoqun Hao, Xinyi Bai, Weijie J. Su, Camillo Jose Taylor, Tanwi Mallick:
Towards Rationality in Language and Multimodal Agents: A Survey. 3656-3675
- Ahmed Musa Awon, Yun Lu, Shera Potka, Alex Thomo:
CluSanT: Differentially Private and Semantically Coherent Text Sanitization. 3676-3693
- Kevin Xu, Yeganeh Kordi, Tanay Nayak, Adi Asija, Yizhong Wang, Kate Sanders, Adam Byerly, Jingyu Zhang, Benjamin Van Durme, Daniel Khashabi:
TurkingBench: A Challenge Benchmark for Web Agents. 3694-3710
- Jierui Li, Hung Le, Yingbo Zhou, Caiming Xiong, Silvio Savarese, Doyen Sahoo:
CodeTree: Agent-guided Tree Search for Code Generation with Large Language Models. 3711-3726
- Abhijnan Nath, Andrey Volozin, Saumajit Saha, Albert Nanda, Galina Grunin, Rahul Bhotika, Nikhil Krishnaswamy:
DPL: Diverse Preference Learning Without A Reference Model. 3727-3747
- Jingyu Zhang, Marc Marone, Tianjian Li, Benjamin Van Durme, Daniel Khashabi:
Verifiable by Design: Aligning Language Models to Quote from Pre-Training Data. 3748-3768
- Zejun Li, Ruipu Luo, Jiwen Zhang, Minghui Qiu, Xuanjing Huang, Zhongyu Wei:
VoCoT: Unleashing Visually Grounded Multi-Step Reasoning in Large Multi-Modal Models. 3769-3798 - François Roewer-Després, Jinyue Feng, Zining Zhu, Frank Rudzicz:
ACCORD: Closing the Commonsense Measurability Gap. 3799-3829 - Kung-Hsiang Huang, Akshara Prabhakar, Sidharth Dhawan, Yixin Mao, Huan Wang, Silvio Savarese, Caiming Xiong, Philippe Laban, Chien-Sheng Wu:
CRMArena: Understanding the Capacity of LLM Agents to Perform Professional CRM Tasks in Realistic Environments. 3830-3850 - Juan Pablo Muñoz, Jinjie Yuan, Nilesh Jain:
Mamba-Shedder: Post-Transformer Compression for Efficient Selective Structured State Space Models. 3851-3863 - Mian Zhang, Xianjun Yang, Xinlu Zhang, Travis Labrum, Jamie C. Chiu, Shaun M. Eack, Fei Fang, William Yang Wang, Zhiyu Chen:
CBT-Bench: Evaluating Large Language Models on Assisting Cognitive Behavior Therapy. 3864-3900 - Eui Jun Hwang, Sukmin Cho, Junmyeong Lee, Jong C. Park:
An Efficient Gloss-Free Sign Language Translation Using Spatial Configurations and Motion Dynamics with LLMs. 3901-3920 - Ryan Li, Yanzhe Zhang, Diyi Yang:
Sketch2Code: Evaluating Vision-Language Models for Interactive Web Design Prototyping. 3921-3955 - Chenglei Si, Yanzhe Zhang, Ryan Li, Zhengyuan Yang, Ruibo Liu, Diyi Yang:
Design2Code: Benchmarking Multimodal Code Generation for Automated Front-End Engineering. 3956-3974 - Hai Wang, Yuzhi Liang, Han Ren:
Temporal-Aware Soft Prompt Tuning for Automatic Text Dating. 3975-3987 - Ziyue Li, Tianyi Zhou:
Sparser Mixture-of-Adapters with Cross-Layer Generalization. 3988-4002 - Mert Inan, Yang Zhong, Vidya Ganesh, Malihe Alikhani:
How to Align Multiple Signed Language Corpora for Better Sign-to-Sign Translations? 4003-4016 - Weicheng Ma, Hefan Zhang, Ivory Yang, Shiyu Ji, Joice Chen, Farnoosh Hashemi, Shubham Mohole, Ethan Gearey, Michael Macy, Saeed Hassanpour, Soroush Vosoughi:
Communication Makes Perfect: Persuasion Dataset Construction via Multi-LLM Communication. 4017-4045 - Karuna Bhaila, Minh-Hao Van, Xintao Wu:
Soft Prompting for Unlearning in Large Language Models. 4046-4056 - Nguyen Hoang Anh, Quyen Tran, Thanh Xuan Nguyen, Nguyen Thi Ngoc Diep, Linh Ngo Van, Thien Huu Nguyen, Trung Le:
Mutual-pairing Data Augmentation for Fewshot Continual Relation Extraction. 4057-4075 - Guijin Son, Hanwool Lee, Sungdong Kim, Seungone Kim, Niklas Muennighoff, Taekyoon Choi, Cheonbok Park, Kang Min Yoo, Stella Biderman:
KMMLU: Measuring Massive Multitask Language Understanding in Korean. 4076-4104 - Zheyuan Liu, Guangyao Dou, Mengzhao Jia, Zhaoxuan Tan, Qingkai Zeng, Yongle Yuan, Meng Jiang:
Protecting Privacy in Multimodal Large Language Models with MLLMU-Bench. 4105-4135 - Panayiotis Christou, Md. Zahidul Islam, Yuzhang Lin, Jingwei Xiong:
LLM4DistReconfig: A Fine-tuned Large Language Model for Power Distribution Network Reconfiguration. 4136-4155 - Baizhou Huang, Xiaojun Wan:
WaterPool: A Language Model Watermark Mitigating Trade-Offs among Imperceptibility, Efficacy and Robustness. 4156-4182 - Cheng Wang, Yiwei Wang, Yujun Cai, Bryan Hooi:
Tricking Retrievers with Influential Tokens: An Efficient Black-Box Corpus Poisoning Attack. 4183-4194 - Yifan Song, Guoyin Wang, Sujian Li, Bill Yuchen Lin:
The Good, The Bad, and The Greedy: Evaluation of LLMs Should Not Ignore Non-Determinism. 4195-4206 - Peiran Wang, Xiaogeng Liu, Chaowei Xiao:
CVE-Bench: Benchmarking LLM-based Software Engineering Agent's Ability to Repair Real-World CVE Vulnerabilities. 4207-4224 - Reya Vir, Shreya Shankar, Harrison Chase, Will Fu-Hinthorn, Aditya G. Parameswaran:
PROMPTEVALS: A Dataset of Assertions and Guardrails for Custom Production Large Language Model Pipelines. 4225-4245 - Zezhong Wang, Xingshan Zeng, Weiwen Liu, Liangyou Li, Yasheng Wang, Lifeng Shang, Xin Jiang, Qun Liu, Kam-Fai Wong:
ToolFlow: Boosting LLM Tool-Calling Through Natural and Coherent Dialogue Synthesis. 4246-4263 - Yuqing Zhou, Ziwei Zhu:
Fighting Spurious Correlations in Text Classification via a Causal Learning Perspective. 4264-4274 - Yu Xia, Junda Wu, Sungchul Kim, Tong Yu, Ryan A. Rossi, Haoliang Wang, Julian J. McAuley:
Knowledge-Aware Query Expansion with Large Language Models for Textual and Relational Retrieval. 4275-4286 - Xin Wang, Samiul Alam, Zhongwei Wan, Hui Shen, Mi Zhang:
SVD-LLM V2: Optimizing Singular Value Truncation for Large Language Model Compression. 4287-4296 - Bin Wang, Xunlong Zou, Geyu Lin, Shuo Sun, Zhuohan Liu, Wenyu Zhang, Zhengyuan Liu, AiTi Aw, Nancy F. Chen:
AudioBench: A Universal Benchmark for Audio Large Language Models. 4297-4316 - Zirun Guo, Shulei Wang, Wang Lin, Weicai Yan, Yangyang Wu, Tao Jin:
Efficient Prompting for Continual Adaptation to Missing Modalities. 4317-4327 - Arkadeep Acharya, Rudra Murthy, Vishwajeet Kumar, Jaydeep Sen:
Benchmarking and Building Zero-Shot Hindi Retrieval Model with Hindi-BEIR and NLLB-E5. 4328-4348 - Muzhi Li, Cehao Yang, Chengjin Xu, Xuhui Jiang, Yiyan Qi, Jian Guo, Ho-fung Leung, Irwin King:
Retrieval, Reasoning, Re-ranking: A Context-Enriched Framework for Knowledge Graph Completion. 4349-4363 - Junehyoung Kwon, Mihyeon Kim, Eunju Lee, Juhwan Choi, YoungBin Kim:
See-Saw Modality Balance: See Gradient, and Sew Impaired Vision-Language Balance to Mitigate Dominant Modality Bias. 4364-4378 - Jiawei Liu, Yanjiao Liu, Xun Gong, Tingting Wang, Hong Chen, Yunfeng Hu:
Harnessing and Evaluating the Intrinsic Extrapolation Ability of Large Language Models for Vehicle Trajectory Prediction. 4379-4391 - Zhangchen Xu, Fengqing Jiang, Luyao Niu, Bill Yuchen Lin, Radha Poovendran:
Stronger Models are Not Always Stronger Teachers for Instruction Tuning. 4392-4405 - Pengxiang Lan, Haoyu Xu, Enneng Yang, Yuliang Liang, Guibing Guo, Jianzhe Zhao, Xingwei Wang:
Efficient and Effective Prompt Tuning via Prompt Decomposition and Compressed Outer Product. 4406-4421 - Jiancheng Dong, Lei Jiang, Wei Jin, Lu Cheng:
Threshold Filtering Packing for Supervised Fine-Tuning: Training Related Samples within Packs. 4422-4435 - Xinyu Lu, Xueru Wen, Yaojie Lu, Bowen Yu, Hongyu Lin, Haiyang Yu, Le Sun, Xianpei Han, Yongbin Li:
Transferable Post-training via Inverse Value Learning. 4436-4447 - Heegyu Kim, Taeyang Jeon, Seunghwan Choi, Seungtaek Choi, Hyunsouk Cho:
FLEX: Expert-level False-Less EXecution Metric for Text-to-SQL Benchmark. 4448-4475 - Xinran Wang, Enmao Diao, Qi Le, Jie Ding, Ali Anwar:
AID: Adaptive Integration of Detectors for Safe AI with Language Models. 4476-4492 - Jiayang Yu, Yihang Zhang, Bin Wang, Peiqin Lin, Yongkang Liu, Shi Feng:
SSMLoRA: Enhancing Low-Rank Adaptation with State Space Model. 4493-4506 - Tung Nguyen, Tue Le, Hoang Tran Vuong, Quang Duc Nguyen, Duc Anh Nguyen, Linh Ngo Van, Sang Dinh, Thien Huu Nguyen:
Sharpness-Aware Minimization for Topic Models with High-Quality Document Representations. 4507-4524 - Woosung Koh, Jang Han Yoon, Minhyung Lee, Youngjin Song, Jaegwan Cho, Jaehyun Kang, Taehyeon Kim, Se-Young Yun, Youngjae Yu, Bongshin Lee:
C²: Scalable Auto-Feedback for LLM-based Chart Generation. 4525-4566 - Zhu Liu, Cunliang Kong, Ying Liu, Maosong Sun:
A Top-down Graph-based Tool for Modeling Classical Semantic Maps: A Case Study of Supplementary Adverbs. 4567-4576 - Dehai Min, Zhiyang Xu, Guilin Qi, Lifu Huang, Chenyu You:
UniHGKR: Unified Instruction-aware Heterogeneous Knowledge Retrievers. 4577-4594 - Vipul Gupta, Candace Ross, David Pantoja, Rebecca J. Passonneau, Megan Ung, Adina Williams:
Improving Model Evaluation using SMART Filtering of Benchmark Datasets. 4595-4615 - Zexuan Qiu, Zijing Ou, Bin Wu, Jingjing Li, Aiwei Liu, Irwin King:
Entropy-Based Decoding for Retrieval-Augmented Large Language Models. 4616-4627 - Shengqi Zhu, Jeffrey Rzeszotarski:
What We Talk About When We Talk About LMs: Implicit Paradigm Shifts and the Ship of Language Models. 4628-4646 - Weiliang Zhao, Daniel Ben-Levi, Wei Hao, Junfeng Yang, Chengzhi Mao:
Diversity Helps Jailbreak Large Language Models. 4647-4680 - Nishanth Sridhar Nakshatri, Shamik Roy, Rajarshi Das, Suthee Chaidaroon, Leonid Boytsov, Rashmi Gangadharaiah:
Constrained Decoding with Speculative Lookaheads. 4681-4700 - Wonjun Lee, Solee Im, Heejin Do, Yunsu Kim, Jungseul Ok, Gary Lee:
DyPCL: Dynamic Phoneme-level Contrastive Learning for Dysarthric Speech Recognition. 4701-4712 - Jinmyeong An, Sangwon Ryu, Heejin Do, Yunsu Kim, Jungseul Ok, Gary Lee:
Revisiting Early Detection of Sexual Predators via Turn-level Optimization. 4713-4724 - Yinghao Aaron Li, Xilin Jiang, Cong Han, Nima Mesgarani:
StyleTTS-ZS: Efficient High-Quality Zero-Shot Text-to-Speech Synthesis with Distilled Time-Varying Style Diffusion. 4725-4744 - Satyapriya Krishna, Kalpesh Krishna, Anhad Mohananey, Steven Schwarcz, Adam Stambler, Shyam Upadhyay, Manaal Faruqui:
Fact, Fetch, and Reason: A Unified Evaluation of Retrieval-Augmented Generation. 4745-4759 - Qinzhuo Wu, Wei Liu, Jian Luan, Bin Wang:
ReachAgent: Enhancing Mobile Agent via Page Reaching and Operation. 4760-4775 - Chengyuan Liu, Shihang Wang, Lizhi Qing, Jun Lin, Ji Zhang, Fei Wu, Kun Kuang:
Learning to Solve Domain-Specific Calculation Problems with Knowledge-Intensive Programs Generator. 4776-4791 - Jiayi Han, Liang Du, Hongwei Du, Xiangguo Zhou, Yiwen Wu, Yuanfang Zhang, Weibo Zheng, Donghong Han:
SLIM: Let LLM Learn More and Forget Less with Soft LoRA and Identity Mixture. 4792-4804 - Jinsheng Huang, Liang Chen, Taian Guo, Fu Zeng, Yusheng Zhao, Bohan Wu, Ye Yuan, Haozhe Zhao, Zhihui Guo, Yichi Zhang, Jingyang Yuan, Wei Ju, Luchen Liu, Tianyu Liu, Baobao Chang, Ming Zhang:
MMEvalPro: Calibrating Multimodal Benchmarks Towards Trustworthy and Efficient Evaluation. 4805-4822 - Hanqing Wang, Yixia Li, Shuo Wang, Guanhua Chen, Yun Chen:
MiLoRA: Harnessing Minor Singular Components for Parameter-Efficient LLM Finetuning. 4823-4836 - Abhinav Menon, Manish Shrivastava, David Krueger, Ekdeep Singh Lubana:
Analyzing (In)Abilities of SAEs via Formal Languages. 4837-4862 - Subin Kim, Hoonrae Kim, Heejin Do, Gary Lee:
Multimodal Cognitive Reframing Therapy via Multi-hop Psychotherapeutic Reasoning. 4863-4880 - Wei Li, Wen Luo, Guangyue Peng, Houfeng Wang:
Explanation based In-Context Demonstrations Retrieval for Multilingual Grammatical Error Correction. 4881-4897 - Shihao Yang, Ziyi Zhang, Yue Jiang, Chunsheng Qin, Shuhua Liu:
A Unified Supervised and Unsupervised Dialogue Topic Segmentation Framework Based on Utterance Pair Modeling. 4898-4908 - Borui Xu, Yao Chen, Zeyi Wen, Weiguo Liu, Bingsheng He:
Evaluating Small Language Models for News Summarization: Implications and Factors Influencing Performance. 4909-4922 - Sanwoo Lee, Jiahao Liu, Qifan Wang, Jingang Wang, Xunliang Cai, Yunfang Wu:
Dynamic Fisher-weighted Model Merging via Bayesian Optimization. 4923-4935 - Vilém Zouhar, Tom Kocmi, Mrinmaya Sachan:
AI-Assisted Human Evaluation of Machine Translation. 4936-4950 - Wentao Ge, Shunian Chen, Hardy Chen, Nuo Chen, Junying Chen, Zhihong Chen, Wenya Xie, Shuo Yan, Chenghao Zhu, Ziyue Lin, Dingjie Song, Xidong Wang, Anningzhe Gao, Zhiyi Zhang, Jianquan Li, Xiang Wan, Benyou Wang:
MLLM-Bench: Evaluating Multimodal LLMs with Per-sample Criteria. 4951-4974 - Xinyi Mou, Jingcong Liang, Jiayu Lin, Xinnong Zhang, Xiawei Liu, Shiyue Yang, Rong Ye, Lei Chen, Haoyu Kuang, Xuanjing Huang, Zhongyu Wei:
AgentSense: Benchmarking Social Intelligence of Language Agents through Interactive Scenarios. 4975-5001 - Deren Lei, Yaxi Li, Siyao Li, Mengya Hu, Rui Xu, Ken Archer, Mingyu Wang, Emily Ching, Alex Deng:
FactCG: Enhancing Fact Checkers with Graph-Based Multi-Hop Data. 5002-5020 - Lu Yang, Jiajia Li, En Ci, Lefei Zhang, Zuchao Li, Ping Wang:
Label Drop for Multi-Aspect Relation Modeling in Universal Information Extraction. 5021-5040 - Dongming Sheng, Kexin Han, Hao Li, Yan Zhang, Yucheng Huang, Jun Lang, Wenqiang Liu:
Test-Time Code-Switching for Cross-lingual Aspect Sentiment Triplet Extraction. 5041-5053 - Xiaoman Wang, Dan Yuan, Xin Liu, Yike Zhao, Xiaoxiao Zhang, Xizhi Chen, Yunshi Lan:
VisCGEC: Benchmarking the Visual Chinese Grammatical Error Correction. 5054-5068 - Aryo Pradipta Gema, Joshua Ong Jun Leang, Giwon Hong, Alessio Devoto, Alberto Carlo Maria Mancino, Rohit Saxena, Xuanli He, Yu Zhao, Xiaotang Du, Mohammad Reza Ghasemi Madani, Claire Barale, Robert McHardy, Joshua Harris, Jean Kaddour, Emile van Krieken, Pasquale Minervini:
Are We Done with MMLU? 5069-5096 - Yakun Zhu, Shaohang Wei, Xu Wang, Kui Xue, Shaoting Zhang, Xiaofan Zhang:
MeNTi: Bridging Medical Calculator and LLM Agent with Nested Tool Calling. 5097-5116 - Yu Zhao, Alessio Devoto, Giwon Hong, Xiaotang Du, Aryo Pradipta Gema, Hongru Wang, Xuanli He, Kam-Fai Wong, Pasquale Minervini:
Steering Knowledge Selection Behaviours in LLMs via SAE-Based Representation Engineering. 5117-5136 - Chen Zhang, Meizhi Zhong, Qimeng Wang, Xuantao Lu, Zheyu Ye, Chengqiang Lu, Yan Gao, Yao Hu, Kehai Chen, Min Zhang, Dawei Song:
MoDification: Mixture of Depths Made Easy. 5137-5149 - Meng Tong, Kejiang Chen, Xiaojian Yuan, Jiayang Liu, Weiming Zhang, Nenghai Yu, Jie Zhang:
On the Vulnerability of Text Sanitization. 5150-5164 - Amey Hengle, Prasoon Bajpai, Soham Dan, Tanmoy Chakraborty:
Multilingual Needle in a Haystack: Investigating Long-Context Behavior of Multilingual Large Language Models. 5165-5180 - Hoang Pham, Thanh-Do Nguyen, Khac-Hoai Nam Bui:
Verify-in-the-Graph: Entity Disambiguation Enhancement for Complex Claim Verification with Interactive Graph Representation. 5181-5197 - Yuxia Wu, Shujie Li, Yuan Fang, Chuan Shi:
Exploring the Potential of Large Language Models for Heterophilic Graphs. 5198-5211 - Qitan Lv, Tianyu Liu, Hong Wang:
Exploiting Edited Large Language Models as General Scientific Optimizers. 5212-5237 - Jingwei Ni, Tobias Schimanski, Meihong Lin, Mrinmaya Sachan, Elliott Ash, Markus Leippold:
DIRAS: Efficient LLM Annotation of Document Relevance for Retrieval Augmented Generation. 5238-5258 - Hao Li, Chenghao Yang, An Zhang, Yang Deng, Xiang Wang, Tat-Seng Chua:
Hello Again! LLM-powered Personalized Agent for Long-term Dialogue. 5259-5276 - Sandra Sandoval, Christabel Acquaye, Kwesi A. Cobbina, Mohammad Nayeem Teli, Hal Daumé III:
My LLM might Mimic AAE - But When Should It? 5277-5302 - Samuel Cahyawijaya, Delong Chen, Yejin Bang, Leila Khalatbari, Bryan Wilie, Ziwei Ji, Etsuko Ishii, Pascale Fung:
High-Dimension Human Value Representation in Large Language Models. 5303-5330 - Nihed Bendahman, Karen Pinel-Sauvagnat, Gilles Hubert, Mokhtar Boumedyen Billami:
Not all Hallucinations are Good to Throw Away When it Comes to Legal Abstractive Summarization. 5331-5344 - Jaeyoung Kim, Dohyeon Lee, Seung-won Hwang:
Query-focused Referentiability Learning for Zero-shot Retrieval. 5345-5358 - Aviya Maimon:
A Novel Computational Modeling Foundation for Automatic Coherence Assessment. 5359-5377 - Hakaze Cho, Yoshihiro Sakai, Mariko Kato, Kenshiro Tanaka, Akira Ishii, Naoya Inoue:
Token-based Decision Criteria Are Suboptimal in In-context Learning. 5378-5401 - Amey Hengle, Aswini Kumar Padhi, Anil Bandhakavi, Tanmoy Chakraborty:
CSEval: Towards Automated, Multi-Dimensional, and Reference-Free Counterspeech Evaluation using Auto-Calibrated LLMs. 5402-5419 - Menglong Cui, Pengzhi Gao, Wei Liu, Jian Luan, Bin Wang:
Multilingual Machine Translation with Open Large Language Models at Practical Scale: An Empirical Study. 5420-5443 - Bang An, Shiyue Zhang, Mark Dredze:
RAG LLMs are Not Safer: A Safety Analysis of Retrieval-Augmented Generation for Large Language Models. 5444-5474 - Rui Xing, Timothy Baldwin, Jey Han Lau:
Evaluating Evidence Attribution in Generated Fact Checking Explanations. 5475-5496 - Taewhoo Lee, Chanwoong Yoon, Kyochul Jang, Donghyeon Lee, Minju Song, Hyunjae Kim, Jaewoo Kang:
ETHIC: Evaluating Large Language Models on Long-Context Tasks with High Information Coverage. 5497-5512 - Georgios Chochlakis, Alexandros Potamianos, Kristina Lerman, Shrikanth Narayanan:
Aggregation Artifacts in Subjective Tasks Collapse Large Language Models' Posteriors. 5513-5528 - Yasser Ashraf, Yuxia Wang, Bin Gu, Preslav Nakov, Timothy Baldwin:
Arabic Dataset for LLM Safeguard Evaluation. 5529-5546 - Siqi Ouyang, Oleksii Hrinchuk, Zhehuai Chen, Vitaly Lavrukhin, Jagadeesh Balam, Lei Li, Boris Ginsburg:
Anticipating Future with Large Language Model for Simultaneous Machine Translation. 5547-5557 - Jinhao Duan, Xinyu Zhao, Zhuoxuan Zhang, Eunhye Grace Ko, Lily Boddy, Chenan Wang, Tianhao Li, Alexander Rasgon, Junyuan Hong, Min Kyung Lee, Chenxi Yuan, Qi Long, Ying Ding, Tianlong Chen, Kaidi Xu:
GuideLLM: Exploring LLM-Guided Conversation with Applications in Autobiography Interviewing. 5558-5588 - Hanxu Hu, Simon Yu, Pinzhen Chen, Edoardo M. Ponti:
Fine-Tuning Large Language Models with Sequential Instructions. 5589-5610 - Mayank Kothyari, Sunita Sarawagi, Soumen Chakrabarti, Gaurav Arora, Srujana Merugu:
Diverse In-Context Example Selection After Decomposing Programs and Aligned Utterances Improves Semantic Parsing. 5611-5629 - Rujing Yao, Yang Wu, Chenghao Wang, Jingwei Xiong, Fang Wang, Xiaozhong Liu:
Elevating Legal LLM Responses: Harnessing Trainable Logical Structures and Semantic Knowledge with Legal Reasoning. 5630-5642 - Yaya Sy, Christophe Cerisara, Irina Illina:
Efficient One-shot Compression via Low-Rank Local Feature Distillation. 5643-5661 - Damien de Mijolla, Hannan Saddiq, Kim Moore:
Waste Not, Want Not; Recycled Gumbel Noise Improves Consistency in Natural Language Generation. 5662-5686 - Kaustubh D. Dhole, Kai Shu, Eugene Agichtein:
ConQRet: A New Benchmark for Fine-Grained Automatic Evaluation of Retrieval Augmented Computational Argumentation. 5687-5713 - Daniil Moskovskiy, Nikita Sushko, Sergey Pletenev, Elena Tutubalina, Alexander Panchenko:
SynthDetoxM: Modern LLMs are Few-Shot Parallel Detoxification Data Annotators. 5714-5733 - Enfa Fane, Md Nayem Uddin, Oghenevovwe Ikumariegbe, Daniyal Kashif, Eduardo Blanco, Steven R. Corman:
BEMEAE: Moving Beyond Exact Span Match for Event Argument Extraction. 5734-5749 - Abdul Waheed, Karima Kadaoui, Bhiksha Raj, Muhammad Abdul-Mageed:
uDistil-Whisper: Label-Free Data Filtering for Knowledge Distillation in Low-Data Regimes. 5750-5767 - Chung-En Sun, Xiaodong Liu, Weiwei Yang, Tsui-Wei Weng, Hao Cheng, Aidan San, Michel Galley, Jianfeng Gao:
Iterative Self-Tuning LLMs for Enhanced Jailbreaking Capabilities. 5768-5786 - Yifan Peng, Krishna C. Puvvada, Zhehuai Chen, Piotr Zelasko, He Huang, Kunal Dhawan, Ke Hu, Shinji Watanabe, Jagadeesh Balam, Boris Ginsburg:
VoiceTextBlender: Augmenting Large Language Models with Speech Capabilities via Single-Stage Joint Speech-Text Supervised Fine-Tuning. 5787-5802 - Kaitlyn Zhou, Haishan Gao, Sarah Li Chen, Dan Edelstein, Dan Jurafsky, Chen Shani:
Rethinking Word Similarity: Semantic Similarity through Classification Confusion. 5803-5817 - Venktesh V, Mandeep Rathee, Avishek Anand:
SUNAR: Semantic Uncertainty based Neighborhood Aware Retrieval for Complex QA. 5818-5835 - Kaige Xie, Philippe Laban, Prafulla Kumar Choubey, Caiming Xiong, Chien-Sheng Wu:
Do RAG Systems Cover What Matters? Evaluating and Optimizing Responses with Sub-Question Coverage. 5836-5849 - David Huang, Avidan Shah, Alexandre Araujo, David A. Wagner, Chawin Sitawarin:
Stronger Universal and Transferable Attacks by Suppressing Refusals. 5850-5876 - Seungone Kim, Juyoung Suk, Ji Yong Cho, Shayne Longpre, Chaeeun Kim, Dongkeun Yoon, Guijin Son, Yejin Choi, Sheikh Shafayat, Jinheon Baek, Sue Hyun Park, Hyeonbin Hwang, Jinkyung Jo, Hyowon Cho, Haebin Shin, Seongyun Lee, Hanseok Oh, Noah Lee, Namgyu Ho, Se June Joo, Miyoung Ko, Yoonjoo Lee, Hyungjoo Chae, Jamin Shin, Joel Jang, Seonghyeon Ye, Bill Yuchen Lin, Sean Welleck, Graham Neubig, Moontae Lee, Kyungjae Lee, Minjoon Seo:
The BiGGen Bench: A Principled Benchmark for Fine-grained Evaluation of Language Models with Language Models. 5877-5919 - Jiao Sun, Deqing Fu, Yushi Hu, Su Wang, Royi Rassin, Da-Cheng Juan, Dana Alon, Charles Herrmann, Sjoerd van Steenkiste, Ranjay Krishna, Cyrus Rashtchian:
DreamSync: Aligning Text-to-Image Generation with Image Understanding Feedback. 5920-5945 - Phillip Howard, Kathleen C. Fraser, Anahita Bhiwandiwalla, Svetlana Kiritchenko:
Uncovering Bias in Large Vision-Language Models at Scale with Counterfactuals. 5946-5991 - Shaona Ghosh, Prasoon Varshney, Makesh Narsimhan Sreedhar, Aishwarya Padmakumar, Traian Rebedea, Jibin Rajan Varghese, Christopher Parisien:
AEGIS2.0: A Diverse AI Safety Dataset and Risks Taxonomy for Alignment of LLM Guardrails. 5992-6026 - Rebii Jamal, Mounir Ourekouch, Mohammed Erradi:
UOREX: Towards Uncertainty-Aware Open Relation Extraction. 6027-6040 - Yuchen Zhuang, Jingfeng Yang, Haoming Jiang, Xin Liu, Kewei Cheng, Sanket Lokegaonkar, Yifan Gao, Qing Ping, Tianyi Liu, Binxuan Huang, Zheng Li, Zhengyang Wang, Pei Chen, Ruijie Wang, Rongzhi Zhang, Nasser Zalmout, Priyanka Nigam, Bing Yin, Chao Zhang:
Hephaestus: Improving Fundamental Agent Capabilities of Large Language Models through Continual Pre-Training. 6041-6068 - Shengmin Piao, Sanghyun Park:
TinyThinker: Distilling Reasoning through Coarse-to-Fine Knowledge Internalization with Self-Reflection. 6069-6087 - Manan Suri, Puneet Mathur, Franck Dernoncourt, Kanika Goswami, Ryan A. Rossi, Dinesh Manocha:
VisDoM: Multi-Document QA with Visually Rich Elements Using Multimodal Retrieval-Augmented Generation. 6088-6109 - Ming Cheng, Jiaying Gong, Chenhan Yuan, William A. Ingram, Edward A. Fox, Hoda Eldardiry:
VTechAGP: An Academic-to-General-Audience Text Paraphrase Dataset and Benchmark Models. 6110-6130 - Jannik Brinkmann, Chris Wendler, Christian Bartelt, Aaron Mueller:
Large Language Models Share Representations of Latent Grammatical Concepts Across Typologically Diverse Languages. 6131-6150 - Weisi Liu, Guangzeng Han, Xiaolei Huang:
Examining and Adapting Time for Multilingual Classification via Mixture of Temporal Experts. 6151-6166 - Garrett Tanzer:
FLEURS-ASL: Including American Sign Language in Massively Multilingual Multitask Evaluation. 6167-6191 - Siyu Yuan, Kaitao Song, Jiangjie Chen, Xu Tan, Dongsheng Li, Deqing Yang:
EvoAgent: Towards Automatic Multi-Agent Generation via Evolutionary Algorithms. 6192-6217 - Qiming Feng, Qiujie Xie, Xiaolong Wang, Qingqiu Li, Yuejie Zhang, Rui Feng, Tao Zhang, Shang Gao:
EmoCharacter: Evaluating the Emotional Fidelity of Role-Playing Agents in Dialogues. 6218-6240 - Dominic Sobhani, Ruiqi Zhong, Edison Marrese-Taylor, Keisuke Sakaguchi, Yutaka Matsuo:
Language Models can Categorize System Inputs for Performance Analysis. 6241-6257 - Xin Guo, Haotian Xia, Zhaowei Liu, Hanyang Cao, Zhi Yang, Zhiqiang Liu, Sizhe Wang, Jinyi Niu, Chuqi Wang, Yanhui Wang, Xiaolong Liang, Xiaoming Huang, Bing Zhu, Zhongyu Wei, Yun Chen, Weining Shen, Liwen Zhang:
FinEval: A Chinese Financial Domain Knowledge Evaluation Benchmark for Large Language Models. 6258-6292 - Fu Zhang, Xinlong Jin, Jingwei Cheng, Hongsen Yu, Huangming Xu:
Rethinking the Role of LLMs for Document-level Relation Extraction: a Refiner with Task Distribution and Probability Fusion. 6293-6312 - Qisheng Hu, Quanyu Long, Wenya Wang:
Decomposition Dilemmas: Does Claim Decomposition Boost or Burden Fact-Checking Performance? 6313-6336 - Huanqian Wang, Yang Yue, Rui Lu, Jingxin Shi, Andrew Zhao, Shenzhi Wang, Shiji Song, Gao Huang:
Model Surgery: Modulating LLM's Behavior Via Simple Parameter Editing. 6337-6357 - Yongce Li, Chung-En Sun, Tsui-Wei Weng:
Effective Skill Unlearning through Intervention and Abstention. 6358-6371 - Lei Wang, Jianxun Lian, Yi Huang, Yanqi Dai, Haoxuan Li, Xu Chen, Xing Xie, Ji-Rong Wen:
CharacterBox: Evaluating the Role-Playing Capabilities of LLMs in Text-Based Virtual Worlds. 6372-6391 - Xiujie Song, Mengyue Wu, Kenny Q. Zhu, Chunhao Zhang, Yanyi Chen:
A Cognitive Evaluation Benchmark of Image Reasoning and Description for Large Vision-Language Models. 6392-6409 - Dahyun Jung, Jaehyung Seo, Jaewook Lee, Chanjun Park, Heuiseok Lim:
CoME: An Unlearning-based Approach to Conflict-free Model Editing. 6410-6422 - Tarek Naous, Wei Xu:
On The Origin of Cultural Biases in Language Models: From Pre-training Data to Linguistic Phenomena. 6423-6443 - Mounica Maddela, Fernando Alva-Manchego:
Adapting Sentence-level Automatic Metrics for Document-level Simplification Evaluation. 6444-6459 - Minghao Yan, Saurabh Agarwal, Shivaram Venkataraman:
Decoding Speculative Decoding. 6460-6473 - Siddharth Khincha, Tushar Kataria, Ankita Anand, Dan Roth, Vivek Gupta:
Leveraging LLM For Synchronizing Information Across Multilingual Tables. 6474-6492 - Saptarshi Ghosh, Tianyu Jiang:
ConMeC: A Dataset for Metonymy Resolution with Common Nouns. 6493-6509 - Hongru Wang, Boyang Xue, Baohang Zhou, Tianhua Zhang, Cunxiang Wang, Huimin Wang, Guanhua Chen, Kam-Fai Wong:
Self-DC: When to Reason and When to Act? Self Divide-and-Conquer for Compositional Unknown Questions. 6510-6525 - Abhilash Reddy Shankarampeta, Harsh Mahajan, Tushar Kataria, Dan Roth, Vivek Gupta:
TRANSIENTTABLES: Evaluating LLMs' Reasoning on Temporally Evolving Semi-structured Tables. 6526-6544 - Minbeom Kim, Hwanhee Lee, Joonsuk Park, Hwaran Lee, Kyomin Jung:
AdvisorQA: Towards Helpful and Harmless Advice-seeking Question Answering with Collective Intelligence. 6545-6565 - Dohyeon Lee, Jongyoon Kim, Jihyuk Kim, Seung-won Hwang, Joonsuk Park:
tRAG: Term-level Retrieval-Augmented Generation for Domain-Adaptive Retrieval. 6566-6578 - Gongyao Jiang, Xinran Shi, Qiong Luo:
JRE-L: Journalist, Reader, and Editor LLMs in the Loop for Science Journalism for the General Audience. 6579-6594 - Ziche Liu, Rui Ke, Yajiao Liu, Feng Jiang, Haizhou Li:
Take the essence and discard the dross: A Rethinking on Data Selection for Fine-Tuning Large Language Models. 6595-6611 - Zijian Li, Qingyan Guo, Jiawei Shao, Lei Song, Jiang Bian, Jun Zhang, Rui Wang:
Graph Neural Network Enhanced Retrieval for Question Answering of Large Language Models. 6612-6633 - Nathan Brown, Vukosi Marivate:
Pula: Training Large Language Models for Setswana. 6634-6656 - Eri Onami, Taiki Miyanishi, Koki Maeda, Shuhei Kurita:
LegalViz: Legal Text Visualization by Text To Diagram Generation. 6657-6676 - Saeed Ahmadnia, Arash Yousefi Jordehi, Mahsa Hosseini Khasheh Heyran, Seyed Abolghasem Mirroshandel, Owen Rambow, Cornelia Caragea:
Active Few-Shot Learning for Text Classification. 6677-6694 - Cong-Duy T. Nguyen, Xiaobao Wu, Thong Thanh Nguyen, Shuai Zhao, Khoi M. Le, Viet-Anh Nguyen, Yichao Feng, Anh Tuan Luu:
Enhancing Multimodal Entity Linking with Jaccard Distance-based Conditional Contrastive Learning and Contextual Visual Augmentation. 6695-6708 - Jinheon Baek, Sujay Kumar Jauhar, Silviu Cucerzan, Sung Ju Hwang:
ResearchAgent: Iterative Research Idea Generation over Scientific Literature with Large Language Models. 6709-6738 - Zixiao Zhu, Zijian Feng, Hanzhang Zhou, Junlang Qian, Kezhi Mao:
Logit Separability-Driven Samples and Multiple Class-Related Words Selection for Advancing In-Context Learning. 6739-6759 - Sibo Ma, Julian Nyarko:
Identifying Emerging Concepts in Large Corpora. 6760-6778 - Mukur Gupta, Noopur Bhatt, Suman Jana:
CodeSCM: Causal Analysis for Multi-Modal Code Generation. 6779-6793 - Thom Lake, Eunsol Choi, Greg Durrett:
From Distributional to Overton Pluralism: Investigating Large Language Model Alignment. 6794-6814 - Mohan Zhang, Pingzhi Li, Jie Peng, Mufan Qiu, Tianlong Chen:
Advancing MoE Efficiency: A Collaboration-Constrained Routing (C2R) Strategy for Better Expert Parallelism Design. 6815-6825 - Sachit Kuhar, Wasi Uddin Ahmad, Zijian Wang, Nihal Jain, Haifeng Qian, Baishakhi Ray, Murali Krishna Ramanathan, Xiaofei Ma, Anoop Deoras:
LibEvolutionEval: A Benchmark and Study for Version-Specific Code Generation. 6826-6840 - Yixiao He, Haifeng Sun, Pengfei Ren, Jingyu Wang, Huazheng Wang, Qi Qi, Zirui Zhuang, Jing Wang:
Evaluating and Mitigating Object Hallucination in Large Vision-Language Models: Can They Still See Removed Objects? 6841-6858 - Shaoyang Xu, Yongqi Leng, Linhao Yu, Deyi Xiong:
Self-Pluralising Culture Alignment for Large Language Models. 6859-6877 - Jeonghun Cho, Gary Lee:
K-COMP: Retrieval-Augmented Medical Domain Question Answering With Knowledge-Injected Compressor. 6878-6901 - Sami Baral, Li Lucy, Ryan Knight, Alice Ng, Luca Soldaini, Neil T. Heffernan, Kyle Lo:
DrawEduMath: Evaluating Vision Language Models with Expert-Annotated Students' Hand-Drawn Math Images. 6902-6920 - Kinshuk Vasisht, Navreet Kaur, Danish Pruthi:
Knowledge Graph Guided Evaluation of Abstention Techniques. 6921-6939 - Keqi Deng, Guangzhi Sun, Philip C. Woodland:
Wav2Prompt: End-to-End Speech Prompt Learning and Task-based Fine-tuning for Text-based LLMs. 6940-6956 - Ang Li, Yiquan Wu, Ming Cai, Adam Jatowt, Xiang Zhou, Weiming Lu, Changlong Sun, Fei Wu, Kun Kuang:
Legal Judgment Prediction based on Knowledge-enhanced Multi-Task and Multi-Label Text Classification. 6957-6970 - Keyeun Lee, SeoHyeong Kim, Seolhee Lee, Jinsu Eun, Yena Ko, Hayeon Jeon, Esther Hehsun Kim, Seonghye Cho, Soeun Yang, Eun-mee Kim, Hajin Lim:
SPeCtrum: A Grounded Framework for Multidimensional Identity Representation in LLM-Based Agent. 6971-6991 - Ekaterina Artemova, Jason Samuel Lucas, Saranya Venkatraman, Jooyoung Lee, Sergei Tilga, Adaku Uchendu, Vladislav Mikhailov:
Beemo: Benchmark of Expert-edited Machine-generated Outputs. 6992-7018 - Daniel Guzman-Olivares, Lara Quijano Sánchez, Federico Liberatore:
SANDWiCH: Semantical Analysis of Neighbours for Disambiguating Words in Context ad Hoc. 7019-7033 - Simran Khanuja, Vivek Iyer, Xiaoyu He, Graham Neubig:
Towards Automatic Evaluation for Image Transcreation. 7034-7047 - Xijia Tao, Shuai Zhong, Lei Li, Qi Liu, Lingpeng Kong:
ImgTrojan: Jailbreaking Vision-Language Models with ONE Image. 7048-7063 - Jinhao Jiang, Jiayi Chen, Junyi Li, Ruiyang Ren, Shijie Wang, Xin Zhao, Yang Song, Tao Zhang:
RAG-Star: Enhancing Deliberative Reasoning with Retrieval Augmented Verification and Refinement. 7064-7074 - Ang Li, Jingqian Zhao, Bin Liang, Lin Gui, Hui Wang, Xi Zeng, Xingwei Liang, Kam-Fai Wong, Ruifeng Xu:
Mitigating Biases of Large Language Models in Stance Detection with Counterfactual Augmented Calibration. 7075-7092 - Junlang Qian, Zixiao Zhu, Hanzhang Zhou, Zijian Feng, Zepeng Zhai, Kezhi Mao:
Beyond the Next Token: Towards Prompt-Robust Zero-Shot Classification via Efficient Multi-Token Prediction. 7093-7115 - Donglei Yu, Xiaomian Kang, Yuchen Liu, Feifei Zhai, Nanchang Cheng, Yu Zhou, Chengqing Zong:
Investigating Hallucinations in Simultaneous Machine Translation: Knowledge Distillation Solution and Components Analysis. 7116-7131 - Wen Yang, Minpeng Liao, Kai Fan:
Markov Chain of Thought for Efficient Mathematical Reasoning. 7132-7157 - Varun Gumma, Pranjal A. Chitale, Kalika Bali:
Towards Inducing Long-Context Abilities in Multilingual Neural Machine Translation Models. 7158-7170 - Koji Inoue, Divesh Lala, Gabriel Skantze, Tatsuya Kawahara:
Yeah, Un, Oh: Continuous and Real-time Backchannel Prediction with Fine-tuning of Voice Activity Projection. 7171-7181 - Zongqian Li, Yinhong Liu, Yixuan Su, Nigel Collier:
Prompt Compression for Large Language Models: A Survey. 7182-7195 - Joo Bon Maeng, Seongmin Lee, Seokin Seo, Kee-Eung Kim:
Goal-Conditioned DPO: Prioritizing Safety in Misaligned Instructions. 7196-7211 - Yadong Zhang, Shaoguang Mao, Tao Ge, Xun Wang, Yan Xia, Man Lan, Furu Wei:
K-Level Reasoning: Establishing Higher Order Beliefs in Large Language Models for Strategic Reasoning. 7212-7234 - Magdalena Wysocka, Danilo S. Carvalho, Oskar Wysocki, Marco Valentino, André Freitas:
SylloBio-NLI: Evaluating Large Language Models on Biomedical Syllogistic Reasoning. 7235-7258 - Noam Dahan, Gabriel Stanovsky:
The State and Fate of Summarization Datasets: A Survey. 7259-7278 - Muhammad Arslan Manzoor, Ruihong Zeng, Dilshod Azizov, Preslav Nakov, Shangsong Liang:
MGM: Global Understanding of Audience Overlap Graphs for Predicting the Factuality and the Bias of News Media. 7279-7295 - Luca Mouchel, Debjit Paul, Shaobo Cui, Robert West, Antoine Bosselut, Boi Faltings:
A Logical Fallacy-Informed Framework for Argument Generation. 7296-7314 - Di Zhang, Jianbo Wu, Jingdi Lei, Tong Che, Jiatong Li, Tong Xie, Xiaoshui Huang, Shufei Zhang, Marco Pavone, Yuqiang Li, Wanli Ouyang, Dongzhan Zhou:
LLaMA-Berry: Pairwise Optimization for Olympiad-level Mathematical Reasoning via O1-like Monte Carlo Tree Search. 7315-7337 - Haebin Shin, Lei Ji, Yeyun Gong, Sungdong Kim, Eunbi Choi, Minjoon Seo:
Generative Prompt Internalization. 7338-7363 - Milind Agarwal, Joshua Otten, Antonios Anastasopoulos:
Script-Agnosticism and its Impact on Language Identification for Dravidian Languages. 7364-7384 - Renxi Wang, Xudong Han, Yixuan Zhang, Timothy Baldwin, Haonan Li:
NAT: Enhancing Agent Tuning with Negative Samples. 7385-7398 - Zirui Song, Guangxian Ouyang, Meng Fang, Hongbin Na, Zijing Shi, Zhenhao Chen, Yujie Fu, Zeyu Zhang, Shiyu Jiang, Miao Fang, Ling Chen, Xiuying Chen:
Hazards in Daily Life? Enabling Robots to Proactively Detect and Resolve Anomalies. 7399-7415 - Yusuke Ide, Yuto Nishida, Justin Vasselli, Miyu Oba, Yusuke Sakai, Hidetaka Kamigaito, Taro Watanabe:
How to Make the Most of LLMs' Grammatical Knowledge for Acceptability Judgments. 7416-7432 - Chenghao Zhu, Nuo Chen, Yufei Gao, Yunyi Zhang, Prayag Tiwari, Benyou Wang:
Is Your LLM Outdated? A Deep Look at Temporal Generalization. 7433-7457 - Julia Romberg, Maximilian Maurer, Henning Wachsmuth, Gabriella Lapesa:
Towards a Perspectivist Turn in Argument Quality Assessment. 7458-7485 - Haoxin Liu, Chenghao Liu, B. Aditya Prakash:
A Picture is Worth A Thousand Numbers: Enabling LLMs Reason about Time Series via Visualization. 7486-7518 - Jooyoung Lee, Toshini Agrawal, Adaku Uchendu, Thai Le, Jinghui Chen, Dongwon Lee:
PlagBench: Exploring the Duality of Large Language Models in Plagiarism Generation and Detection. 7519-7534 - Haohao Zhu, Xiaokun Zhang, Zeyuan Zeng, Junyu Lu, Zewen Bai, Liang Yang, Hongfei Lin:
Commonality and Individuality! Integrating Humor Commonality with Speaker Individuality for Humor Recognition. 7535-7547 - Yanan Ma, Chenghao Xiao, Chenhan Yuan, Sabine N. van der Veer, Lamiece Hassan, Chenghua Lin, Goran Nenadic:
CAST: Corpus-Aware Self-similarity Enhanced Topic modelling. 7548-7561 - Abdulfattah Safa, Gözde Gül Sahin:
A Zero-Shot Open-Vocabulary Pipeline for Dialogue Understanding. 7562-7579 - Somnath Banerjee, Sayan Layek, Hari Shrawgi, Rajarshi Mandal, Avik Halder, Shanu Kumar, Sagnik Basu, Parag Agrawal, Rima Hazra, Animesh Mukherjee:
Navigating the Cultural Kaleidoscope: A Hitchhiker's Guide to Sensitivity in Large Language Models. 7580-7617 - Michael Toker, Ido Galil, Hadas Orgad, Rinon Gal, Yoad Tewel, Gal Chechik, Yonatan Belinkov:
Padding Tone: A Mechanistic Analysis of Padding Tokens in T2I Models. 7618-7632 - Stephanie Schoch, Yangfeng Ji:
In-Context Learning (and Unlearning) of Length Biases. 7633-7671 - Peinan Zhang, Yusuke Sakai, Masato Mita, Hiroki Ouchi, Taro Watanabe:
AdTEC: A Unified Benchmark for Evaluating Text Quality in Search Engine Advertising. 7672-7691 - Heejin Kook, Junyoung Kim, Seongmin Park, Jongwuk Lee:
Empowering Retrieval-based Conversational Recommendation with Contrasting User Preferences. 7692-7707 - Jung Hyun Lee, Jeonghoon Kim, June Yong Yang, Se Jung Kwon, Eunho Yang, Kang Min Yoo, Dongsoo Lee:
LRQ: Optimizing Post-Training Quantization for Large Language Models by Learning Low-Rank Weight-Scaling Matrices. 7708-7743 - Gaurav Arora, Srujana Merugu, Shreya Jain, Vaibhav Saxena:
Towards Robust Knowledge Representations in Multilingual LLMs for Equivalence and Inheritance based Consistent Reasoning. 7744-7762 - Eftekhar Hossain, Sanjeev Kumar Sinha, Naman Bansal, R. Alexander Knipper, Souvika Sarkar, John Salvador, Yash Mahajan, Sri Guttikonda, Mousumi Akter, Md. Mahadi Hassan, Matthew Freestone, Matthew C. Williams Jr., Dongji Feng, Santu Karmaker:
LLMs as Meta-Reviewers' Assistants: A Case Study. 7763-7803 - Shuheng Liu, Michael Best:
A Survey of NLP Progress in Sino-Tibetan Low-Resource Languages. 7804-7825 - Yihan Zhang, Jie Fu, Rongrong Ji, Jie Chen:
Enhancing Language Model Hypernetworks with Restart: A Study on Optimization. 7826-7838 - Zachary William Hopton, Yves Scherrer, Tanja Samardzic:
Functional Lexicon in Subword Tokenization. 7839-7853 - Haonan Wang, Minbin Huang, Runhui Huang, Lanqing Hong, Hang Xu, Tianyang Hu, Xiaodan Liang, Zhenguo Li, Hong Cheng, Kenji Kawaguchi:
Getting More Juice Out of Your Data: Hard Pair Refinement Enhances Visual-Language Models Without Extra Data. 7854-7873 - Erik Miehling, Michael Desmond, Karthikeyan Natesan Ramamurthy, Elizabeth M. Daly, Kush R. Varshney, Eitan Farchi, Pierre Dognin, Jesus Rios, Djallel Bouneffouf, Miao Liu, Prasanna Sattigeri:
Evaluating the Prompt Steerability of Large Language Models. 7874-7900 - Kento Watanabe, Masataka Goto:
A Data-Driven Method for Analyzing and Quantifying Lyrics-Dance Motion Relationships. 7901-7916 - Malvina Nikandrou, Georgios Pantazopoulos, Nikolas Vitsakis, Ioannis Konstas, Alessandro Suglia:
CROPE: Evaluating In-Context Adaptation of Vision and Language Models to Culture-Specific Concepts. 7917-7936 - Jihyun Lee, Yejin Jeon, Seungyeon Seo, Gary Lee:
PicPersona-TOD: A Dataset for Personalizing Utterance Style in Task-Oriented Dialogue with Image Persona. 7937-7958 - Kexun Zhang, Shang Zhou, Danqing Wang, William Yang Wang, Lei Li:
Scaling LLM Inference Efficiently with Optimized Sample Compute Allocation. 7959-7973 - Sara Rezaeimanesh, Faezeh Hosseini, Yadollah Yaghoobzadeh:
Large Language Models for Persian-English Idiom Translation. 7974-7985 - Kourosh T. Baghaei, Dieter Pfoser, Antonios Anastasopoulos:
Follow the Beaten Path: The Role of Route Patterns on Vision-Language Navigation Agents Generalization Abilities. 7986-8005 - Ananjan Nandi, Christopher D. Manning, Shikhar Murty:
Sneaking Syntax into Transformer Language Models with Tree Regularization. 8006-8024 - Sougata Saha, Saurabh Kumar Pandey, Monojit Choudhury:
Meta-Cultural Competence: Climbing the Right Hill of Cultural Awareness. 8025-8042 - Sougata Saha, Saurabh Kumar Pandey, Harshit Gupta, Monojit Choudhury:
Reading between the Lines: Can LLMs Identify Cross-Cultural Communication Gaps? 8043-8067 - Zifan He, Yingqi Cao, Zongyue Qin, Neha Prakriya, Yizhou Sun, Jason Cong:
HMT: Hierarchical Memory Transformer for Efficient Long Context Language Processing. 8068-8089 - Nikhil Sharma, Kenton Murray, Ziang Xiao:
Faux Polyglot: A Study on Information Disparity in Multilingual Large Language Models. 8090-8107 - Elias Stengel-Eskin, Peter Hase, Mohit Bansal:
Teaching Models to Balance Resisting and Accepting Persuasion. 8108-8122 - MohammadHossein Rezaei, Eduardo Blanco:
Making Language Models Robust Against Negation. 8123-8142 - Shijia Liu, David A. Smith:
Through the Lens of History: Methods for Analyzing Temporal Variation in Content and Framing of State-run Chinese Newspapers. 8143-8172 - Michael-Andrei Panaitescu-Liess, Pankayaraj Pathmanathan, Yigitcan Kaya, Zora Che, Bang An, Sicheng Zhu, Aakriti Agrawal, Furong Huang:
PoisonedParrot: Subtle Data Poisoning Attacks to Elicit Copyright-Infringing Content from Large Language Models. 8173-8190 - Abhinav Java, Simra Shahid, Chirag Agarwal:
Towards Operationalizing Right to Data Protection. 8191-8205 - Aliakbar Nafar, Kristen Brent Venable, Parisa Kordjamshidi:
Learning vs Retrieval: The Role of In-Context Examples in Regression with Large Language Models. 8206-8229 - Jack Boylan, Chris Hokamp, Demian Gholipour Ghalandari:
GLiREL - Generalist Model for Zero-Shot Relation Extraction. 8230-8245 - Sachin Kumar, Chan Young Park, Yulia Tsvetkov, Noah A. Smith, Hannaneh Hajishirzi:
ComPO: Community Preferences for Language Model Personalization. 8246-8279 - Harsh Kohli, Sachin Kumar, Huan Sun:
GroundCocoa: A Benchmark for Evaluating Compositional & Conditional Reasoning in Language Models. 8280-8295 - Aly M. Kassem, Omar Mahmoud, Niloofar Mireshghallah, Hyunwoo Kim, Yulia Tsvetkov, Yejin Choi, Sherif Saad, Santu Rana:
ALPACA AGAINST VICUNA: Using LLMs to Uncover Memorization of LLMs. 8296-8321 - Pamela D. Rivière, Anne L. Beatty-Martínez, Sean Trott:
Evaluating Contextualized Representations of (Spanish) Ambiguous Words: A New Lexical Resource and Empirical Analysis. 8322-8338 - Junjie Wu, Mo Yu, Lemao Liu, Dit-Yan Yeung, Jie Zhou:
Understanding LLMs' Fluid Intelligence Deficiency: An Analysis of the ARC Task. 8339-8360 - Guangji Bai, Yijiang Li, Zilinghan Li, Liang Zhao, Kibaek Kim:
FedSpaLLM: Federated Pruning of Large Language Models. 8361-8373 - Zhihan Zhang, Shiyang Li, Zixuan Zhang, Xin Liu, Haoming Jiang, Xianfeng Tang, Yifan Gao, Zheng Li, Haodong Wang, Zhaoxuan Tan, Yichuan Li, Qingyu Yin, Bing Yin, Meng Jiang:
IHEval: Evaluating Language Models on Following the Instruction Hierarchy. 8374-8398 - Mardhiyah Sanni, Tassallah Abdullahi, Devendra Deepak Kayande, Emmanuel Ayodele, Naome A. Etori, Michael S. Mollel, Moshood Yekini, Chibuzor Okocha, Lukman E. Ismaila, Folafunmi Omofoye, Boluwatife Adeleye Adewale, Tobi Olatunji:
Afrispeech-Dialog: A Benchmark Dataset for Spontaneous English Conversations in Healthcare and Beyond. 8399-8417 - Philip Schroeder, Nathaniel Morgan, Hongyin Luo, James R. Glass:
THREAD: Thinking Deeper with Recursive Spawning. 8418-8442 - Hyunji Lee, Franck Dernoncourt, Trung Bui, Seunghyun Yoon:
CORG: Generating Answers from Complex, Interrelated Contexts. 8443-8460 - Kang-il Lee, Hyukhun Koh, Dongryeol Lee, Seunghyun Yoon, Minsung Kim, Kyomin Jung:
Generating Diverse Hypotheses for Inductive Reasoning. 8461-8474 - Tianyang Zhao, Kunwar Yashraj Singh, Srikar Appalaraju, Peng Tang, Ying Nian Wu, Li Erran Li:
On the Analysis and Distillation of Emergent Outlier Properties in Pre-trained Language Models. 8475-8507 - Hung-Ting Chen, Eunsol Choi:
Open-World Evaluation for Retrieving Diverse Perspectives. 8508-8528 - Ryoma Kumon, Hitomi Yanaka:
Analyzing the Inner Workings of Transformers in Compositional Generalization. 8529-8540 - Francesca Lucchetti, Zixuan Wu, Arjun Guha, Molly Q. Feldman, Carolyn Jane Anderson:
Substance Beats Style: Why Beginning Students Fail to Code with LLMs. 8541-8610 - Justin Chih-Yao Chen, Zifeng Wang, Hamid Palangi, Rujun Han, Sayna Ebrahimi, Long T. Le, Vincent Perot, Swaroop Mishra, Mohit Bansal, Chen-Yu Lee, Tomas Pfister:
Reverse Thinking Makes LLMs Stronger Reasoners. 8611-8630 - Kai Tzu-iunn Ong, Namyoung Kim, Minju Gwak, Hyungjoo Chae, Taeyoon Kwon, Yohan Jo, Seung-won Hwang, Dongha Lee, Jinyoung Yeo:
Towards Lifelong Dialogue Agents via Timeline-based Memory Management. 8631-8661 - Ajay Patel, Jiacheng Zhu, Justin Qiu, Zachary Horvitz, Marianna Apidianaki, Kathleen McKeown, Chris Callison-Burch:
StyleDistance: Stronger Content-Independent Style Embeddings with Synthetic Parallel Examples. 8662-8685 - Junliang He, Ziyue Fan, Shaohui Kuang, Li Xiaoqing, Kai Song, Yaqian Zhou, Xipeng Qiu:
FiNE: Filtering and Improving Noisy Data Elaborately with Large Language Models. 8686-8707 - Ziyue Fan, Junliang He, Li Xiaoqing, Shaohui Kuang, Kai Song, Yaqian Zhou, Xipeng Qiu:
CAMIEval: Enhancing NLG Evaluation through Multidimensional Comparative Instruction-Following Analysis. 8708-8733 - Pei Chen, Hongye Jin, Cheng-Che Lee, Rulin Shao, Jingfeng Yang, Mingyu Zhao, Zhaoyu Zhang, Qin Lu, Kaiwen Men, Ning Xie, Huasheng Li, Bing Yin, Han Li, Lingyun Wang:
LongLeader: A Comprehensive Leaderboard for Large Language Models in Long-context Scenarios. 8734-8750 - Wang Bill Zhu, Ishika Singh, Robin Jia, Jesse Thomason:
Language Models Can Infer Action Semantics for Symbolic Planners from Environment Feedback. 8751-8773 - Xianyang Zhan, Agam Goyal, Yilun Chen, Eshwar Chandrasekharan, Koustuv Saha:
SLM-Mod: Small Language Models Surpass LLMs at Content Moderation. 8774-8790 - David Wan, Jesse Vig, Mohit Bansal, Shafiq Joty:
On Positional Bias of Faithfulness for Long-form Summarization. 8791-8810 - Sizhe Wang, Yongqi Tong, Hengyuan Zhang, Dawei Li, Xin Zhang, Tianlong Chen:
BPO: Towards Balanced Preference Optimization between Knowledge Breadth and Depth in Alignment. 8811-8826 - Yijiang River Dong, Hongzhou Lin, Mikhail Belkin, Ramón Huerta, Ivan Vulic:
UNDIAL: Self-Distillation with Adjusted Logits for Robust Unlearning in Large Language Models. 8827-8840 - Nikhil Abhyankar, Vivek Gupta, Dan Roth, Chandan K. Reddy:
H-STAR: LLM-driven Hybrid SQL-Text Adaptive Reasoning on Tables. 8841-8863 - Yinghan Zhou, Juan Wen, Wanli Peng, Yiming Xue, Ziwei Zhang, Zhengxian Wu:
Kill two birds with one stone: generalized and robust AI-generated text detection via dynamic perturbations. 8864-8875 - Kanzhi Cheng, Yantao Li, Fangzhi Xu, Jianbing Zhang, Hao Zhou, Yang Liu:
Vision-Language Models Can Self-Improve Reasoning via Reflection. 8876-8892 - Deven Mahesh Mistry, Anooshka Bajaj, Yash Aggarwal, Sahaj Singh Maini, Zoran Tiganj:
Emergence of Episodic Memory in Transformers: Characterizing Changes in Temporal Structure of Attention Scores During Training. 8893-8911 - Xiangrong Zhu, Yuexiang Xie, Yi Liu, Yaliang Li, Wei Hu:
Knowledge Graph-Guided Retrieval Augmented Generation. 8912-8924 - Zeping Li, Xinlong Yang, Ziheng Gao, Ji Liu, Guanchen Li, Zhuang Liu, Dong Li, Jinzhang Peng, Lu Tian, Emad Barsoum:
Amphista: Bi-directional Multi-head Decoding for Accelerating LLM Inference. 8925-8938 - Sahana Ramnath, Kartik Pandey, Elizabeth Boschee, Xiang Ren:
CAVE: Controllable Authorship Verification Explanations. 8939-8961 - Dongryeol Lee, Yerin Hwang, Yongil Kim, Joonsuk Park, Kyomin Jung:
Are LLM-Judges Robust to Expressions of Uncertainty? Investigating the effect of Epistemic Markers on LLM-based Evaluation. 8962-8984 - Shuyang Yu, Runxue Bao, Parminder Bhatia, Taha A. Kass-Hout, Jiayu Zhou, Cao Xiao:
Dynamic Uncertainty Ranking: Enhancing Retrieval-Augmented In-Context Learning for Long-Tail Knowledge in LLMs. 8985-8997 - Sun Ao, Weilin Zhao, Xu Han, Cheng Yang, Xinrong Zhang, Zhiyuan Liu, Chuan Shi, Maosong Sun:
Seq1F1B: Efficient Sequence-Level Pipeline Parallelism for Large Language Model Training. 8998-9008 - Ivoline C. Ngong, Joseph P. Near, Niloofar Mireshghallah:
Differentially Private Learning Needs Better Model Initialization and Self-Distillation. 9009-9027 - Seokwon Song, Taehyun Lee, Jaewoo Ahn, Jae Hyuk Sung, Gunhee Kim:
Is a Peeled Apple Still Red? Evaluating LLMs' Ability for Conceptual Combination with Property Type. 9028-9048 - Atharva Naik, Marcus Alenius, Daniel Fried, Carolyn P. Rosé:
CRScore: Grounding Automated Evaluation of Code Review Comments in Code Claims and Smells. 9049-9076 - Fei Yuan, Chang Ma, Shuai Yuan, Qiushi Sun, Lei Li:
KS-Lottery: Finding Certified Lottery Tickets for Multilingual Transfer in Large Language Models. 9077-9090 - Jiayi Wu, Hengyi Cai, Lingyong Yan, Hao Sun, Xiang Li, Shuaiqiang Wang, Dawei Yin, Ming Gao:
PA-RAG: RAG Alignment via Multi-Perspective Preference Optimization. 9091-9112 - Baizhou Huang, Xiao Pu, Xiaojun Wan:
B⁴: A Black-Box Scrubbing Attack on LLM Watermarks. 9113-9126 - Dayang Li, Fanxiao Li, Bingbing Song, Li Tang, Wei Zhou:
IMRRF: Integrating Multi-Source Retrieval and Redundancy Filtering for LLM-based Fake News Detection. 9127-9142 - Sara Bourbour Hosseinbeigi, Fatemeh Taherinezhad, Heshaam Faili, Hamed Baghbani, Fatemeh Nadi, Mostafa Amiri:
Matina: A Large-Scale 73B Token Persian Text Corpus. 9143-9157 - Saurabh Kumar Pandey, Sachin Vashistha, Debrup Das, Somak Aditya, Monojit Choudhury:
SMAB: MAB based word Sensitivity Estimation Framework and its Applications in Adversarial Text Generation. 9158-9176 - Mahta Fetrat Qharabagh, Zahra Dehghanian, Hamid R. Rabiee:
ManaTTS Persian: a recipe for creating TTS datasets for lower resource languages. 9177-9206 - Viet Thanh Pham, Zhuang Li, Lizhen Qu, Gholamreza Haffari:
CultureInstruct: Curating Multi-Cultural Instructions at Scale. 9207-9228 - Lovish Madaan, David Esiobu, Pontus Stenetorp, Barbara Plank, Dieuwke Hupkes:
Lost in Inference: Rediscovering the Role of Natural Language Inference for Large Language Models. 9229-9242 - Wei He, Kai Han, Yehui Tang, Chengcheng Wang, Yujie Yang, Tianyu Guo, Yunhe Wang:
DenseSSM: State Space Models with Dense Hidden Connection for Efficient Large Language Models. 9243-9254 - Shengxiang Gao, Fang Nan, Yongbing Zhang, Yuxin Huang, Kaiwen Tan, Zhengtao Yu:
A Mixed-Language Multi-Document News Summarization Dataset and a Graphs-Based Extract-Generate Model. 9255-9265 - Jamie Hayes, Marika Swanberg, Harsh Chaudhari, Itay Yona, Ilia Shumailov, Milad Nasr, Christopher A. Choquette-Choo, Katherine Lee, A. Feder Cooper:
Measuring memorization in language models via probabilistic extraction. 9266-9291 - Hao Yang, Lizhen Qu, Ehsan Shareghi, Gholamreza Haffari:
Audio Is the Achilles' Heel: Red Teaming Audio Large Multimodal Models. 9292-9306 - Yunsheng Ni, Chuanjian Liu, Yehui Tang, Kai Han, Yunhe Wang:
EMS-SD: Efficient Multi-sample Speculative Decoding for Accelerating Large Language Models. 9307-9320 - Yuu Jinnai, Tetsuro Morimura, Kaito Ariu, Kenshi Abe:
Regularized Best-of-N Sampling with Minimum Bayes Risk Objective for Language Model Alignment. 9321-9347 - Srija Mukhopadhyay, Abhishek Rajgaria, Prerana Khatiwada, Manish Shrivastava, Dan Roth, Vivek Gupta:
MAPWise: Evaluating Vision-Language Models for Advanced Map Queries. 9348-9378 - Min Xiao, Junnan Zhu, Feifei Zhai, Chengqing Zong, Yu Zhou:
Pay More Attention to Images: Numerous Images-Oriented Multimodal Summarization. 9379-9392 - Yuting Zeng, Weizhe Huang, Lei Jiang, Tongxuan Liu, Xitai Jin, Chen Tianying Tiana, Jing Li, Xiaohua Xu:
S²-MAD: Breaking the Token Barrier to Enhance Multi-Agent Debate Efficiency. 9393-9408 - Bingzheng Gan, Yufan Zhao, Tianyi Zhang, Jing Huang, Yusu Li, Shu Xian Teo, Changwang Zhang, Wei Shi:
MASTER: A Multi-Agent System with LLM Specialized MCTS. 9409-9426 - Yu-Chung Hsiao, Fedir Zubach, Gilles Baechler, Srinivas Sunkara, Victor Carbune, Jason Lin, Maria Wang, Yun Zhu, Jindong Chen:
ScreenQA: Large-Scale Question-Answer Pairs Over Mobile App Screenshots. 9427-9452 - Uri Berger, Edoardo M. Ponti:
Cross-Lingual and Cross-Cultural Variation in Image Descriptions. 9453-9465 - Anran Hao, Jian Su, Shuo Sun, Teo Yong Sen:
Soft Syntactic Reinforcement for Neural Event Extraction. 9466-9478 - Hyegang Son, Yonglak Son, Changhoon Kim, Young Geun Kim:
Not All Adapters Matter: Selective Adapter Freezing for Memory-Efficient Fine-Tuning of Language Models. 9479-9496 - Jaechang Kim, Jinmin Goh, Inseok Hwang, Jaewoong Cho, Jungseul Ok:
Bridging the Gap between Expert and Language Models: Concept-guided Chess Commentary Generation and Evaluation. 9497-9516 - Joonghyuk Hahn, Hyeseon Ahn, Jungin Kim, Soohan Lim, Yo-Sub Han:
TCProF: Time-Complexity Prediction SSL Framework. 9517-9542 - Suchae Jeong, Inseong Choi, Youngsik Yun, Jihie Kim:
Culture-TRIP: Culturally-Aware Text-to-Image Generation with Iterative Prompt Refinement. 9543-9573 - Sehun Lee, Kang-wook Kim, Gunhee Kim:
Behavior-SD: Behaviorally Aware Spoken Dialogue Generation with Large Language Models. 9574-9593 - Chaoqun Liu, Wenxuan Zhang, Yiran Zhao, Anh Tuan Luu, Lidong Bing:
Is Translation All You Need? A Study on Solving Multilingual Tasks with Large Language Models. 9594-9614 - Deepanway Ghosal, Vernon Toh, Yew Ken Chia, Soujanya Poria:
AlgoPuzzleVQA: Diagnosing Multimodal Reasoning Challenges of Language Models with Algorithmic Multimodal Puzzles. 9615-9632 - Abhinav Joshi, Areeb Ahmad, Divyaksh Shukla, Ashutosh Modi:
Towards Quantifying Commonsense Reasoning with Mechanistic Insights. 9633-9660 - Anirudh Phukan, Divyansh, Harshit Kumar Morj, Vaishnavi, Apoorv Saxena, Koustava Goswami:
Beyond Logit Lens: Contextual Embeddings for Robust Hallucination Detection & Grounding in VLMs. 9661-9675 - Rishabh Maheshwary, Vikas Yadav, Hoang Nguyen, Khyati Mahajan, Sathwik Tejaswi Madhusudhan:
M2Lingual: Enhancing Multilingual, Multi-Turn Instruction Alignment in Large Language Models. 9676-9713 - Minh Duc Bui, Katharina von der Wense, Anne Lauscher:
Multi³Hate: Multimodal, Multilingual, and Multicultural Hate Speech Detection with Vision-Language Models. 9714-9731 - Max Glockner, Yufang Hou, Preslav Nakov, Iryna Gurevych:
Grounding Fallacies Misrepresenting Scientific Publications in Evidence. 9732-9767 - Paul Youssef, Zhixue Zhao, Christin Seifert, Jörg Schlötterer:
Has this Fact been Edited? Detecting Knowledge Edits in Language Models. 9768-9784 - Yiran Zhao, Wenxuan Zhang, Huiming Wang, Kenji Kawaguchi, Lidong Bing:
AdaMergeX: Cross-Lingual Transfer with Large Language Models via Adaptive Adapter Merging. 9785-9800 - Haoyuan Li, Yusen Zhang, Rui Zhang, Snigdha Chaturvedi:
Coverage-based Fairness in Multi-document Summarization. 9801-9819 - Dominik Glandorf, Peng Cui, Detmar Meurers, Mrinmaya Sachan:
Grammar Control in Dialogue Response Generation for Language Learning Chatbots. 9820-9839 - Li Zhou, Taelin Karidi, Wanlong Liu, Nicolas Garneau, Yong Cao, Wenyu Chen, Haizhou Li, Daniel Hershcovich:
Does Mapo Tofu Contain Coffee? Probing LLMs for Food-related Cultural Knowledge. 9840-9867 - Zhe Yang, Yi Huang, Yaqin Chen, Xiaoting Wu, Junlan Feng, Chao Deng:
Palette of Language Models: A Solver for Controlled Text Generation. 9868-9881 - David Wan, Justin Chih-Yao Chen, Elias Stengel-Eskin, Mohit Bansal:
MAMM-Refine: A Recipe for Improving Faithfulness in Generation with Multi-Agent Collaboration. 9882-9901 - Junqing He, Liang Zhu, Rui Wang, Xi Wang, Gholamreza Haffari, Jiaxing Zhang:
MADial-Bench: Towards Real-world Evaluation of Memory-Augmented Dialogue Generation. 9902-9921 - Albin Zehe, Elisabeth Fischer, Andreas Hotho:
Assessing the State of the Art in Scene Segmentation. 9922-9941 - Minyu Chen, Guoqiang Li, Ling-I Wu, Ruibang Liu:
DCE-LLM: Dead Code Elimination with Large Language Models. 9942-9955 - Liping Liu, Chunhong Zhang, Likang Wu, Chuang Zhao, Zheng Hu, Ming He, Jianping Fan:
Instruct-of-Reflection: Enhancing Large Language Models Iterative Reflection Capabilities via Dynamic-Meta Instruction. 9956-9978 - Sangwon Yu, Jongyoon Song, Bongkyu Hwang, Hoyoung Kang, Sooah Cho, Junhwa Choi, Seongho Joe, Taehee Lee, Youngjune Gwon, Sungroh Yoon:
Correcting Negative Bias in Large Language Models through Negative Attention Score Alignment. 9979-10001 - Xiongtao Zhou, Jie He, Lanyu Chen, Jingyu Li, Haojing Chen, Víctor Gutiérrez-Basulto, Jeff Z. Pan, Hanjie Chen:
MiCEval: Unveiling Multimodal Chain of Thought's Quality via Image Description and Reasoning Steps. 10002-10039 - Zhenpeng Su, Xing Wu, Zijia Lin, Yizhe Xiong, Minxuan Lv, Guangyuan Ma, Hui Chen, Songlin Hu, Guiguang Ding:
CartesianMoE: Boosting Knowledge Sharing among Experts via Cartesian Product Routing in Mixture-of-Experts. 10040-10055 - Amalie Brogaard Pauli, Isabelle Augenstein, Ira Assent:
Measuring and Benchmarking Large Language Models' Capabilities to Generate Persuasive Language. 10056-10075 - Sshubam Verma, Mohammed Safi Ur Rahman Khan, Vishwajeet Kumar, Rudra Murthy, Jaydeep Sen:
MILU: A Multi-task Indic Language Understanding Benchmark. 10076-10132 - Arihant Jain, Purav Aggarwal, Rishav Sahay, Chaosheng Dong, Anoop Saladi:
AutoEval-ToD: Automated Evaluation of Task-oriented Dialog Systems. 10133-10148 - Miles Williams, George Chrysostomou, Nikolaos Aletras:
Self-calibration for Language Model Quantization and Pruning. 10149-10167 - Tongxuan Liu, Wenjiang Xu, Weizhe Huang, Yuting Zeng, Jiaxing Wang, Xingyu Wang, Hailong Yang, Jing Li:
Logic-of-Thought: Injecting Logic into Contexts for Full Reasoning in Large Language Models. 10168-10185 - Tingyu Song, Guo Gan, Mingsheng Shang, Yilun Zhao:
IFIR: A Comprehensive Benchmark for Evaluating Instruction-Following in Expert-Domain Information Retrieval. 10186-10204 - Yudong Zhang, Ruobing Xie, Jiansheng Chen, Xingwu Sun, Zhanhui Kang, Yu Wang:
QAVA: Query-Agnostic Visual Attack to Large Vision-Language Models. 10205-10218 - Jie He, Yijun Yang, Wanqiu Long, Deyi Xiong, Víctor Gutiérrez-Basulto, Jeff Z. Pan:
Evaluating and Improving Graph to Text Generation with Large Language Models. 10219-10244 - Sriram Ranga, Rui Mao, Erik Cambria, Anupam Chattopadhyay:
The Plagiarism Singularity Conjecture. 10245-10255 - Sungjin Park, Xiao Liu, Yeyun Gong, Edward Choi:
Ensembling Large Language Models with Process Reward-Guided Tree Search for Better Complex Reasoning. 10256-10277 - Zhaoxin Yu, Xinglin Xiao, Wenji Mao:
One Unified Model for Diverse Tasks: Emotion Cause Analysis via Self-Promote Cognitive Structure Modeling. 10278-10293 - Ivan Vykopal, Simon Ostermann, Marián Simko:
Soft Language Prompts for Language Transfer. 10294-10313 - Sepideh Mamooler, Syrielle Montariol, Alexander Mathis, Antoine Bosselut:
PICLe: Pseudo-annotations for In-Context Learning in Low-Resource Named Entity Detection. 10314-10331 - Yoichi Ishibashi, Taro Yano, Masafumi Oyamada:
Can Large Language Models Invent Algorithms to Improve Themselves? 10332-10363 - Zheyuan Zhang, Daniel Zhang-Li, Jifan Yu, Linlu Gong, Jinchang Zhou, Zhanxin Hao, Jianxiao Jiang, Jie Cao, Huiqin Liu, Zhiyuan Liu, Lei Hou, Juanzi Li:
Simulating Classroom Education with LLM-Empowered Agents. 10364-10379 - Coleman Haley, Sharon Goldwater, Edoardo M. Ponti:
A Grounded Typology of Word Classes. 10380-10399 - Yixian Shen, Qi Bi, Jia-Hong Huang, Hongyi Zhu, Andy D. Pimentel, Anuj Pathania:
SSH: Sparse Spectrum Adaptation via Discrete Hartley Transformation. 10400-10415 - Sangyeop Kim, Sohhyung Park, Jaewon Jung, Jinseok Kim, Sungzoon Cho:
LLM-guided Plan and Retrieval: A Strategic Alignment for Interpretable User Satisfaction Estimation in Dialogue. 10416-10430 - Sumin An, Junyoung Sung, Wonpyo Park, Chanjun Park, Paul Hongsuck Seo:
LCIRC: A Recurrent Compression Approach for Efficient Long-form Context and Query Dependent Modeling in LLMs. 10431-10442 - Luke Bates, Peter Ebert Christensen, Preslav Nakov, Iryna Gurevych:
A Template Is All You Meme. 10443-10475 - Ján Cegin, Jakub Simko, Peter Brusilovsky:
LLMs vs Established Text Augmentation Techniques for Classification: When do the Benefits Outweight the Costs? 10476-10496 - Moran Yanuka, Assaf Ben-Kish, Yonatan Bitton, Idan Szpektor, Raja Giryes:
Bridging the Visual Gap: Fine-Tuning Multimodal Models with Knowledge-Adapted Captions. 10497-10518 - Jaehyeok Lee, Keisuke Sakaguchi, JinYeong Bak:
Self-Training Meets Consistency: Improving LLMs' Reasoning with Consistency-Driven Rationale Evaluation. 10519-10539 - Emily Allaway, Kathleen McKeown:
Evaluating Defeasible Reasoning in LLMs with DEFREASING. 10540-10558 - Jingyi Sun, Pepa Atanasova, Isabelle Augenstein:
Evaluating Input Feature Explanations through a Unified Diagnostic Evaluation Framework. 10559-10577 - Minsu Kim, Sangryul Kim, James Thorne:
From Evidence to Belief: A Bayesian Epistemology Approach to Language Models. 10578-10611 - Sebastian Ochs, Ivan Habernal:
Private Synthetic Text Generation with Diffusion Models. 10612-10626 - Yiwen Ding, Zhiheng Xi, Wei He, Zhuoyuan Li, Yitao Zhai, Shi Xiaowei, Xunliang Cai, Tao Gui, Qi Zhang, Xuanjing Huang:
Mitigating Tail Narrowing in LLM Self-Improvement via Socratic-Guided Sampling. 10627-10646 - Mamta Mamta, Oana Cocarascu:
FactEval: Evaluating the Robustness of Fact Verification Systems in the Era of Large Language Models. 10647-10660 - Tarun Ram Menta, Susmit Agrawal, Chirag Agarwal:
Analyzing Memorization in Large Language Models through the Lens of Model Attribution. 10661-10689 - Bingfeng Chen, Shaobin Shi, Yongqi Luo, Boyan Xu, Ruichu Cai, Zhifeng Hao:
Track-SQL: Enhancing Generative Language Models with Dual-Extractive Modules for Schema and Context Tracking in Multi-turn Text-to-SQL. 10690-10708 - Kunal Dahiya, Diego Ortego, David Jimenez-Cabello:
Prototypical Extreme Multi-label Classification with a Dynamic Margin Loss. 10709-10727 - Zonghai Yao, Aditya Parashar, Huixue Zhou, Won Seok Jang, Feiyun Ouyang, Zhichao Yang, Hong Yu:
MCQG-SRefine: Multiple Choice Question Generation and Evaluation with Iterative Self-Critique, Correction, and Comparison Feedback. 10728-10777 - Sameer Pimparkhede, Pushpak Bhattacharyya:
Main Predicate and Their Arguments as Explanation Signals For Intent Classification. 10778-10789 - Ruichu Cai, Junhao Lu, Zhongjie Chen, Boyan Xu, Zhifeng Hao:
Handling Missing Entities in Zero-Shot Named Entity Recognition: Integrated Recall and Retrieval Augmentation. 10790-10802 - Hyunjong Kim, Suyeon Lee, Yeongjae Cho, Eunseo Ryu, Yohan Jo, Suran Seong, Sungzoon Cho:
KMI: A Dataset of Korean Motivational Interviewing Dialogues for Psychotherapy. 10803-10828 - Dayeon Ki, Marine Carpuat:
Automatic Input Rewriting Improves Translation with Large Language Models. 10829-10856 - Vladimir Malinovskii, Andrei Panferov, Ivan Ilin, Han Guo, Peter Richtárik, Dan Alistarh:
HIGGS: Pushing the Limits of Large Language Model Quantization via the Linearity Theorem. 10857-10886 - Badr AlKhamissi, Greta Tuckute, Antoine Bosselut, Martin Schrimpf:
The LLM Language Network: A Neuroscientific Approach for Identifying Causally Task-Relevant Units. 10887-10911 - Xinyuan Wang, Yanchi Liu, Wei Cheng, Xujiang Zhao, Zhengzhang Chen, Wenchao Yu, Yanjie Fu, Haifeng Chen:
MixLLM: Dynamic Routing in Mixed Large Language Models. 10912-10922 - Shakib Yazdani, Josef van Genabith, Cristina España-Bonet:
Continual Learning in Multilingual Sign Language Translation. 10923-10938 - Junnan Liu:
Few-Shot Natural Language to First-Order Logic Translation via Code Generation. 10939-10960 - Ran Zhang, Wei Zhao, Steffen Eger:
How Good Are LLMs for Literary Translation, Really? Literary Translation Evaluation with Humans and LLMs. 10961-10988 - Salem Lahlou, Abdalgader Abubaker, Hakim Hacid:
PORT: Preference Optimization on Reasoning Traces. 10989-11005 - Xuan He, Da Yin, Nanyun Peng:
Guiding Through Complexity: What Makes Good Supervision for Hard Reasoning Tasks? 11006-11046 - Faeze Ghorbanpour, Viktor Hangya, Alexander Fraser:
Fine-Grained Transfer Learning for Harmful Content Detection through Label-Specific Soft Prompt Tuning. 11047-11061 - Joongwon Kim, Anirudh Goyal, Aston Zhang, Bo Xiong, Rui Hou, Melanie Kambadur, Dhruv Mahajan, Hannaneh Hajishirzi, Liang Tan:
A Systematic Examination of Preference Learning through the Lens of Instruction-Following. 11062-11082 - Mohit Chandra, Siddharth Sriraman, Gaurav Verma, Harneet Singh Khanuja, Jose Suarez Campayo, Zihang Li, Michael L. Birnbaum, Munmun De Choudhury:
Lived Experience Not Found: LLMs Struggle to Align with Experts on Addressing Adverse Drug Reactions from Psychiatric Medication Use. 11083-11113 - Zhouhang Xie, Tushar Khot, Bhavana Dalvi Mishra, Harshit Surana, Julian J. McAuley, Peter Clark, Bodhisattwa Prasad Majumder:
Latent Factor Models Meets Instructions: Goal-conditioned Latent Factor Discovery without Task Supervision. 11114-11134 - Finnian Westenfelder, Erik Hemberg, Stephen Moskal, Una-May O'Reilly, Silviu Chiricescu:
LLM-Supported Natural Language to Bash Translation. 11135-11147 - Kaitlyn Zhou, Jena D. Hwang, Xiang Ren, Nouha Dziri, Dan Jurafsky, Maarten Sap:
REL-A.I.: An Interaction-Centered Approach To Measuring Human-LM Reliance. 11148-11167 - Leonardo Ranaldi, Marco Valentino, André Freitas:
Eliciting Critical Reasoning in Retrieval-Augmented Generation via Contrastive Explanations. 11168-11183 - Filippo Ficarra, Ryan Cotterell, Alex Warstadt:
A Distributional Perspective on Word Learning in Neural Language Models. 11184-11207 - Jacob Matthews, Laurent Dubreuil, Imane Terhmina, Yunci Sun, Matthew Wilkens, Marten van Schijndel:
Disentangling language change: sparse autoencoders quantify the semantic evolution of indigeneity in French. 11208-11222 - Max Zuo, Francisco Piedrahita Velez, Xiaochen Li, Michael Littman, Stephen H. Bach:
Planetarium: A Rigorous Benchmark for Translating Text to Structured Planning Languages. 11223-11240 - Sonia K. Murthy, Tomer D. Ullman, Jennifer Hu:
One fish, two fish, but not the whole sea: Alignment reduces language models' conceptual diversity. 11241-11258 - Linsen Li, Aron Culotta, Nicholas Mattei:
Using Text-Based Causal Inference to Disentangle Factors Influencing Online Review Ratings. 11259-11277 - Xiaomeng Jin, Zhiqi Bu, Bhanukiran Vinzamuri, Anil Ramakrishna, Kai-Wei Chang, Volkan Cevher, Mingyi Hong:
Unlearning as multi-task optimization: A normalized gradient difference approach with an adaptive learning rate. 11278-11294 - Songyan Zhao, Bingxuan Li, Yufei Tian, Nanyun Peng:
REFFLY: Melody-Constrained Lyrics Editing Model. 11295-11315 - Anvesh Rao Vijjini, Somnath Basu Roy Chowdhury, Snigdha Chaturvedi:
Exploring Safety-Utility Trade-Offs in Personalized Language Models. 11316-11340 - Zifeng Zhu, Mengzhao Jia, Zhihan Zhang, Lang Li, Meng Jiang:
MultiChartQA: Benchmarking Vision-Language Models on Multi-Chart Problems. 11341-11359 - Iman Jundi, Eva Maria Vecchi, Carlotta Quensel, Neele Falk, Gabriella Lapesa:
It Is Not Only the Negative that Deserves Attention! Understanding, Generation & Evaluation of (Positive) Moderation. 11360-11395 - Sunny Rai, Khushang Jilesh Zaveri, Shreya Havaldar, Soumna Nema, Lyle H. Ungar, Sharath Chandra Guntuku:
Social Norms in Cinema: A Cross-Cultural Analysis of Shame, Pride and Prejudice. 11396-11415 - Mo Yu, Lemao Liu, Junjie Wu, Tsz Ting Chung, Shunchi Zhang, Jiangnan Li, Dit-Yan Yeung, Jie Zhou:
The Stochastic Parrot on LLM's Shoulder: A Summative Assessment of Physical Concept Understanding. 11416-11431 - Md. Nishat Raihan, Antonios Anastasopoulos, Marcos Zampieri:
mHumanEval - A Multilingual Benchmark to Evaluate Large Language Models for Code Generation. 11432-11461 - Michal Golovanevsky, William Rudman, Vedant Palit, Carsten Eickhoff, Ritambhara Singh:
What Do VLMs NOTICE? A Mechanistic Interpretability Pipeline for Gaussian-Noise-free Text-Image Corruption and Evaluation. 11462-11482 - Dingyi Pan, Benjamin K. Bergen:
Are explicit belief representations necessary? A comparison between Large Language Models and Bayesian probabilistic models. 11483-11498 - Yue Yu, Zhengxing Chen, Aston Zhang, Liang Tan, Chenguang Zhu, Richard Yuanzhe Pang, Yundi Qian, Xuewei Wang, Suchin Gururangan, Chao Zhang, Melanie Kambadur, Dhruv Mahajan, Rui Hou:
Self-Generated Critiques Boost Reward Modeling for Language Models. 11499-11514 - Juan Diego Rodriguez, Aaron Mueller, Kanishka Misra:
Characterizing the Role of Similarity in the Property Inferences of Language Models. 11515-11533 - Ran Xu, Hui Liu, Sreyashi Nag, Zhenwei Dai, Yaochen Xie, Xianfeng Tang, Chen Luo, Yang Li, Joyce C. Ho, Carl Yang, Qi He:
SimRAG: Self-Improving Retrieval-Augmented Generation for Adapting Large Language Models to Specialized Domains. 11534-11550 - Hongye Liu, Ricardo Henao:
Learning to Substitute Words with Model-based Score Ranking. 11551-11565 - Leonardo Ranaldi, Giulia Pucci:
Multilingual Reasoning via Self-training. 11566-11582 - Jianguo Zhang, Tian Lan, Ming Zhu, Zuxin Liu, Thai Hoang, Shirley Kokane, Weiran Yao, Juntao Tan, Akshara Prabhakar, Haolin Chen, Zhiwei Liu, Yihao Feng, Tulika Manoj Awalgaonkar, Rithesh R. N., Zeyuan Chen, Ran Xu, Juan Carlos Niebles, Shelby Heinecke, Huan Wang, Silvio Savarese, Caiming Xiong:
xLAM: A Family of Large Action Models to Empower AI Agent Systems. 11583-11597 - Kimihiro Hasegawa, Wiradee Imrattanatrai, Zhi-Qi Cheng, Masaki Asada, Susan Holm, Yuran Wang, Ken Fukuda, Teruko Mitamura:
ProMQA: Question Answering Dataset for Multimodal Procedural Activity Understanding. 11598-11617 - Antonia Karamolegkou, Sandrine Schiller Hansen, Ariadni Christopoulou, Filippos Stamatiou, Anne Lauscher, Anders Søgaard:
Ethical Concern Identification in NLP: A Corpus of ACL Anthology Ethics Statements. 11618-11635 - Han Wang, Archiki Prasad, Elias Stengel-Eskin, Mohit Bansal:
AdaCAD: Adaptively Decoding to Balance Conflicts between Contextual and Parametric Knowledge. 11636-11652 - Yilun Zhao, Guo Gan, Chen Zhao, Arman Cohan:
Are Multimodal LLMs Robust Against Adversarial Perturbations? RoMMath: A Systematic Evaluation on Multimodal Math Reasoning. 11653-11665 - Kangjun Noh, Baekryun Seong, Hoyoon Byun, Youngjun Choi, Sungjin Song, Kyungwoo Song:
LBC: Language-Based-Classifier for Out-Of-Variable Generalization. 11666-11678 - Elita A. Lobo, Chirag Agarwal, Himabindu Lakkaraju:
On the Impact of Fine-Tuning on Chain-of-Thought Reasoning. 11679-11698 - Teng Xiao, Zhen Ge, Sujay Sanghavi, Tian Wang, Julian Katz-Samuels, Marc Versage, Qingjun Cui, Trishul Chilimbi:
InfoPO: On Mutual Information Maximization for Large Language Model Alignment. 11699-11711 - Zhenghao Zhou, Robert Frank, R. Thomas McCoy:
Is In-Context Learning a Type of Error-Driven Learning? Evidence from the Inverse Frequency Effect in Structural Priming. 11712-11725 - Kangyu Zhu, Ziyuan Qin, Huahui Yi, Zekun Jiang, Qicheng Lao, Shaoting Zhang, Kang Li:
Guiding Medical Vision-Language Models with Diverse Visual Prompts: Framework Design and Comprehensive Exploration of Prompt Variations. 11726-11739 - Ivano Lauriola, Stefano Campese, Alessandro Moschitti:
Analyzing and Improving Coherence of Large Language Models in Question Answering. 11740-11755 - Yanzhou Pan, Huawei Lin, Yide Ran, Jiamin Chen, Xiaodong Yu, Weijie Zhao, Denghui Zhang, Zhaozhuo Xu:
ALinFiK: Learning to Approximate Linearized Future Influence Kernel for Scalable Third-Party LLM Data Valuation. 11756-11771 - Hongbo Zheng, Suyuan Wang, Neeraj Gangwar, Nickvash Kani:
E-Gen: Leveraging E-Graphs to Improve Continuous Representations of Symbolic Expressions. 11772-11788 - Eric Battenberg, R. J. Skerry-Ryan, Daisy Stanton, Soroosh Mariooryad, Matt Shannon, Julian Salazar, David Kao:
Robust and Unbounded Length Generalization in Autoregressive Transformer-Based Text-to-Speech. 11789-11806 - Daniil Larionov, Steffen Eger:
PromptOptMe: Error-Aware Prompt Compression for LLM-based MT Evaluation Metrics. 11807-11820 - Quazi Ishtiaque Mahmud, Ali TehraniJamsaz, Hung D. Phan, Le Chen, Mihai Capota, Theodore L. Willke, Nesreen K. Ahmed, Ali Jannesari:
AutoParLLM: GNN-guided Context Generation for Zero-Shot Code Parallelization using LLMs. 11821-11841 - Yinuo Xu, Hong Chen, Sushrita Rakshit, Aparna Ananthasubramaniam, Omkar Yadav, Mingqian Zheng, Michael Jiang, Lechen Zhang, Bowen Yi, Kenan Alkiek, Abraham Israeli, Bangzhao Shu, Hua Shen, Jiaxin Pei, Haotian Zhang, Miriam Schirmer, David Jurgens:
Causally Modeling the Linguistic and Social Factors that Predict Email Response. 11842-11866 - Zhe Su, Xuhui Zhou, Sanketh Rangreji, Anubha Kabra, Julia Mendelsohn, Faeze Brahman, Maarten Sap:
AI-LieDar : Examine the Trade-off Between Utility and Truthfulness in LLM Agents. 11867-11894 - Khaoula Chehbouni, Jonathan Colaço Carr, Yash More, Jackie CK Cheung, Golnoosh Farnadi:
Beyond the Safety Bundle: Auditing the Helpful and Harmless Dataset. 11895-11925 - Orion Weller, Benjamin Chang, Sean MacAvaney, Kyle Lo, Arman Cohan, Benjamin Van Durme, Dawn J. Lawrie, Luca Soldaini:
FollowIR: Evaluating and Teaching Information Retrieval Models to Follow Instructions. 11926-11942 - Jaehyung Kim, Yiming Yang:
Few-shot Personalization of LLMs with Mis-aligned Responses. 11943-11974 - Hoang Nguyen, Khyati Mahajan, Vikas Yadav, Julian Salazar, Philip S. Yu, Masoud Hashemi, Rishabh Maheshwary:
Prompting with Phonemes: Enhancing LLMs' Multilinguality for Non-Latin Script Languages. 11975-11994 - Margaret Mitchell, Giuseppe Attanasio, Ioana Baldini, Miruna Clinciu, Jordan Clive, Pieter Delobelle, Manan Dey, Sil Hamilton, Timm Dill, Jad Doughman, Ritam Dutt, Avijit Ghosh, Jessica Zosa Forde, Carolin Holtermann, Lucie-Aimée Kaffee, Tanmay Laud, Anne Lauscher, Roberto L. Lopez-Davila, Maraim Masoud, Nikita Nangia, Anaelia Ovalle, Giada Pistilli, Dragomir Radev, Beatrice Savoldi, Vipul Raheja, Jeremy Qin, Esther Ploeger, Arjun Subramonian, Kaustubh D. Dhole, Kaiser Sun, Amirbek Djanibekov, Jonibek Mansurov, Kayo Yin, Emilio Villa Cueva, Sagnik Mukherjee, Jerry Huang, Xudong Shen, Jay Gala, Hamdan Al-Ali, Tair Djanibekov, Nurdaulet Mukhituly, Shangrui Nie, Shanya Sharma, Karolina Stanczak, Eliza Szczechla, Tiago Timponi Torrent, Deepak Tunuguntla, Marcelo Viridiano, Oskar Van Der Wal, Adina Yakefu, Aurélie Névéol, Mike Zhang, Sydney Zink, Zeerak Talat:
SHADES: Towards a Multilingual Assessment of Stereotypes in Large Language Models. 11995-12041 - Jacob K. Christopher, Brian R. Bartoldson, Tal Ben-Nun, Michael Cardei, Bhavya Kailkhura, Ferdinando Fioretto:
Speculative Diffusion Decoding: Accelerating Language Generation through Diffusion. 12042-12059 - Allahsera Auguste Tapo, Kevin Assogba, Christopher M. Homan, M. Mustafa Rafique, Marcos Zampieri:
Bayelemabaga: Creating Resources for Bambara NLP. 12060-12070 - Soyoung Yang, Hojun Cho, Jiyoung Lee, Sohee Yoon, Edward Choi, Jaegul Choo, Won Ik Cho:
Single Ground Truth Is Not Enough: Adding Flexibility to Aspect-Based Sentiment Analysis Evaluation. 12071-12096 - Jianyu Liu, Hangyu Guo, Ranjie Duan, Xingyuan Bu, Yancheng He, Shilong Li, Hui Huang, Jiaheng Liu, Yucheng Wang, Chenchen Jing, Xingwei Qu, Xiao Zhang, Pei Wang, Yanan Wu, Jihao Gu, Yangguang Li, Jianke Zhu:
DREAM: Disentangling Risks to Enhance Safety Alignment in Multimodal Large Language Models. 12097-12118 - Amanda Bertsch, Maor Ivgi, Emily Xiao, Uri Alon, Jonathan Berant, Matthew R. Gormley, Graham Neubig:
In-Context Learning with Long-Context Models: An In-Depth Exploration. 12119-12149 - JoonHo Lee, JuYoun Son, Juree Seok, Wooseok Jang, Yeong-Dae Kwon:
Preference Consistency Matters: Enhancing Preference Learning in Language Models with Automated Self-Curation of Training Corpora. 12150-12169 - Sina Rismanchian, Yasaman Razeghi, Sameer Singh, Shayan Doroudi:
TurtleBench: A Visual Programming Benchmark in Turtle Geometry. 12170-12188 - Rakshitha Rao Ailneni, Sanda M. Harabagiu:
Automatically Discovering How Misogyny is Framed on Social Media. 12189-12208 - Mahnaz Koupaee, Jake W. Vincent, Saab Mansour, Igor Shalyminov, Han He, Hwanjun Song, Raphael Shu, Jianfeng He, Yi Nian, Amy Wing-mei Wong, Kyu J. Han, Hang Su:
Faithful, Unfaithful or Ambiguous? Multi-Agent Debate with Initial Stance for Summary Evaluation. 12209-12246 - Yixin Liu, Kejian Shi, Alexander R. Fabbri, Yilun Zhao, PeiFeng Wang, Chien-Sheng Wu, Shafiq Joty, Arman Cohan:
ReIFE: Re-evaluating Instruction-Following Evaluation. 12247-12287 - Yu Hou, Hal Daumé III, Rachel Rudinger:
Language Models Predict Empathy Gaps Between Social In-groups and Out-groups. 12288-12304 - Romain Storaï, Seung-won Hwang:
HARP: Hesitation-Aware Reframing in Transformer Inference Pass. 12305-12319 - Samar Mohamed Magdy, Sang Yun Kwon, Fakhraddin Alwajih, Safaa Taher Abdelfadil, Shady Shehata, Muhammad Abdul-Mageed:
JAWAHER: A Multidialectal Dataset of Arabic Proverbs for LLM Benchmarking. 12320-12341 - Sam Lin, Wenyue Hua, Zhenting Wang, Mingyu Jin, Lizhou Fan, Yongfeng Zhang:
EmojiPrompt: Generative Prompt Obfuscation for Privacy-Preserving Communication with Cloud-based LLMs. 12342-12361 - Nishant Subramani, Jason Eisner, Justin Svegliato, Benjamin Van Durme, Yu Su, Sam Thomson:
MICE for CATs: Model-Internal Confidence Estimation for Calibrating Agents with Tools. 12362-12375 - Ashish Seth, Ramaneswaran Selvakumar, Sonal Kumar, Sreyan Ghosh, Dinesh Manocha:
PAT: Parameter-Free Audio-Text Aligner to Boost Zero-Shot Audio Classification. 12376-12394 - Justin Zhao, Flor Miriam Plaza del Arco, Amanda Cercas Curry:
Language Model Council: Democratically Benchmarking Foundation Models on Highly Subjective Tasks. 12395-12450 - Carter Teplica, Yixin Liu, Arman Cohan, Tim G. J. Rudner:
SCIURus: Shared Circuits for Interpretable Uncertainty Representations in Language Models. 12451-12469 - Sonal Kumar, Sreyan Ghosh, Utkarsh Tyagi, Anton Jeran Ratnarajah, Chandra Kiran Reddy Evuru, Ramani Duraiswami, Dinesh Manocha:
ProSE: Diffusion Priors for Speech Enhancement. 12470-12483 - Meng Chen, Philip Arthur, Qianyu Feng, Cong Duy Vu Hoang, Yu-Heng Hong, Mahdi Kazemi Moghaddam, Omid Nezami, Duc Thien Nguyen, Gioacchino Tangari, Duy Vu, Thanh Vu, Mark Johnson, Krishnaram Kenthapadi, Don Dharmasiri, Long Duong, Yuan-Fang Li:
Mastering the Craft of Data Synthesis for CodeLLMs. 12484-12500 - Xingxuan Li, Xuan-Phi Nguyen, Shafiq Joty, Lidong Bing:
ParaICL: Towards Parallel In-Context Learning. 12501-12511 - Longxuan Yu, Delin Chen, Siheng Xiong, Qingyang Wu, Dawei Li, Zhikai Chen, Xiaoze Liu, Liangming Pan:
CausalEval: Towards Better Causal Reasoning in Language Models. 12512-12540 - Yang Ouyang, Hengrui Gu, Shuhang Lin, Wenyue Hua, Jie Peng, Bhavya Kailkhura, Meijun Gao, Tianlong Chen, Kaixiong Zhou:
Layer-Level Self-Exposure and Patch: Affirmative Token Mitigation for Jailbreak Attack Defense. 12541-12554 - Suyoung Bae, YunSeok Choi, Jee-Hyong Lee:
DeCAP: Context-Adaptive Prompt Generation for Debiasing Zero-shot Question Answering in Large Language Models. 12555-12574 - Chia-Yu Hung, Navonil Majumder, Ambuj Mehrish, Soujanya Poria:
Reward-Guided Tree Search for Inference Time Alignment of Large Language Models. 12575-12593 - Xiaomeng Wang, Zhengyu Zhao, Martha A. Larson:
Typographic Attacks in a Multi-Image Setting. 12594-12604 - Haruki Sakajo, Yusuke Sakai, Hidetaka Kamigaito, Taro Watanabe:
Tonguescape: Exploring Language Models Understanding of Vowel Articulation. 12605-12619 - YunSeok Choi, CheolWon Na, Jee-Hyong Lee:
CoRAC: Integrating Selective API Document Retrieval with Question Semantic Intent for Code Question Answering. 12620-12635 - Ander Corral, Ixak Sarasua, Xabier Saralegi:
Pipeline Analysis for Developing Instruct LLMs in Low-Resource Languages: A Case Study on Basque. 12636-12655 - Paul Youssef, Zhixue Zhao, Jörg Schlötterer, Christin Seifert:
How to Make LLMs Forget: On Reversing In-Context Knowledge Edits. 12656-12669 - Erfan Moosavi Monazzah, Vahid Rahimzadeh, Yadollah Yaghoobzadeh, Azadeh Shakery, Mohammad Taher Pilehvar:
PerCul: A Story-Driven Cultural Evaluation of LLMs in Persian. 12670-12687 - Soham Poddar, Paramita Koley, Janardan Misra, Niloy Ganguly, Saptarshi Ghosh:
Towards Sustainable NLP: Insights from Benchmarking Inference Energy in Large Language Models. 12688-12704 - Yijia Xiao, Runhui Wang, Luyang Kong, Davor Golac, Wei Wang:
CSR-Bench: Benchmarking LLM Agents in Deployment of Computer Science Research Repositories. 12705-12723 - Suyoung Bae, YunSeok Choi, Hyojun Kim, Jee-Hyong Lee:
SALAD: Improving Robustness and Generalization through Contrastive Learning with Structure-Aware and LLM-Driven Augmented Data. 12724-12738 - Jiwoong Sohn, Yein Park, Chanwoong Yoon, Sihyeon Park, Hyeon Hwang, Mujeen Sung, Hyunjae Kim, Jaewoo Kang:
Rationale-Guided Retrieval Augmented Generation for Medical Question Answering. 12739-12753 - Xi Chen, Min Zeng:
Prototype Conditioned Generative Replay for Continual Learning in NLP. 12754-12770 - James Hale, Sushrita Rakshit, Kushal Chawla, Jeanne M. Brett, Jonathan Gratch:
KODIS: A Multicultural Dispute Resolution Dialogue Corpus. 12771-12785

