


ACL 2025: Vienna, Austria
- Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar:
Findings of the Association for Computational Linguistics, ACL 2025, Vienna, Austria, July 27 - August 1, 2025. Association for Computational Linguistics 2025, ISBN 979-8-89176-256-5 - Frontmatter.
- Yachao Zhao, Bo Wang, Yan Wang, Dongming Zhao, Ruifang He, Yuexian Hou:
Explicit vs. Implicit: Investigating Social Bias in Large Language Models through Self-Reflection. 1-12 - Yanbei Jiang, Yihao Ding, Chao Lei, Jiayang Ao, Jey Han Lau, Krista A. Ehinger:
Beyond Perception: Evaluating Abstract Visual Reasoning through Multi-Stage Task. 13-45 - Guhao Feng, Kai Yang, Yuntian Gu, Xinyue Ai, Shengjie Luo, Jiacheng Sun, Di He, Zhenguo Li, Liwei Wang:
How Numerical Precision Affects Arithmetical Reasoning Capabilities of LLMs. 46-85 - Zeliang Zhang, Xiaodong Liu, Hao Cheng, Chenliang Xu, Jianfeng Gao:
Diversifying the Expert Knowledge for Task-Agnostic Pruning in Sparse Mixture-of-Experts. 86-102 - Dongshuo Liu, Zhijing Wu, Dandan Song, Heyan Huang:
A Persona-Aware LLM-Enhanced Framework for Multi-Session Personalized Dialogue Generation. 103-123 - Yanzhi Tian, Zeming Liu, Zhengyang Liu, Yuhang Guo:
Exploring In-Image Machine Translation with Real-World Background. 124-137 - Wei Li, Lujun Li, Mark G. Lee, Shengjie Sun, Lei Zhang, Wei Xue, Yike Guo:
BayesKD: Bayesian Knowledge Distillation for Compact LLMs in Constrained Fine-tuning Scenarios. 138-152 - Lingyuan Liu, Mengxiang Zhang:
GOLFer: Smaller LMs-Generated Documents Hallucination Filter & Combiner for Query Expansion in Information Retrieval. 153-162 - Lingyuan Liu, Mengxiang Zhang:
Exp4Fuse: A Rank Fusion Framework for Enhanced Sparse Retrieval using Large Language Model-based Query Expansion. 163-173 - Alexander Shvets:
Emo Pillars: Knowledge Distillation to Support Fine-Grained Context-Aware and Context-Less Emotion Classification. 174-191 - Zifeng Cheng, Zhaoling Chen, Zhiwei Jiang, Yafeng Yin, Cong Wang, Shiping Ge, Qing Gu:
Multi-Prompting Decoder Helps Better Language Understanding. 192-208 - Sam O'Connor Russell, Naomi Harte:
Visual Cues Enhance Predictive Turn-Taking for Two-Party Human Interaction. 209-221 - Bingxiang He, Ning Ding, Cheng Qian, Jia Deng, Ganqu Cui, Lifan Yuan, Haiwen Hong, Huan-ang Gao, Longtao Huang, Hui Xue, Huimin Chen, Zhiyuan Liu, Maosong Sun:
The Right Time Matters: Data Arrangement Affects Zero-Shot Generalization in Instruction Tuning. 222-243 - Jie Zhu, Junhui Li, Yalong Wen, Xiandong Li, Lifan Guo, Feng Chen:
MFinMeeting: A Multilingual, Multi-Sector, and Multi-Task Financial Meeting Understanding Evaluation Dataset. 244-266 - Yijie Zhong, Yunfan Gao, Xiaolian Zhang, Haofen Wang:
ODDA: An OODA-Driven Diverse Data Augmentation Framework for Low-Resource Relation Extraction. 267-285 - Luca Cagliero, Lorenzo Vaiani, Eliana Pastor, Alkis Koudounas, Elena Baralis, Vittorio Mazzia, Sandro Pollastrini, Thomas Gueudré, Manuel Giollo, Daniele Amberti, Yue Wu:
Detecting and Mitigating Challenges in Zero-Shot Video Summarization with Video LLMs. 286-301 - Tarek Mahmoud, Zhuohan Xie, Dimitar Iliyanov Dimitrov, Nikolaos Nikolaidis, Purificação Silvano, Roman Yangarber, Shivam Sharma, Elisa Sartori, Nicolas Stefanovitch, Giovanni Da San Martino, Jakub Piskorski, Preslav Nakov:
Entity Framing and Role Portrayal in the News. 302-326 - Guangya Wan, Yuqi Wu, Hao Wang, Shengming Zhao, Jie Chen, Sheng Li:
Derailer-Rerailer: Adaptive Verification for Efficient and Reliable Language Model Reasoning. 327-348 - Yiming Li, Zhao Zhang:
Leveraging Large Language Models for Conversational Multi-Doc Question Answering: The First Place of WSDM Cup 2024. 349-355 - Wenyu Tao, Xiaofen Xing, Yirong Chen, Linyi Huang, Xiangmin Xu:
TreeRAG: Unleashing the Power of Hierarchical Storage for Enhanced Knowledge Retrieval in Long Documents. 356-371 - Qiang Ding, Lvzhou Luo, Yixuan Cao, Ping Luo:
Attention with Dependency Parsing Augmentation for Fine-Grained Attribution. 372-387 - Yikuan Hu, Chen Huang, Wenqiang Lei:
ASTRO: Automatic Strategy Optimization For Non-Cooperative Dialogues. 388-408 - Chen Xiong, Xiangyu Qi, Pin-Yu Chen, Tsung-Yi Ho:
Defensive Prompt Patch: A Robust and Generalizable Defense of Large Language Models against Jailbreak Attacks. 409-437 - Jessica Lin, Amir Zeldes:
GUM-SAGE: A Novel Dataset and Approach for Graded Entity Salience Prediction. 438-455 - Zacchary Sadeddine, Fabian M. Suchanek:
Verifying the Steps of Deductive Reasoning Chains. 456-475 - Pardis Sadat Zahraei, Ali Emami:
Translate With Care: Addressing Gender Bias, Neutrality, and Reasoning in Large Language Model Translations. 476-501 - Benjamin C. Warner, Ziqi Xu, Simon Haroutounian, Thomas George Kannampallil, Chenyan Lu:
Utilizing Semantic Textual Similarity for Clinical Survey Data Feature Selection. 502-520 - Runchu Tian, Yanghao Li, Yuepeng Fu, Siyang Deng, Qinyu Luo, Cheng Qian, Shuo Wang, Xin Cong, Zhong Zhang, Yesai Wu, Yankai Lin, Huadong Wang, Xiaojiang Liu:
Distance between Relevant Information Pieces Causes Bias in Long-Context LLMs. 521-533 - Razvan-Gabriel Dumitru, Vikas Yadav, Rishabh Maheshwary, Paul-Ioan Clotan, Sathwik Tejaswi Madhusudhan, Mihai Surdeanu:
Variable Layerwise Quantization: A Simple and Effective Approach to Quantize LLMs. 534-550 - Kazuki Irie:
Why Are Positional Encodings Nonessential for Deep Autoregressive Transformers? A Petroglyph Revisited. 551-559 - Guofeng Cui, Pichao Wang, Yang Liu, Zemian Ke, Zhu Liu, Vimal Bhat:
CRPO: Confidence-Reward Driven Preference Optimization for Machine Translation. 560-574 - Nishanth Sridhar Nakshatri, Nikhil Mehta, Siyi Liu, Sihao Chen, Daniel Hopkins, Dan Roth, Dan Goldwasser:
Talking Point based Ideological Discourse Analysis in News Events. 575-594 - Runheng Liu, Xingchen Xiao, Heyan Huang, Zewen Chi, Zhijing Wu:
FlashBack: Efficient Retrieval-Augmented Language Modeling for Fast Inference. 595-608 - Guangya Yu, Yanhao Li, Zongying Jiang, Yuxiong Jin, Li Dai, Yupian Lin, Ruihui Hou, Weiyan Zhang, Yongqi Fan, Qi Ye, Jingping Liu, Tong Ruan:
CMQCIC-Bench: A Chinese Benchmark for Evaluating Large Language Models in Medical Quality Control Indicator Calculation. 609-626 - Liyu Zhang, Weiqi Wang, Tianqing Fang, Yangqiu Song:
ConKE: Conceptualization-Augmented Knowledge Editing in Large Language Models for Commonsense Reasoning. 627-635 - ChengAo Shen, Zhengzhang Chen, Dongsheng Luo, Dongkuan Xu, Haifeng Chen, Jingchao Ni:
Exploring Multi-Modal Data with Tool-Augmented LLM Agents for Precise Causal Discovery. 636-660 - Yaxun Dai, Haiqin Yang, Hao Mou, Pingfu Chao:
PARSQL: Enhancing Text-to-SQL through SQL Parsing and Reasoning. 661-681 - Yuntai Bao, Xuhong Zhang, Tianyu Du, Xinkui Zhao, Zhengwen Feng, Hao Peng, Jianwei Yin:
Probing the Geometry of Truth: Consistency and Generalization of Truth Directions in LLMs Across Logical Transformations and Question Answering Tasks. 682-700 - Hritik Bansal, Ashima Suvarna, Gantavya Bhatt, Nanyun Peng, Kai-Wei Chang, Aditya Grover:
Comparing Bad Apples to Good Oranges: Aligning Large Language Models via Joint Preference Optimization. 701-723 - Junhao Yu, Yan Zhuang, Yuxuan Sun, Weibo Gao, Qi Liu, Mingyue Cheng, Zhenya Huang, Enhong Chen:
TestAgent: An Adaptive and Intelligent Expert for Human Assessment. 724-747 - Quan Ze Chen, Kevin Feng, Chan Young Park, Amy X. Zhang:
SPICA: Retrieving Scenarios for Pluralistic In-Context Alignment. 748-765 - Kushal Jain, Moritz Miller, Niket Tandon, Kumar Shridhar:
First-Step Advantage: Importance of Starting Right in Multi-Step Math Reasoning. 766-778 - Wei Xiang, Chuanhong Zhan, Qing Zhang, Bang Wang:
Evaluating Instructively Generated Statement by Large Language Models for Directional Event Causality Identification. 779-785 - Chengwei Wei, Bin Wang, Jung-Jae Kim, Guimei Liu, Nancy F. Chen:
CoinMath: Harnessing the Power of Coding Instruction for Math LLM. 786-797 - Zain Muhammad Mujahid, Dilshod Azizov, Maha Tufail Agro, Preslav Nakov:
Profiling News Media for Factuality and Bias Using LLMs and the Fact-Checking Methodology of Human Experts. 798-819 - Kun Zhang, Oana Balalau, Ioana Manolescu:
Structured Discourse Representation for Factual Consistency Verification. 820-838 - Chuyi Kong, Ziyang Luo, Hongzhan Lin, Zhiyuan Fan, Yaxin Fan, Yuxi Sun, Jing Ma:
SHARP: Unlocking Interactive Hallucination via Stance Transfer in Role-Playing LLMs. 839-866 - Luke Gessler, Alexis Palmer, Katharina von der Wense:
Understanding the Gap: an Analysis of Research Collaborations in NLP and Language Documentation. 867-877 - Juntao Tan, Liangwei Yang, Zuxin Liu, Zhiwei Liu, Rithesh R. N., Tulika Manoj Awalgaonkar, Jianguo Zhang, Weiran Yao, Ming Zhu, Shirley Kokane, Silvio Savarese, Huan Wang, Caiming Xiong, Shelby Heinecke:
PersonaBench: Evaluating AI Models on Understanding Personal Information through Accessing (Synthetic) Private User Data. 878-893 - Simret Araya Gebreegziabher, Kuangshi Ai, Zheng Zhang, Elena L. Glassman, Toby Jia-Jun Li:
Leveraging Variation Theory in Counterfactual Data Augmentation for Optimized Active Learning. 894-906 - Eric Modesitt, Ke Yang, Spencer Hulsey, Xin Liu, ChengXiang Zhai, Volodymyr V. Kindratenko:
ORBIT: Cost-Effective Dataset Curation for Large Language Model Domain Adaptation with an Astronomy Case Study. 907-926 - Xiaobo Guo, Soroush Vosoughi:
Serial Position Effects of Large Language Models. 927-953 - Zhiyin Yu, Chao Zheng, Chong Chen, Xian-Sheng Hua, Xiao Luo:
scRAG: Hybrid Retrieval-Augmented Generation for LLM-based Cross-Tissue Single-Cell Annotation. 954-970 - Abu Ubaida Akash, Ahmed Fahmy, Amine Trabelsi:
Can Large Language Models Address Open-Target Stance Detection? 971-985 - Congchi Yin, Yongpeng Zhang, Xuyun Wen, Piji Li:
Improve Language Model and Brain Alignment via Associative Memory. 986-999 - Ziyang Ma, Xiquan Li, Yakun Song, Wenxi Chen, Chenpeng Du, Jian Wu, Yuanzhe Chen, Zhuo Chen, Yuping Wang, Yuxuan Wang, Xie Chen:
Towards Reliable Large Audio Language Model. 1000-1014 - Sho Takase, Ryokan Ri, Shun Kiyono, Takuya Kato:
Large Vocabulary Size Improves Large Language Models. 1015-1026 - Zihan Wang, Xiaocui Yang, Yongkang Liu, Shi Feng, Daling Wang, Yifei Zhang:
MUSE: A Multimodal Conversational Recommendation Dataset with Scenario-Grounded User Profiles. 1027-1053 - Michelle Wastl, Jannis Vamvas, Rico Sennrich:
Machine Translation Models are Zero-Shot Detectors of Translation Direction. 1054-1074 - Jerry Huang, Prasanna Parthasarathi, Mehdi Rezagholizadeh, Boxing Chen, Sarath Chandar:
Do Robot Snakes Dream like Electric Sheep? Investigating the Effects of Architectural Inductive Biases on Hallucination. 1075-1096 - Jie He, Jennifer Neville, Mengting Wan, Longqi Yang, Hui Liu, Xiaofeng Xu, Xia Song, Jeff Z. Pan, Pei Zhou:
GenTool: Enhancing Tool Generalization in Language Models through Zero-to-One and Weak-to-Strong Simulation. 1097-1122 - Chengxing Xie, Bowen Li, Chang Gao, He Du, Wai Lam, Difan Zou, Kai Chen:
SWE-Fixer: Training Open-Source LLMs for Effective and Efficient GitHub Issue Resolution. 1123-1139 - Zixuan Wu, Yoolim Kim, Carolyn Jane Anderson:
GlyphPattern: An Abstract Pattern Recognition for Vision-Language Models. 1140-1175 - Qianli Wang, Nils Feldhus, Simon Ostermann, Luis Felipe Villa-Arenas, Sebastian Möller, Vera Schmitt:
FitCF: A Framework for Automatic Feature Importance-guided Counterfactual Example Generation. 1176-1191 - Guocong Li, Weize Liu, Yihang Wu, Ping Wang, Shuaihan Huang, Hongxia Xu, Jian Wu:
From Misleading Queries to Accurate Answers: A Three-Stage Fine-Tuning Method for LLMs. 1192-1209 - Di Wu, Xin Lu, Yanyan Zhao, Bing Qin:
Separate the Wheat from the Chaff: A Post-Hoc Approach to Safety Re-Alignment for Fine-Tuned Language Models. 1210-1225 - Rongwu Xu, Xiaojian Li, Shuo Chen, Wei Xu:
Nuclear Deployed!: Analyzing Catastrophic Risks in Decision-making of Autonomous LLM Agents. 1226-1310 - Dacao Zhang, Kun Zhang, Shimao Chu, Le Wu, Xin Li, Si Wei:
MoRE: A Mixture of Low-Rank Experts for Adaptive Multi-Task Learning. 1311-1324 - Xin-Yu Xiao, Yalei Liu, Xiangyu Liu, Zengrui Li, Erwei Yin, Qianchen Xia:
Lunar Twins: We Choose to Go to the Moon with Large Language Models. 1325-1339 - Dora Zhao, Qianou Ma, Xinran Zhao, Chenglei Si, Chenyang Yang, Ryan Louie, Ehud Reiter, Diyi Yang, Tongshuang Wu:
SPHERE: An Evaluation Card for Human-AI Systems. 1340-1365 - Maximillian Chen, Ruoxi Sun, Sercan Ö. Arik:
Data-Centric Improvements for Enhancing Multi-Modal Understanding in Spoken Conversation Modeling. 1366-1387 - Haochen Liu, Song Wang, Chen Chen, Jundong Li:
Question-Aware Knowledge Graph Prompting for Enhancing Large Language Models. 1388-1400 - Huaizhi Qu, Xinyu Zhao, Jie Peng, Kwonjoon Lee, Behzad Dariush, Tianlong Chen:
UQ-Merge: Uncertainty Guided Multimodal Large Language Model Merging. 1401-1417 - Korbinian Q. Weidinger, T. Y. S. S. Santosh, Oana Ichim, Matthias Grabmair:
AQuAECHR: Attributed Question Answering for European Court of Human Rights. 1418-1447 - Yuhao Zhang, Xiangnan Ma, Kaiqi Kou, Peizhuo Liu, Weiqiao Shan, Benyou Wang, Tong Xiao, Yuxin Huang, Zhengtao Yu, JingBo Zhu:
Leveraging Unit Language Guidance to Advance Speech Modeling in Textless Speech-to-Speech Translation. 1448-1460 - Yiqin Wang, Haoji Zhang, Jingqi Tian, Yansong Tang:
Ponder & Press: Advancing Visual GUI Agent towards General Computer Control. 1461-1473 - Jiayi Gui, Yiming Liu, Jiale Cheng, Xiaotao Gu, Xiao Liu, Hongning Wang, Yuxiao Dong, Jie Tang, Minlie Huang:
LogicGame: Benchmarking Rule-Based Reasoning Abilities of Large Language Models. 1474-1491 - Jiarui Ji, Runlin Lei, Jialing Bi, Zhewei Wei, Xu Chen, Yankai Lin, Xuchen Pan, Yaliang Li, Bolin Ding:
LLM-Based Multi-Agent Systems are Scalable Graph Generative Models. 1492-1523 - Tiankai Yang, Yi Nian, Li Li, Ruiyao Xu, Yuangang Li, Jiaqi Li, Zhuo Xiao, Xiyang Hu, Ryan A. Rossi, Kaize Ding, Xia Hu, Yue Zhao:
AD-LLM: Benchmarking Large Language Models for Anomaly Detection. 1524-1547 - Jie Liu, Guohua Wang, Ronghui Yang, Jiajie Zeng, Mengchen Zhao, Yi Cai:
RTADev: Intention Aligned Multi-Agent Framework for Software Development. 1548-1581 - Shivam Shandilya, Menglin Xia, Supriyo Ghosh, Huiqiang Jiang, Jue Zhang, Qianhui Wu, Victor Rühle, Saravan Rajmohan:
TACO-RL: Task Aware Prompt Compression Optimization with Reinforcement Learning. 1582-1597 - Kyeongman Park, Minbeom Kim, Kyomin Jung:
A Character-Centric Creative Story Generation via Imagination. 1598-1645 - Minghan Wang, Viet-Thanh Pham, Farhad Moghimifar, Thuy-Trang Vu:
Proverbs Run in Pairs: Evaluating Proverb Translation Capability of Large Language Model. 1646-1662 - Yang Zhang, Shixin Yang, Chenjia Bai, Fei Wu, Xiu Li, Zhen Wang, Xuelong Li:
Towards Efficient LLM Grounding for Embodied Multi-Agent Collaboration. 1663-1699 - Chuanyuan Tan, Wenbiao Shao, Hao Xiong, Tong Zhu, Zhenhua Liu, Kai Shi, Wenliang Chen:
UAQFact: Evaluating Factual Knowledge Utilization of LLMs on Unanswerable Questions. 1700-1715 - Minjie Qiang, Zhongqing Wang, Xiaoyi Bao, Haoyuan Ma, Shoushan Li, Guodong Zhou:
Exploring Knowledge Filtering for Retrieval-Augmented Discriminative Tasks. 1716-1729 - Chong Li, Yingzhuo Deng, Jiajun Zhang, Chengqing Zong:
Group then Scale: Dynamic Mixture-of-Experts Multilingual Language Model. 1730-1754 - Fangxu Yu, Junjie Guo, Zhen Wu, Xinyu Dai:
Beyond Verbal Cues: Emotional Contagion Graph Network for Causal Emotion Entailment. 1755-1767 - Xin Zheng, Jie Lou, Boxi Cao, Xueru Wen, Yuqiu Ji, Hongyu Lin, Yaojie Lu, Xianpei Han, Debing Zhang, Le Sun:
Critic-CoT: Boosting the Reasoning Abilities of Large Language Model via Chain-of-Thought Critic. 1768-1806 - Sondre Wold, Lucas Georges Gabriel Charpentier, Étienne Simon:
Systematic Generalization in Language Models Scales with Information Entropy. 1807-1819 - Byung-Doh Oh, Hongao Zhu, William Schuler:
The Inverse Scaling Effect of Pre-Trained Language Model Surprisal Is Not Due to Data Leakage. 1820-1827 - Ganlin Xu, Zhoujia Zhang, Wangyi Mei, Jiaqing Liang, Weijia Lu, Xiaodong Zhang, Zhifei Yang, Xiaofeng Ma, Yanghua Xiao, Deqing Yang:
Logical Consistency is Vital: Neural-Symbolic Information Retrieval for Negative-Constraint Queries. 1828-1847 - Rena Wei Gao, Xuetong Wu, Siwen Luo, Caren Han, Feng Liu:
'No' Matters: Out-of-Distribution Detection in Multimodality Multi-Turn Interactive Dialogue. 1848-1864 - Qizhi Wan, Liu Tao, Changxuan Wan, Rong Hu, Keli Xiao, Yuxin Shuai:
Event Pattern-Instance Graph: A Multi-Round Role Representation Learning Strategy for Document-Level Event Argument Extraction. 1865-1877 - Lukas Edman, Helmut Schmid, Alexander Fraser:
EXECUTE: A Multilingual Benchmark for LLM Token Understanding. 1878-1887 - Wei-Fan Chen, Zhixue Zhao, Akbar Karimi, Lucie Flek:
Explainable Hallucination through Natural Language Inference Mapping. 1888-1896 - Hao Liu, Zhengren Wang, Xi Chen, Zhiyu Li, Feiyu Xiong, Qinhan Yu, Wentao Zhang:
HopRAG: Multi-Hop Reasoning for Logic-Aware Retrieval-Augmented Generation. 1897-1913 - Markus Frohmann, Gabriel Meseguer-Brocal, Markus Schedl, Elena V. Epure:
Double Entendre: Robust Audio-Based AI-Generated Lyrics Detection via Multi-View Fusion. 1914-1926 - Sangmin Woo, Donguk Kim, Jaehyuk Jang, Yubin Choi, Changick Kim:
Don't Miss the Forest for the Trees: Attentional Vision Calibration for Large Vision Language Models. 1927-1951 - Xiaoning Dong, Wenbo Hu, Wei Xu, Tianxing He:
SATA: A Paradigm for LLM Jailbreak via Simple Assistive Task Linkage. 1952-1987 - Yifan Hu, Rui Liu, Yi Ren, Xiang Yin, Haizhou Li:
Chain-Talker: Chain Understanding and Rendering for Empathetic Conversational Speech Synthesis. 1988-2003 - Aochuan Chen, Jiashun Cheng, Zijing Liu, Ziqi Gao, Fugee Tsung, Yu Li, Jia Li:
Parameter-Efficient Fine-Tuning via Circular Convolution. 2004-2019 - Jiahao Li, Zhendong Mao, Quan Wang:
Alleviating Hallucinations in Large Language Models via Truthfulness-driven Rank-adaptive LoRA. 2020-2031 - Xinye Li, Zunwen Zheng, Qian Zhang, Dekai Zhuang, Jiabao Kang, Liyan Xu, Qingbin Liu, Xi Chen, Zhiying Tu, Dianhui Chu, Dianbo Sui:
ScEdit: Script-based Assessment of Knowledge Editing. 2032-2052 - Seanie Lee, Dong Bok Lee, Dominik Wagner, Minki Kang, Haebin Seong, Tobias Bocklet, Juho Lee, Sung Ju Hwang:
SafeRoute: Adaptive Model Selection for Efficient and Accurate Safety Guardrails in Large Language Models. 2053-2069 - Rena Wei Gao, Ming-Bin Chen, Lea Frermann, Jey Han Lau:
Moderation Matters: Measuring Conversational Moderation Impact in English as a Second Language Group Discussion. 2070-2095 - Katherine Atwell, Mandy Simons, Malihe Alikhani:
Measuring Bias and Agreement in Large Language Model Presupposition Judgments. 2096-2107 - Jeonghun Baek, Akiko Aizawa, Kiyoharu Aizawa:
Harnessing PDF Data for Improving Japanese Large Multimodal Models. 2108-2123 - Pranaydeep Singh, Eneko Agirre, Gorka Azkune, Orphée De Clercq, Els Lefever:
EnerGIZAr: Leveraging GIZA++ for Effective Tokenizer Initialization. 2124-2137 - Yuxiang Chai, Siyuan Huang, Yazhe Niu, Han Xiao, Liang Liu, Guozhi Wang, Dingyu Zhang, Shuai Ren, Hongsheng Li:
AMEX: Android Multi-annotation Expo Dataset for Mobile GUI Agents. 2138-2156 - Houjun Liu, John Bauer, Christopher D. Manning:
Drop Dropout on Single Epoch Language Model Pretraining. 2157-2166 - Zongqi Wang, Baoyuan Wu, Jingyuan Deng, Yujiu Yang:
Robust and Minimally Invasive Watermarking for EaaS. 2167-2191 - Andrei Jarca, Florinel-Alin Croitoru, Radu Tudor Ionescu:
Task-Informed Anti-Curriculum by Masking Improves Downstream Performance on Text. 2192-2201 - Taneesh Gupta, Shivam Shandilya, Xuchao Zhang, Rahul Madhavan, Supriyo Ghosh, Chetan Bansal, Huaxiu Yao, Saravan Rajmohan:
CARMO: Dynamic Criteria Generation for Context Aware Reward Modelling. 2202-2261 - Wenxi Chen, Ziyang Ma, Ruiqi Yan, Yuzhe Liang, Xiquan Li, Ruiyang Xu, Zhikang Niu, Yanqiao Zhu, Yifan Yang, Zhanxun Liu, Kai Yu, Yuxuan Hu, Jinyu Li, Yan Lu, Shujie Liu, Xie Chen:
SLAM-Omni: Timbre-Controllable Voice Interaction System with Single-Stage Training. 2262-2282 - Yanyang Li, Tin Long Wong, Cheung To Hung, Jianqiao Zhao, Duo Zheng, Ka Wai Liu, Michael R. Lyu, Liwei Wang:
C²LEVA: Toward Comprehensive and Contamination-Free Language Model Evaluation. 2283-2306 - Wei Zhou, Mohsen Mesgar, Heike Adel, Annemarie Friedrich:
Texts or Images? A Fine-grained Analysis on the Effectiveness of Input Representations and Models for Table Question Answering. 2307-2318 - Keyeun Lee, Seolhee Lee, Esther Hehsun Kim, Yena Ko, Jinsu Eun, Dahee Kim, Hyewon Cho, Haiyi Zhu, Robert E. Kraut, Eunyoung Suh, Eun-mee Kim, Hajin Lim:
Adaptive-VP: A Framework for LLM-Based Virtual Patients that Adapts to Trainees' Dialogue to Facilitate Nurse Communication Training. 2319-2352 - Hai Huang, Yan Xia, Shengpeng Ji, Shulei Wang, Hanting Wang, Minghui Fang, Jieming Zhu, Zhenhua Dong, Sashuai Zhou, Zhou Zhao:
Enhancing Multimodal Unified Representations for Cross Modal Generalization. 2353-2366 - Da Ju, Hagen Blix, Adina Williams:
Domain Regeneration: How well do LLMs match syntactic properties of text domains? 2367-2388 - Raphaël Mouravieff, Benjamin Piwowarski, Sylvain Lamprier:
Structural Deep Encoding for Table Question Answering. 2389-2402 - Bo Li, Gexiang Fang, Wei Ye, Zhenghua Xu, Jinglei Zhang, Hao Cheng, Shikun Zhang:
MPL: Multiple Programming Languages with Large Language Models for Information Extraction. 2403-2414 - Zheng Chu, Huiming Fan, Jingchang Chen, Qianyu Wang, Mingda Yang, Jiafeng Liang, Zhongjie Wang, Hao Li, Guo Tang, Ming Liu, Bing Qin:
Self-Critique Guided Iterative Reasoning for Multi-hop Question Answering. 2415-2438 - Ruizhe Li, Yanjun Gao:
Anchored Answers: Unravelling Positional Bias in GPT-2's Multiple-Choice Questions. 2439-2465 - Sreyan Ghosh, Mohammad Sadegh Rasooli, Michael Levit, Peidong Wang, Jian Xue, Dinesh Manocha, Jinyu Li:
Failing Forward: Improving Generative Error Correction for ASR with Synthetic Data and Retrieval Augmentation. 2466-2482 - Ruikang Hu, Shaoyu Lin, Yeliang Xiu, Yongmei Liu:
LTRAG: Enhancing Autoformalization and Self-refinement for Logical Reasoning with Thought-Guided RAG. 2483-2493 - Giuseppe Ruggiero, Matteo Testa, Jurgen Van de Walle, Luigi Di Caro:
Eta-WavLM: Efficient Speaker Identity Removal in Self-Supervised Speech Representations Using a Simple Linear Equation. 2494-2504 - Ke Wang, Junting Pan, Linda Wei, Aojun Zhou, Weikang Shi, Zimu Lu, Han Xiao, Yunqiao Yang, Houxing Ren, Mingjie Zhan, Hongsheng Li:
MathCoder-VL: Bridging Vision and Code for Enhanced Multimodal Mathematical Reasoning. 2505-2534 - Boyang Xue, Hongru Wang, Rui Wang, Sheng Wang, Zezhong Wang, Yiming Du, Bin Liang, Wenxuan Zhang, Kam-Fai Wong:
MlingConf: A Comprehensive Study of Multilingual Confidence Estimation on Large Language Models. 2535-2556 - Keyuan Cheng, Zijian Kan, Zhuoran Zhang, Muhammad Asif Ali, Lijie Hu, Di Wang:
COMPKE: Complex Question Answering under Knowledge Editing. 2557-2576 - Junhao Hu, Wenrui Huang, Weidong Wang, Zhenwen Li, Tiancheng Hu, Zhixia Liu, Xusheng Chen, Tao Xie, Yizhou Shan:
RaaS: Reasoning-Aware Attention Sparsity for Efficient LLM Reasoning. 2577-2590 - Rongguang Ye, Ming Tang:
One-for-All Pruning: A Universal Model for Customized Compression of Large Language Models. 2591-2604 - Shangda Wu, Zhancheng Guo, Ruibin Yuan, Junyan Jiang, Seungheon Doh, Gus Xia, Juhan Nam, Xiaobing Li, Feng Yu, Maosong Sun:
CLaMP 3: Universal Music Information Retrieval Across Unaligned Modalities and Unseen Languages. 2605-2625 - Ming Zhang, Yuhui Wang, Yujiong Shen, Tingyi Yang, Changhao Jiang, Yilong Wu, Shihan Dou, Qinhao Chen, Zhiheng Xi, Zhihao Zhang, Yi Dong, Zhen Wang, Zhihui Fei, Mingyang Wan, Tao Liang, Guojun Ma, Qi Zhang, Tao Gui, Xuanjing Huang:
PFDial: A Structured Dialogue Instruction Fine-tuning Method Based on UML Flowcharts. 2626-2649 - Lang Qin, Yao Zhang, Hongru Liang, Adam Jatowt, Zhenglu Yang:
Listening to Patients: Detecting and Mitigating Patient Misreport in Medical Dialogue System. 2650-2664 - Xiaoyang Hu, Richard L. Lewis:
Do Language Models Understand the Cognitive Tasks Given to Them? Investigations with the N-Back Paradigm. 2665-2677 - Yuxia Geng, Runkai Zhu, Jiaoyan Chen, Jintai Chen, Xiang Chen, Zhuo Chen, Shuofei Qiao, Yuxiang Wang, Xiaoliang Xu, Sheng-Jun Huang:
Graph-guided Cross-composition Feature Disentanglement for Compositional Zero-shot Learning. 2678-2690 - Wenhao Li, Yuxin Zhang, Gen Luo, Daohai Yu, Rongrong Ji:
Training Long-Context LLMs Efficiently via Chunk-wise Optimization. 2691-2700 - Jiashun Cheng, Aochuan Chen, Nuo Chen, Ziqi Gao, Yuhan Li, Jia Li, Fugee Tsung:
Revisiting LoRA through the Lens of Parameter Redundancy: Spectral Encoding Helps. 2701-2718 - Keyuan Cheng, Xudong Shen, Yihao Yang, Tengyue Wang, Yang Cao, Muhammad Asif Ali, Hanbin Wang, Lijie Hu, Di Wang:
CODEMENV: Benchmarking Large Language Models on Code Migration. 2719-2744 - V. S. D. S. Mahesh Akavarapu, Hrishikesh Terdalkar, Pramit Bhattacharyya, Shubhangi Agarwal, Vishakha Deulgaonkar, Chaitali Dangarikar, Pralay Manna, Arnab Bhattacharya:
A Case Study of Cross-Lingual Zero-Shot Generalization for Classical Languages in LLMs. 2745-2761 - Jilong Li, Zhenxi Song, Jiaqi Wang, Meishan Zhang, Honghai Liu, Min Zhang, Zhiguo Zhang:
BrainECHO: Semantic Brain Signal Decoding through Vector-Quantized Spectrogram Reconstruction for Whisper-Enhanced Text Generation. 2762-2778 - Yahan Yu, Duzhen Zhang, Yong Ren, Xuanle Zhao, Xiuyi Chen, Chenhui Chu:
Progressive LoRA for Multimodal Continual Instruction Tuning. 2779-2796 - Lukasz Borchmann:
ARC 'Challenge' Is Not That Challenging. 2797-2804 - Vera Neplenbroek, Arianna Bisazza, Raquel Fernández:
Cross-Lingual Transfer of Debiasing and Detoxification in Multilingual LLMs: An Extensive Investigation. 2805-2830 - Tomás Vergara Browne, Alvaro Soto:
Tracr-Injection: Distilling Algorithms into Pre-trained Language Models. 2831-2843 - Ximing Dong, Shaowei Wang, Dayi Lin, Ahmed E. Hassan:
Model Performance-Guided Evaluation Data Selection for Effective Prompt Optimization. 2844-2859 - Wei Yao, Wenkai Yang, Ziqiao Wang, Yankai Lin, Yong Liu:
Revisiting Weak-to-Strong Generalization in Theory and Practice: Reverse KL vs. Forward KL. 2860-2888 - Felix Drinkall, Stefan Zohren, Michael McMahon, Janet B. Pierrehumbert:
Stories that (are) Move(d by) Markets: A Causal Exploration of Market Shocks and Semantic Shifts across Different Partisan Groups. 2889-2904 - Miao Yu, Shilong Wang, Guibin Zhang, Junyuan Mao, Chenlong Yin, Qijiong Liu, Kun Wang, Qingsong Wen, Yang Wang:
NetSafe: Exploring the Topological Safety of Multi-agent System. 2905-2938 - Qiji Zhou, Yifan Gong, Guangsheng Bao, Hongjie Qiu, Jinqiang Li, Xiangrong Zhu, Huajian Zhang, Yue Zhang:
Reasoning is All You Need for Video Generalization: A Counterfactual Benchmark with Sub-question Evaluation. 2939-2957 - Hanlun Zhu, Yunshi Lan, Xiang Li, Weining Qian:
Initializing and Retrofitting Key-Value Adaptors for Traceable Model Editing. 2958-2971 - Jiaqi Li, Yixuan Tang, Yi Yang:
Know the Unknown: An Uncertainty-Sensitive Method for LLM Instruction Tuning. 2972-2989 - Siqi Fan, Xuezhi Fang, Xingrun Xing, Peng Han, Shuo Shang, Yequan Wang:
Position-Aware Depth Decay Decoding (D³): Boosting Large Language Model Inference Efficiency. 2990-3001 - Anirudh Maiya, Razan Alghamdi, Maria Leonor Pacheco, Ashutosh Trivedi, Fabio Somenzi:
Explaining Puzzle Solutions in Natural Language: An Exploratory Study on 6x6 Sudoku. 3002-3009 - Andrea Pedrotti, Michele Papucci, Cristiano Ciaccio, Alessio Miaschi, Giovanni Puccetti, Felice Dell'Orletta, Andrea Esuli:
Stress-testing Machine Generated Text Detection: Shifting Language Models Writing Style to Fool Detectors. 3010-3031 - Siqi Ouyang, Xi Xu, Lei Li:
InfiniSST: Simultaneous Translation of Unbounded Speech with Large Language Model. 3032-3046 - Jiahui Geng, Qing Li, Zongxiong Chen, Yuxia Wang, Derui Zhu, Zhuohan Xie, Chenyang Lyu, Xiuying Chen, Preslav Nakov, Fakhri Karray:
VSCBench: Bridging the Gap in Vision-Language Model Safety Calibration. 3047-3059 - Haozhe Wang, Long Li, Chao Qu, Weidi Xu, Fengming Zhu, Wei Chu, Fangzhen Lin:
To Code or not to Code? Adaptive Tool Integration for Math Language Models via Expectation-Maximization. 3060-3075 - Soo Kyung Kim, Hyunsoo Cho:
GOODLIAR: A Reinforcement Learning-Based Deceptive Agent for Disrupting LLM Beliefs on Foundational Principles. 3076-3101 - James Xu Zhao, Jimmy Z. J. Liu, Bryan Hooi, See-Kiong Ng:
How Does Response Length Affect Long-Form Factuality. 3102-3125 - Guiyang Hou, Wenqi Zhang, Zhe Zheng, Yongliang Shen, Weiming Lu:
Scaling LLMs' Social Reasoning: Sprinkle Cognitive "Aha Moment" into Fundamental Long-thought Logical Capabilities. 3126-3138 - Yuzheng Cai, Zhenyue Guo, Yiwen Pei, Wanrui Bian, Weiguo Zheng:
SimGRAG: Leveraging Similar Subgraphs for Knowledge Graphs Driven Retrieval-Augmented Generation. 3139-3158 - Bihan Zhou, Haopeng Ren, Li Yuan, Yi Cai, Liuwen Cao, Zikun Deng:
RuleEdit: Towards Rule-Level Knowledge Generalization to Mitigate Over-Editing in Large Language Models. 3159-3175 - Yifu Qiu, Varun R. Embar, Yizhe Zhang, Navdeep Jaitly, Shay B. Cohen, Benjamin Han:
Eliciting In-context Retrieval and Reasoning for Long-context Large Language Models. 3176-3192 - Haoyu Liu, Shaohan Huang, Jianfeng Liu, Yuefeng Zhan, Hao Sun, Weiwei Deng, Feng Sun, Furu Wei, Qi Zhang:
GeAR: Generation Augmented Retrieval. 3193-3207 - Yanzhen Shen, Yu Zhang, Yunyi Zhang, Jiawei Han:
A Unified Taxonomy-Guided Instruction Tuning Framework for Entity Set Expansion and Taxonomy Expansion. 3208-3220 - Yuzhe Ding, Kang He, Bobo Li, Li Zheng, Haijun He, Fei Li, Chong Teng, Donghong Ji:
Zero-Shot Conversational Stance Detection: Dataset and Approaches. 3221-3235 - Cehao Yang, Xueyuan Lin, Chengjin Xu, Xuhui Jiang, Shengjie Ma, Aofan Liu, Hui Xiong, Jian Guo:
LongFaith: Enhancing Long-Context Reasoning in LLMs with Faithful Synthetic Data. 3236-3256 - Rongwen Zhao, Jeffrey Flanigan:
SYNTHVERIFY: Enhancing Zero-Shot Claim Verification through Step-by-Step Synthetic Data Generation. 3257-3274 - Xu Chu, Zhijie Tan, Hanlin Xue, Guanyu Wang, Tong Mo, Weiping Li:
Domaino1s: Guiding LLM Reasoning for Explainable Answers in High-Stakes Domains. 3275-3293 - Zihao Wu, YongXiang Hua, Yongxin Zhu, Fang Zhang, Linli Xu:
Dynamic Prefix as Instructor for Incremental Named Entity Recognition: A Unified Seq2Seq Generation Framework. 3294-3306 - Somin Wadhwa, Chantal Shaib, Silvio Amir, Byron C. Wallace:
Who Taught You That? Tracing Teachers in Model Distillation. 3307-3315 - Grace Byun, Jinho D. Choi:
D-GEN: Automatic Distractor Generation and Evaluation for Reliable Assessment of Generative Models. 3316-3349 - Jun Wang, Jiamu Zhou, Xihuai Wang, Xiaoyun Mo, Haoyu Zhang, Qiqiang Lin, Jincheng Jincheng, Muning Wen, Weinan Zhang, Qiuying Peng:
HammerBench: Fine-Grained Function-Calling Evaluation in Real Mobile Assistant Scenarios. 3350-3376 - Do Xuan Long, Duong Ngoc Yen, Do Xuan Trong, Anh Tuan Luu, Kenji Kawaguchi, Shafiq Joty, Min-Yen Kan, Nancy F. Chen:
Beyond In-Context Learning: Aligning Long-form Generation of Large Language Models via Task-Inherent Attribute Guidelines. 3377-3411 - Gabriele Tuccio, Luana Bulla, Maria Madonia, Aldo Gangemi, Misael Mongiovì:
GRAMMAR-LLM: Grammar-Constrained Natural Language Generation. 3412-3422 - Han Zhou, Qitong Xu, Yiheng Dong, Xin Yang:
MANBench: Is Your Multimodal Model Smarter than Human? 3423-3449 - Mahammed Kamruzzaman, Abdullah Al Monsur, Shrabon Kumar Das, Enamul Hassan, Gene Louis Kim:
BanStereoSet: A Dataset to Measure Stereotypical Social Biases in LLMs for Bangla. 3450-3460 - Matthieu Futeral, Armel Randy Zebaze, Pedro Ortiz Suarez, Julien Abadji, Rémi Lacroix, Cordelia Schmid, Rachel Bawden, Benoît Sagot:
mOSCAR: A Large-scale Multilingual and Multimodal Document-level Corpus. 3461-3494 - Vladislav Mikhailov, Tita Ranveig Enstad, David Samuel, Hans Christian Farsethås, Andrey Kutuzov, Erik Velldal, Lilja Øvrelid:
NorEval: A Norwegian Language Understanding and Generation Evaluation Benchmark. 3495-3541 - Thang Le, Huy Huu Nguyen, Anh Tuan Luu, Thien Huu Nguyen:
Massively Multilingual Instruction-Following Information Extraction. 3542-3585 - Kang He, Yuzhe Ding, Haining Wang, Fei Li, Chong Teng, Donghong Ji:
DALR: Dual-level Alignment Learning for Multimodal Sentence Representation Learning. 3586-3601 - Zhenyu Wang, Zikang Wang, Jiyue Jiang, Pengan Chen, Xiangyu Shi, Yu Li:
Large Language Models in Bioinformatics: A Survey. 3602-3615 - Xuanle Zhao, Xuexin Liu, Haoyue Yang, Xianzhen Luo, Fanhu Zeng, Jianling Li, Qi Shi, Chi Chen:
ChartEdit: How Far Are MLLMs From Automating Chart Analysis? Evaluating MLLMs' Capability via Chart Editing. 3616-3630 - Qin Liu, Chao Shang, Ling Liu, Nikolaos Pappas, Jie Ma, Neha Anna John, Srikanth Doss, Lluís Màrquez, Miguel Ballesteros, Yassine Benajiba:
Unraveling and Mitigating Safety Alignment Degradation of Vision-Language Models. 3631-3643 - Xiyue Zhu, Peng Tang, Haofu Liao, Srikar Appalaraju:
Turbocharging Web Automation: The Impact of Compressed History States. 3644-3651 - Weijie Shi, Hao Chen, Jiaming Li, Yao Zhao, Yazhong Zhang, Qijin Chen, Jipeng Zhang, Ruiyuan Zhang, Jia Zhu, Jiajie Xu, Xiaofang Zhou:
Making RALM Robust to Irrelevant Contexts via Layer Knowledge Guided Attention. 3652-3668 - Yuting Huang, Chengyuan Liu, Yifeng Feng, Yiquan Wu, Chao Wu, Fei Wu, Kun Kuang:
Rewrite to Jailbreak: Discover Learnable and Transferable Implicit Harmfulness Instruction. 3669-3690 - Mert Inan, Anthony Sicilia, Malihe Alikhani:
SignAlignLM: Integrating Multimodal Sign Language Processing into Large Language Models. 3691-3706 - Yuhui Zhang, Yuchang Su, Yiming Liu, Serena Yeung-Levy:
NegVQA: Can Vision Language Models Understand Negation? 3707-3716 - Debela Gemechu, Ramon Ruiz-Dolz, Henrike Beyer, Chris Reed:
Natural Language Reasoning in Large Language Models: Analysis and Evaluation. 3717-3741 - Haoran Wang, Zhenyu Hou, Yao Wei, Jie Tang, Yuxiao Dong:
SWE-Dev: Building Software Engineering Agents with Training and Inference Scaling. 3742-3761 - Janek Bevendorff, Matti Wiegmann, Emmelie Richter, Martin Potthast, Benno Stein:
The Two Paradigms of LLM Detection: Authorship Attribution vs Authorship Verification. 3762-3787 - Yue Wan, Xiaowei Jia, Xiang Lorraine Li:
Unveiling Confirmation Bias in Chain-of-Thought Reasoning. 3788-3804 - Mufan Qiu, Xinyu Hu, Fengwei Zhan, Sukwon Yun, Jie Peng, Ruichen Zhang, Bhavya Kailkhura, Jiekun Yang, Tianlong Chen:
GRNFormer: A Biologically-Guided Framework for Integrating Gene Regulatory Networks into RNA Foundation Models. 3805-3819 - Yihang Cheng, Lan Zhang, Junyang Wang, Mu Yuan, Yunhao Yao:
RemoteRAG: A Privacy-Preserving LLM Cloud RAG Service. 3820-3837 - Sharath Naganna, Saprativa Bhattacharjee, Biplab Banerjee, Pushpak Bhattacharyya:
"My life is miserable, have to sign 500 autographs everyday": Exposing Humblebragging, the Brags in Disguise. 3838-3858 - Xuanliang Zhang, Dingzirui Wang, Baoxin Wang, Longxu Dou, Xinyuan Lu, Keyan Xu, Dayong Wu, Qingfu Zhu:
SCITAT: A Question Answering Benchmark for Scientific Tables and Text Covering Diverse Reasoning Types. 3859-3881 - Yingtai Xiao, Yuqing Zhu, Sirat Samyoun, Wanrong Zhang, Jiachen T. Wang, Jian Du:
TokenShapley: Token Level Context Attribution with Shapley Value. 3882-3894 - Jinghan Zhang, Xiting Wang, Fengran Mo, Yeyang Zhou, Wanfu Gao, Kunpeng Liu:
Entropy-based Exploration Conduction for Multi-step Reasoning. 3895-3906 - Emily Corvi, Hannah Washington, Stefanie Reed, Chad Atalla, Alexandra Chouldechova, P. Alex Dow, Jean Garcia-Gathright, Nicholas J. Pangakis, Emily Sheng, Dan Vann, Matthew Vogel, Hanna M. Wallach:
Taxonomizing Representational Harms using Speech Act Theory. 3907-3932 - Prafulla Kumar Choubey, Xiangyu Peng, Shilpa Bhagavath, Caiming Xiong, Shiva Kumar Pentyala, Chien-Sheng Wu:
Turning Conversations into Workflows: A Framework to Extract and Evaluate Dialog Workflows for Service AI Agents. 3933-3954 - Hayden S. Helm, Aranyak Acharyya, Youngser Park, Brandon Duderstadt, Carey E. Priebe:
Statistical inference on black-box generative models in the data kernel perspective space. 3955-3970 - Sohee Yang, Nora Kassner, Elena Gribovskaya, Sebastian Riedel, Mor Geva:
Do Large Language Models Perform Latent Multi-Hop Reasoning without Exploiting Shortcuts? 3971-3992 - Zihan Liu, Yang Chen, Mohammad Shoeybi, Bryan Catanzaro, Wei Ping:
AceMath: Advancing Frontier Math Reasoning with Post-Training and Reward Modeling. 3993-4015 - Yongan Yu, Qingchen Hu, Xianda Du, Jiayin Wang, Fengran Mo, Renée Sieber:
WXImpactBench: A Disruptive Weather Impact Understanding Benchmark for Evaluating Large Language Models. 4016-4035 - Yun Zhang, Xue Geng, Lizi Liao, Jintong Sun, Minghe Yu, Ge Yu:
MeMoTune: A Measure and Moment-Driven Fine-Tuning Framework for Quantized Large Language Models. 4036-4050 - Sagi Shaier, George Arthur Baker, Chiranthan Sridhar, Lawrence Hunter, Katharina von der Wense:
MALAMUTE: A Multilingual, Highly-granular, Template-free, Education-based Probing Dataset. 4051-4069 - Xiaoyi Bao, Jinghang Gu, Zhongqing Wang, Chu-Ren Huang:
Sentimental Image Generation for Aspect-based Sentiment Analysis. 4070-4081 - Zihang Liu, Jiawei Guo, Hao Zhang, Hongyang Chen, Jiajun Bu, Haishuai Wang:
Long-form Hallucination Detection with Self-elicitation. 4082-4100 - Qing Zong, Zhaowei Wang, Tianshi Zheng, Xiyu Ren, Yangqiu Song:
ComparisonQA: Evaluating Factuality Robustness of LLMs Through Knowledge Frequency Control and Uncertainty. 4101-4117 - Rui He, Zhongqing Wang, Minjie Qiang, Hongling Wang, Yifan Zhang, Hua Xu, Shuai Fan, Guodong Zhou:
One-Dimensional Object Detection for Streaming Text Segmentation of Meeting Dialogue. 4118-4130 - Qingkai Zeng, Yuyang Bai, Zhaoxuan Tan, Zhenyu Wu, Shangbin Feng, Meng Jiang:
CodeTaxo: Enhancing Taxonomy Expansion with Limited Examples via Code Language Prompts. 4131-4144 - Yuqicheng Zhu, Daniel Hernández, Yuan He, Zifeng Ding, Bo Xiong, Evgeny Kharlamov, Steffen Staab:
Predicate-Conditional Conformalized Answer Sets for Knowledge Graph Embeddings. 4145-4167 - Yifan Zhang, Yifan Luo, Yang Yuan, Andrew C. Yao:
Autonomous Data Selection with Zero-shot Generative Classifiers for Mathematical Texts. 4168-4189 - Zhuochun Li, Yuelyu Ji, Rui Meng, Daqing He:
Learning from Committee: Reasoning Distillation from a Mixture of Teachers with Peer-Review. 4190-4205 - Orchid Chetia Phukan, Drishti Singh, Swarup Ranjan Behera, Arun Balaji Buduru, Rajesh Sharma:
Investigating Prosodic Signatures via Speech Pre-Trained Models for Audio Deepfake Source Attribution. 4206-4214 - Bryan Li, Fiona Luo, Samar Haider, Adwait Agashe, Siyu Li, Runqi Liu, Miranda Muqing Miao, Shriya Ramakrishnan, Yuan Yuan, Chris Callison-Burch:
Multilingual Retrieval Augmented Generation for Culturally-Sensitive Tasks: A Benchmark for Cross-lingual Robustness. 4215-4241 - Pengyue Jia, Derong Xu, Xiaopeng Li, Zhaocheng Du, Xiangyang Li, Yichao Wang, Yuhao Wang, Qidong Liu, Maolin Wang, Huifeng Guo, Ruiming Tang, Xiangyu Zhao:
Bridging Relevance and Reasoning: Rationale Distillation in Retrieval-Augmented Generation. 4242-4256 - Yifei He, Alon Benhaim, Barun Patra, Praneetha Vaddamanu, Sanchit Ahuja, Parul Chopra, Vishrav Chaudhary, Han Zhao, Xia Song:
Scaling Laws for Multilingual Language Models. 4257-4273 - Jinyan Su, Preslav Nakov, Claire Cardie:
Corpus Poisoning via Approximate Greedy Gradient Descent. 4274-4294 - Huitong Pan, Qi Zhang, Mustapha Adamu, Eduard C. Dragut, Longin Jan Latecki:
Taxonomy-Driven Knowledge Graph Construction for Domain-Specific Scientific Applications. 4295-4320 - Yifan Yang, Kai Zhen, Bhavana Ganesh, Aram Galstyan, Goeric Huybrechts, Markus Müller, Jonas M. Kübler, Rupak Vignesh Swaminathan, Athanasios Mouchtaris, Sravan Babu Bodapati, Nathan Susanj, Zheng Zhang, Jack FitzGerald, Abhishek Kumar:
Wanda++: Pruning Large Language Models via Regional Gradients. 4321-4333 - Vageesh Kumar Saxena, Benjamin Bashpole, Gijs van Dijck, Gerasimos Spanakis:
MATCHED: Multimodal Authorship-Attribution To Combat Human Trafficking in Escort-Advertisement Data. 4334-4373 - Shu Yang, Shenzhe Zhu, Zeyu Wu, Keyu Wang, Junchi Yao, Junchao Wu, Lijie Hu, Mengdi Li, Derek F. Wong, Di Wang:
Fraud-R1: A Multi-Round Benchmark for Assessing the Robustness of LLM Against Augmented Fraud and Phishing Inducements. 4374-4420 - Rafael Alberto Rivera Soto, Barry Y. Chen, Nicholas Andrews:
Mitigating Paraphrase Attacks on Machine-Text Detection via Paraphrase Inversion. 4421-4433 - Arijit Maji, Raghvendra Kumar, Akash Ghosh, Anushka, Sriparna Saha:
SANSKRITI: A Comprehensive Benchmark for Evaluating Language Models' Knowledge of Indian Culture. 4434-4451 - Lu Yan, Siyuan Cheng, Xuan Chen, Kaiyuan Zhang, Guangyu Shen, Xiangyu Zhang:
System Prompt Hijacking via Permutation Triggers in LLM Supply Chains. 4452-4473 - Akhilesh Kakolu Ramarao, Kevin Tang, Dinah Baer-Henney:
Frequency matters: Modeling irregular morphological patterns in Spanish with Transformers. 4474-4489 - Gyeongeun Lee, Zhu Wang, Sathya N. Ravi, Natalie Parde:
From Heart to Words: Generating Empathetic Responses via Integrated Figurative Language and Semantic Context Signals. 4490-4502 - Nurul Fajrin Ariyani, Zied Bouraoui, Richard Booth, Steven Schockaert:
There's No Such Thing as Simple Reasoning for LLMs. 4503-4514 - Aaron Gluck, Katharina von der Wense, Maria Leonor Pacheco:
CLIX: Cross-Lingual Explanations of Idiomatic Expressions. 4515-4529 - Dang Nguyen, Ali Payani, Baharan Mirzasoleiman:
Beyond Semantic Entropy: Boosting LLM Uncertainty Quantification with Pairwise Semantic Similarity. 4530-4540 - Xiaoqiang Wang, Suyuchen Wang, Yun Zhu, Bang Liu:
R³Mem: Bridging Memory Retention and Retrieval via Reversible Compression. 4541-4557 - Tiejin Chen, Pingzhi Li, Kaixiong Zhou, Tianlong Chen, Hua Wei:
Vision Language Model Helps Private Information De-Identification in Vision Data. 4558-4572 - Tiejin Chen, Pingzhi Li, Kaixiong Zhou, Tianlong Chen, Hua Wei:
Unveiling Privacy Risks in Multi-modal Large Language Models: Task-specific Vulnerabilities and Mitigation Challenges. 4573-4586 - Yebowen Hu, Xiaoyang Wang, Wenlin Yao, Yiming Lu, Daoan Zhang, Hassan Foroosh, Dong Yu, Fei Liu:
DeFine: Decision-Making with Analogical Reasoning over Factor Profiles. 4587-4603 - Cheng Qian, Emre Can Acikgoz, Hongru Wang, Xiusi Chen, Avirup Sil, Dilek Hakkani-Tür, Gokhan Tur, Heng Ji:
SMART: Self-Aware Agent for Tool Overuse Mitigation. 4604-4621 - Pablo Rodríguez, Silvia Paniagua Suárez, Pablo Gamallo, Susana Sotelo Docío:
Continued Pretraining and Interpretability-Based Evaluation for Low-Resource Languages: A Galician Case Study. 4622-4637 - Weixi Feng, Jiachen Li, Michael Saxon, Tsu-Jui Fu, Wenhu Chen, William Yang Wang:
TC-Bench: Benchmarking Temporal Compositionality in Conditional Video Generation. 4638-4662 - Hanzhi Zhang, Heng Fan, Kewei Sha, Yan Huang, Yunhe Feng:
DAM: Dynamic Attention Mask for Long-Context Large Language Model Inference Acceleration. 4663-4676 - Bhaktipriya Radharapu, Manon Revel, Megan Ung, Sebastian Ruder, Adina Williams:
Arbiters of Ambivalence: Challenges of using LLMs in No-Consensus tasks. 4677-4731 - Sireesh Gururaja, Nupoor Gandhi, Jeremiah Milbauer, Emma Strubell:
Beyond Text: Characterizing Domain Expert Needs in Document Research. 4732-4745 - Murong Yue, Ziyu Yao:
Efficient but Vulnerable: Benchmarking and Defending LLM Batch Prompting Attack. 4746-4761 - Shih-Han Chou, Shivam Chandhok, Jim Little, Leonid Sigal:
MM-R³: On (In-)Consistency of Vision-Language Models (VLMs). 4762-4788 - Yuepei Li, Kang Zhou, Qiao Qiao, Bach Nguyen, Qing Wang, Qi Li:
Investigating Context Faithfulness in Large Language Models: The Roles of Memory Strength and Evidence Style. 4789-4807 - Ziyi Yin, Muchao Ye, Yuanpu Cao, Jiaqi Wang, Aofei Chang, Han Liu, Jinghui Chen, Ting Wang, Fenglong Ma:
Shadow-Activated Backdoor Attacks on Multimodal Large Language Models. 4808-4829 - Kung-Hsiang Huang, Can Qin, Haoyi Qiu, Philippe Laban, Shafiq Joty, Caiming Xiong, Chien-Sheng Wu:
Why Vision Language Models Struggle with Visual Arithmetic? Towards Enhanced Chart and Geometry Understanding. 4830-4843 - Shihao Cai, Chongming Gao, Yang Zhang, Wentao Shi, Jizhi Zhang, Keqin Bao, Qifan Wang, Fuli Feng:
K-order Ranking Preference Optimization for Large Language Models. 4844-4859 - Xuyuan Liu, Lei Hsiung, Yaoqing Yang, Yujun Yan:
Spectral Insights into Data-Oblivious Critical Layers in Large Language Models. 4860-4877 - Xunzhu Tang, Jiechao Gao, Jin Xu, Tiezhu Sun, Yewei Song, Saad Ezzini, Wendkûuni C. Ouédraogo, Jacques Klein, Tegawendé F. Bissyandé:
SynFix: Dependency-Aware Program Repair via RelationGraph Analysis. 4878-4894 - Taeho Hwang, Sukmin Cho, Soyeong Jeong, Hoyun Song, SeungYoon Han, Jong C. Park:
EXIT: Context-Aware Extractive Compression for Enhancing Retrieval-Augmented Generation. 4895-4924 - Zhihu Wang, Shiwan Zhao, Yu Wang, Heyuan Huang, Sitao Xie, Yubo Zhang, Jiaxin Shi, Zhixing Wang, Hongyan Li, Junchi Yan:
Re-TASK: Revisiting LLM Tasks from Capability, Skill, and Knowledge Perspectives. 4925-4936 - Shuai Zhao, Xiaobao Wu, Cong-Duy T. Nguyen, Yanhao Jia, Meihuizi Jia, Yichao Feng, Anh Tuan Luu:
Unlearning Backdoor Attacks for LLMs with Weak-to-Strong Knowledge Distillation. 4937-4952 - Shuhe Wang, Guoyin Wang, Yizhong Wang, Jiwei Li, Eduard H. Hovy, Chen Guo:
Packing Analysis: Packing Is More Appropriate for Large Models or Datasets in Supervised Fine-tuning. 4953-4967 - Yongkang Chen, Chongyang Zhao, Jianwentian Jianwentian, Guiling Cao, Hu Li, Xiaohui Kuang:
Better Red Teaming via Searching with Large Language Model. 4968-4984 - Jiayi Han, Liang Du, Yiwen Wu, Guanming Liang, Xiangguo Zhou, Weibo Zheng, Donghong Han, Zixun Sun:
AdaV: Adaptive Text-visual Redirection for Vision-Language Models. 4985-4997 - Qian Wang, Tianyu Wang, Zhenheng Tang, Qinbin Li, Nuo Chen, Jingsheng Liang, Bingsheng He:
MegaAgent: A Large-Scale Autonomous LLM-based Multi-Agent System Without Predefined SOPs. 4998-5036 - Xiaotian Zhang, Ruizhe Chen, Yang Feng, Zuozhu Liu:
Persona-judge: Personalized Alignment of Large Language Models via Token-level Self-judgment. 5037-5049 - Hongfei Xu, Zhuofei Liang, Qiuhui Liu, Lingling Mu:
A Self-Distillation Recipe for Neural Machine Translation. 5050-5064 - Longguang Zhong, Fanqi Wan, Ruijun Chen, Xiaojun Quan, Liangzhi Li:
BlockPruner: Fine-grained Pruning for Large Language Models. 5065-5080 - Yuchen Wen, Keping Bi, Wei Chen, Jiafeng Guo, Xueqi Cheng:
Evaluating Implicit Bias in Large Language Models by Attacking From a Psychometric Perspective. 5081-5097 - Jiajie Zhang, Yushi Bai, Xin Lv, Wanjun Gu, Danqing Liu, Minhao Zou, Shulin Cao, Lei Hou, Yuxiao Dong, Ling Feng, Juanzi Li:
LongCite: Enabling LLMs to Generate Fine-grained Citations in Long-Context QA. 5098-5122 - Min Choi, Keonwoo Kim, Sungwon Chae, Sangyeob Baek:
An Empirical Study of Group Conformity in Multi-Agent Systems. 5123-5139 - Zhanglin Wu, Daimeng Wei, Xiaoyu Chen, Hengchao Shang, Jiaxin Guo, Zongyao Li, Yuanchang Luo, Jinlong Yang, Zhiqiang Rao, Hao Yang:
Combining the Best of Both Worlds: A Method for Hybrid NMT and LLM Translation. 5140-5148 - Yeyuan Wang, Dehong Gao, Rujiao Long, Lei Yi, Linbo Jin, Libin Yang, Xiaoyan Cai:
ASPO: Adaptive Sentence-Level Preference Optimization for Fine-Grained Multimodal Reasoning. 5149-5160 - Meihan Tong, Shuai Wang:
NovelCR: A Large-Scale Bilingual Dataset Tailored for Long-Span Coreference Resolution. 5161-5173 - Huangyw Huangyw, Yong Zhang, Ning Cheng, Zhitao Li, Shaojun Wang, Jing Xiao:
Dynamic Attention-Guided Context Decoding for Mitigating Context Faithfulness Hallucinations in Large Language Models. 5174-5193 - Weidong Wu, Qinlin Zhao, Hao Chen, Lexin Zhou, Defu Lian, Hong Xie:
Exploring the Choice Behavior of Large Language Models. 5194-5214 - Xueru Wen, Jie Lou, Xinyu Lu, Yuqiu Ji, Xinyan Guan, Yaojie Lu, Hongyu Lin, Ben He, Xianpei Han, Debing Zhang, Le Sun:
On-Policy Self-Alignment with Fine-grained Knowledge Feedback for Hallucination Mitigation. 5215-5231 - Yurun Song, Xiangqing Shen, Rui Xia:
From Phrases to Subgraphs: Fine-Grained Semantic Parsing for Knowledge Graph Question Answering. 5232-5246 - Zhicheng Guo, Sijie Cheng, Yuchen Niu, Hao Wang, Sicheng Zhou, Wenbing Huang, Yang Liu:
StableToolBench-MirrorAPI: Modeling Tool Environments as Mirrors of 7,000+ Real-World APIs. 5247-5270 - Hoang Pham, Thanh-Do Nguyen, Khac-Hoai Nam Bui:
ClaimPKG: Enhancing Claim Verification via Pseudo-Subgraph Generation with Lightweight Specialized LLM. 5271-5290 - Baizhou Huang, Xiaojun Wan:
TriEmbed: Bridge the Gap between Text and Token Indices with Embedding Reparameterization. 5291-5297 - Cong Liu, Jie Wu, Weigang Wu, Xu Chen, Liang Lin, Wei-Shi Zheng:
Chain of Methodologies: Scaling Test Time Computation without Training. 5298-5312 - Jian Guan, Junfei Wu, Jia-Nan Li, Chuanqi Cheng, Wei Wu:
A Survey on Personalized Alignment - The Missing Piece for Large Language Models in Real-World Applications. 5313-5333 - Chenhao Ding, Jiangyang Li, Songlin Dong, Xinyuan Gao, Yuhang He, Yihong Gong:
SuLoRA: Subspace Low-Rank Adaptation for Parameter-Efficient Fine-Tuning. 5334-5349 - Yeong-Joon Ju, Ho-Joong Kim, Seong-Whan Lee:
MIRe: Enhancing Multimodal Queries Representation via Fusion-Free Modality Interaction for Multimodal Retrieval. 5350-5363 - Ruilin Zhao, Feng Zhao, Hong Zhang:
Correcting on Graph: Faithful Semantic Parsing over Knowledge Graphs with Large Language Models. 5364-5376 - Han Zhang, Lin Gui, Yu Lei, Yuanzhao Zhai, Yehong Zhang, Zhuo Zhang, Yulan He, Hui Wang, Yue Yu, Kam-Fai Wong, Bin Liang, Ruifeng Xu:
COPR: Continual Human Preference Learning via Optimal Policy Regularization. 5377-5398 - Jie Sun, Junkang Wu, Jiancan Wu, Zhibo Zhu, Xingyu Lu, Jun Zhou, Lintao Ma, Xiang Wang:
Robust Preference Optimization via Dynamic Target Margins. 5399-5416 - Xiao Wang, Qingyi Si, Shiyu Zhu, Jianlong Wu, Li Cao, Liqiang Nie:
AdaReTaKe: Adaptive Redundancy Reduction to Perceive Longer for Video-language Understanding. 5417-5432 - Hongru Wang, Wenyu Huang, Yufei Wang, Yuanhao Xi, Jianqiao Lu, Huan Zhang, Nan Hu, Zeming Liu, Jeff Z. Pan, Kam-Fai Wong:
Rethinking Stateful Tool Use in Multi-Turn Dialogues: Benchmarks and Challenges. 5433-5453 - Xiaochong Lan, Jie Feng, Yizhou Sun, Chen Gao, Jiahuan Lei, Xinleishi Xinleishi, Hengliang Luo, Yong Li:
Open-Set Living Need Prediction with Large Language Models. 5454-5472 - Ziyang Huang, Wangtao Sun, Jun Zhao, Kang Liu:
Improve Rule Retrieval and Reasoning with Self-Induction and Relevance ReEstimate. 5473-5488 - Mehdi Jafari, Yuncheng Hua, Hao Xue, Flora D. Salim:
Beyond Words: Integrating Theory of Mind into Conversational Agents for Human-Like Belief, Desire, and Intention Alignment. 5489-5508 - Zhiyuan Li, Heng Wang, Dongnan Liu, Chaoyi Zhang, Ao Ma, Jieting Long, Weidong Cai:
Multimodal Causal Reasoning Benchmark: Challenging Multimodal Large Language Models to Discern Causal Links Across Modalities. 5509-5533 - Litu Ou, Mirella Lapata:
Context-Aware Hierarchical Merging for Long Document Summarization. 5534-5561 - Xiangqing Shen, Fanfan Wang, Siwei Wu, Rui Xia:
VCD: A Dataset for Visual Commonsense Discovery in Images. 5562-5577 - Hongru Wang, Deng Cai, Wanjun Zhong, Shijue Huang, Jeff Z. Pan, Zeming Liu, Kam-Fai Wong:
Self-Reasoning Language Models: Unfold Hidden Reasoning Chains with Few Reasoning Catalyst. 5578-5596 - Yongsen Zheng, Mingjie Qian, Guohua Wang, Yang Liu, Ziliang Chen, Mingzhi Mao, Liang Lin, Kwok-Yan Lam:
HyperCRS: Hypergraph-Aware Multi-Grained Preference Learning to Burst Filter Bubbles in Conversational Recommendation System. 5597-5608 - Junyu Lu, Kai Ma, Kaichun Wang, Kelaiti Xiao, Roy Ka-Wei Lee, Bo Xu, Liang Yang, Hongfei Lin:
Is LLM an Overconfident Judge? Unveiling the Capabilities of LLMs in Detecting Offensive Language with Annotation Disagreement. 5609-5626 - Kumara Kahatapitiya, Kanchana Ranasinghe, Jongwoo Park, Michael S. Ryoo:
Language Repository for Long Video Understanding. 5627-5646 - Jeonghyun Park, Hwanhee Lee:
Investigating Language Preference of Multilingual RAG Systems. 5647-5675 - Mei Guo, Chen Chen, Chunyan Hou, Yike Wu, Xiaojie Yuan:
FGDGNN: Fine-Grained Dynamic Graph Neural Network for Rumor Detection on Social Media. 5676-5687 - Xiaoying Zhang, Baolin Peng, Ye Tian, Jingyan Zhou, Yipeng Zhang, Haitao Mi, Helen M. Meng:
Self-Tuning: Instructing LLMs to Effectively Acquire New Knowledge through Self-Teaching. 5688-5724 - Qingsong Zou, Jingyu Xiao, Qing Li, Zhi Yan, Yuhang Wang, Li Xu, Wenxuan Wang, Kuofeng Gao, Ruoyu Li, Yong Jiang:
QueryAttack: Jailbreaking Aligned Large Language Models Using Structured Non-natural Query Language. 5725-5741 - Chengzhi Li, Heyan Huang, Ping Jian, Zhen Yang, Chenxu Wang, Yifan Wang:
Memory or Reasoning? Explore How LLMs Compute Mixed Arithmetic Expressions. 5742-5763 - Yunxiao Shi, Wujiang Xu, Zeqi Zhang, Xing Zi, Qiang Wu, Min Xu:
PersonaX: A Recommendation Agent-Oriented User Modeling Framework for Long Behavior Sequence. 5764-5787 - Shuliang Liu, Xinze Li, Zhenghao Liu, Yukun Yan, Cheng Yang, Zheni Zeng, Zhiyuan Liu, Maosong Sun, Ge Yu:
Judge as A Judge: Improving the Evaluation of Retrieval-Augmented Generation through the Judge-Consistency of Large Language Models. 5788-5807 - Chiwei Zhu, Benfeng Xu, An Yang, Junyang Lin, Quan Wang, Chang Zhou, Zhendong Mao:
Rationales Are Not Silver Bullets: Measuring the Impact of Rationales on Model Performance and Reliability. 5808-5835 - Heng Yu, Junfeng Kang, Rui Li, Qi Liu, Liyang He, Zhenya Huang, Shuanghong Shen, Junyu Lu:
CA-GAR: Context-Aware Alignment of LLM Generation for Document Retrieval. 5836-5849 - Guhong Chen, Liyang Fan, Zihan Gong, Nan Xie, Zixuan Li, Ziqiang Liu, Chengming Li, Qiang Qu, Hamid Alinejad-Rokny, Shiwen Ni, Min Yang:
AgentCourt: Simulating Court with Adversarial Evolvable Lawyer Agents. 5850-5865 - Jinyang Huang, Xiachong Feng, Qiguang Chen, Hanjie Zhao, Zihui Cheng, Jiesong Bai, Jingxuan Zhou, Min Li, Libo Qin:
MLDebugging: Towards Benchmarking Code Debugging Across Multi-Library Scenarios. 5866-5879 - Hui Huang, Xingyuan Bu, Hongli Zhou, Yingqi Qu, Jing Liu, Muyun Yang, Bing Xu, Tiejun Zhao:
An Empirical Study of LLM-as-a-Judge for LLM Evaluation: Fine-tuned Judge Model is not a General Substitute for GPT-4. 5880-5895 - Xueyang Feng, Jingsen Zhang, Jiakai Tang, Wei Li, Guohao Cai, Xu Chen, Quanyu Dai, Yue Zhu, Zhenhua Dong:
Expectation Confirmation Preference Optimization for Multi-Turn Conversational Recommendation Agent. 5896-5914 - Shuai Niu, Jing Ma, Hongzhan Lin, Liang Bai, Zhihua Wang, Wei Bi, Richard Yi Da Xu, Guo Li, Xian Yang:
ProMedTS: A Self-Supervised, Prompt-Guided Multimodal Approach for Integrating Medical Text and Time Series. 5915-5928 - Yu Li, Qizhi Pei, Mengyuan Sun, Honglin Lin, Chenlin Ming, Xin Gao, Jiang Wu, Conghui He, Lijun Wu:
CipherBank: Exploring the Boundary of LLM Reasoning Capabilities through Cryptography Challenge. 5929-5965 - Hwan Chang, Hwanhee Lee:
Which Retain Set Matters for LLM Unlearning? A Case Study on Entity Unlearning. 5966-5982 - Wenhao Liu, Siyu An, Junru Lu, Muling Wu, Tianlong Li, Xiaohua Wang, Changze Lv, Xiaoqing Zheng, Di Yin, Xing Sun, Xuanjing Huang:
Tell Me What You Don't Know: Enhancing Refusal Capabilities of Role-Playing Agents via Representation Space Analysis and Editing. 5983-6005 - Jianghao Chen, Zhenlin Wei, Zhenjiang Ren, Ziyong Li, Jiajun Zhang:
LR²Bench: Evaluating Long-chain Reflective Reasoning Capabilities of Large Language Models via Constraint Satisfaction Problems. 6006-6032 - Tian Lan, Xiangdong Su, Xu Liu, Ruirui Wang, Ke Chang, Jiang Li, Guanglai Gao:
McBE: A Multi-task Chinese Bias Evaluation Benchmark for Large Language Models. 6033-6056 - Yiwei Fu, Yuxing Zhang, Chunchun Chen, Jianwen Ma, Quan Yuan, Rong-Cheng Tu, Xinli Huang, Wei Ye, Xiao Luo, Minghua Deng:
MARK: Multi-agent Collaboration with Ranking Guidance for Text-attributed Graph Clustering. 6057-6072 - Jingbao Luo, Ming Liu, Ran Liu, Yongpan Sheng, Xin Hu, Gang Li, Peng Wu:
Can Language Models Capture Human Writing Preferences for Domain-Specific Text Summarization? 6073-6091 - Yijiong Yu, Huiqiang Jiang, Xufang Luo, Qianhui Wu, Chin-Yew Lin, Dongsheng Li, Yuqing Yang, Yongfeng Huang, Lili Qiu:
Mitigate Position Bias in LLMs via Scaling a Single Hidden States Channel. 6092-6111 - Ruiqiao Bai, Xue Han, Shuo Lei, Junlan Feng, Yanyan Luo, Chao Deng:
Self-attention-based Graph-of-Thought for Math Problem Solving. 6112-6125 - Weihong Du, Wenrui Liao, Binyu Yan, Hongru Liang, Anthony G. Cohn, Wenqiang Lei:
BAR: A Backward Reasoning based Agent for Complex Minecraft Tasks. 6126-6149 - Jiakai Tang, Shiqi Shen, Zhipeng Wang, Gong Zhi, Xueyang Feng, Zexu Sun, Haoran Tan, Xu Chen:
KAPA: A Deliberative Agent Framework with Tree-Structured Knowledge Base for Multi-Domain User Intent Understanding. 6150-6166 - Guofeng Quan, Wenfeng Feng, Chuzhan Hao, Guochao Jiang, Yuewei Zhang, Hao Henry Wang:
RASD: Retrieval-Augmented Speculative Decoding. 6167-6177 - Zengyi Gao, Yukun Cao, Hairu Wang, Ao Ke, Yuan Feng, S. Kevin Zhou, Xike Xie:
FRAG: A Flexible Modular Framework for Retrieval-Augmented Generation based on Knowledge Graphs. 6178-6192 - Kening Zheng, Junkai Chen, Yibo Yan, Xin Zou, Huiyu Zhou, Xuming Hu:
Reefknot: A Comprehensive Benchmark for Relation Hallucination Evaluation, Analysis and Mitigation in Multimodal Large Language Models. 6193-6212 - Yilei Tu, Andrew Xue, Freda Shi:
Blessing of Multilinguality: A Systematic Analysis of Multilingual In-Context Learning. 6213-6248 - Lishui Fan, Mouxiang Chen, Zhongxin Liu:
SEK: Self-Explained Keywords Empower Large Language Models for Code Generation. 6249-6278 - Peng Ding, Jun Kuang, ZongYu Wang, Xuezhi Cao, Xunliang Cai, Jiajun Chen, Shujian Huang:
Why Not Act on What You Know? Unleashing Safety Potential of LLMs via Self-Aware Guard Enhancement. 6279-6299 - Vardaan Pahuja, Yadong Lu, Corby Rosset, Boyu Gou, Arindam Mitra, Spencer Whitehead, Yu Su, Ahmed Hassan Awadallah:
Explorer: Scaling Exploration-driven Web Trajectory Synthesis for Multimodal Web Agents. 6300-6323 - Zhanpeng Chen, Mingxiao Li, Ziyang Chen, Nan Du, Xiaolong Li, Yuexian Zou:
Advancing General Multimodal Capability of Vision-language Models with Pyramid-descent Visual Position Encoding. 6324-6341 - Yuhao Dan, Jie Zhou, Qin Chen, Junfeng Tian, Liang He:
P-React: Synthesizing Topic-Adaptive Reactions of Personality Traits via Mixture of Specialized LoRA Experts. 6342-6362 - Jiamin Su, Yibo Yan, Fangteng Fu, Zhang Han, Jingheng Ye, Xiang Liu, Jiahao Huo, Huiyu Zhou, Xuming Hu:
EssayJudge: A Multi-Granular Benchmark for Assessing Automated Essay Scoring Capabilities of Multimodal Large Language Models. 6363-6389 - Yuanjie Lyu, Chao Zhang, Yuhao Chen, Yong Chen, Tong Xu:
Streamlining the Collaborative Chain of Models into A Single Forward Pass in Generation-Based Tasks. 6390-6404 - Jiayi He, Hehai Lin, Qingyun Wang, Yi R. Fung, Heng Ji:
Self-Correction is More than Refinement: A Learning Framework for Visual and Language Reasoning Tasks. 6405-6421 - Chenkai Sun, Denghui Zhang, ChengXiang Zhai, Heng Ji:
Beyond Reactive Safety: Risk-Aware LLM Alignment via Long-Horizon Simulation. 6422-6434 - Yunqiao Yang, Houxing Ren, Zimu Lu, Ke Wang, Weikang Shi, Aojun Zhou, Junting Pan, Mingjie Zhan, Hongsheng Li:
Probability-Consistent Preference Optimization for Enhanced LLM Reasoning. 6435-6448 - Hongcheng Guo, Wei Zhang, Junhao Chen, Yaonan Gu, Jian Yang, Junjia Du, Shaosheng Cao, Binyuan Hui, Tianyu Liu, Jianxin Ma, Chang Zhou, Zhoujun Li:
IW-Bench: Evaluating Large Multimodal Models for Converting Image-to-Web. 6449-6466 - Fan Gao, Jieyang Peng, Xiaoming Tao, Youzheng Wang:
TDCSA: LLM-Guided Top-Down Approach for Robust Citation Sentiment Analysis. 6467-6484 - Yi Liu, Hongji Zhang, Yunhao Zhou, Zhengyuan Shi, Changran Xu, Qiang Xu:
DeepRTL2: A Versatile Model for RTL-Related Tasks. 6485-6500 - Yutao Sun, Mingshuai Chen, Tiancheng Zhao, Ruochen Xu, Zilun Zhang, Jianwei Yin:
The Self-Improvement Paradox: Can Language Models Bootstrap Reasoning Capabilities without External Scaffolding? 6501-6512 - Long Chen, Shuoyu Guan, Xiaohua Huang, Wen-Jing Wang, Cai Xu, Ziyu Guan, Wei Zhao:
Cross-lingual Multimodal Sentiment Analysis for Low-Resource Languages via Language Family Disentanglement and Rethinking Transfer. 6513-6522 - Chengda Lu, Xiaoyu Fan, Yu Huang, Rongwu Xu, Jijie Li, Wei Xu:
Does Chain-of-Thought Reasoning Really Reduce Harmfulness from Jailbreaking? 6523-6546 - Yuhang Zang, Xiaoyi Dong, Pan Zhang, Yuhang Cao, Ziyu Liu, Shengyuan Ding, Shenxi Wu, Yubo Ma, Haodong Duan, Wenwei Zhang, Kai Chen, Dahua Lin, Jiaqi Wang:
InternLM-XComposer2.5-Reward: A Simple Yet Effective Multi-Modal Reward Model. 6547-6563 - Junjie Li, Nan Zhang, Xiaoyang Qu, Kai Lu, Guokuan Li, Jiguang Wan, Jianzong Wang:
RATE-Nav: Region-Aware Termination Enhancement for Zero-shot Object Navigation with Vision-Language Models. 6564-6574 - Zhentao Xie, Chengcheng Han, Jinxin Shi, Wenjun Cui, Xin Zhao, Xingjiao Wu, Jiabao Zhao:
RMoA: Optimizing Mixture-of-Agents through Diversity Maximization and Residual Compensation. 6575-6602 - Yuxin Jiang, Yufei Wang, Chuhan Wu, Xinyi Dai, Yan Xu, Weinan Gan, Yasheng Wang, Xin Jiang, Lifeng Shang, Ruiming Tang, Wei Wang:
Instruction-Tuning Data Synthesis from Scratch via Web Reconstruction. 6603-6618 - Lian Yan, Chen Tang, Yi Guan, Haotian Wang, Songyuan Wang, Haifeng Liu, Yang Yang, Jingchi Jiang:
RLKGF: Reinforcement Learning from Knowledge Graph Feedback Without Human Annotations. 6619-6633 - Baturay Saglam, Xinyang Hu, Zhuoran Yang, Dionysis Kalogerias, Amin Karbasi:
Learning Task Representations from In-Context Learning. 6634-6663 - Xiaohu Li, Yunfeng Ning, Zepeng Bao, Mayi Xu, Jianhao Chen, Tieyun Qian:
CAVGAN: Unifying Jailbreak and Defense of LLMs via Generative Adversarial Attacks on their Internal Representations. 6664-6678 - Yubo Li, Yidi Miao, Xueying Ding, Ramayya Krishnan, Rema Padman:
Firm or Fickle? Evaluating Large Language Models Consistency in Sequential Interactions. 6679-6700 - Pengzhou Cheng, Zheng Wu, Zongru Wu, Tianjie Ju, Aston Zhang, Zhuosheng Zhang, Gongshen Liu:
OS-Kairos: Adaptive Interaction for MLLM-Powered GUI Agents. 6701-6725 - Pengfei He, Yuping Lin, Shen Dong, Han Xu, Yue Xing, Hui Liu:
Red-Teaming LLM Multi-Agent Systems via Communication Attacks. 6726-6747 - Zhihong Zhu, Yunyan Zhang, Xianwei Zhuang, Fan Zhang, Zhongwei Wan, Yuyan Chen, Qingqing Long, Yefeng Zheng, Xian Wu:
Can We Trust AI Doctors? A Survey of Medical Hallucination in Large Language and Large Vision-Language Models. 6748-6769 - Jiaan Wang, Fandong Meng, Yunlong Liang, Jie Zhou:
DRT: Deep Reasoning Translation via Long Chain-of-Thought. 6770-6782 - Fuying Wang, Feng Wu, Yihan Tang, Lequan Yu:
CTPD: Cross-Modal Temporal Pattern Discovery for Enhanced Multimodal Electronic Health Records Analysis. 6783-6799 - Dong Zhang, Haiyan Tian, Qingying Sun, Shoushan Li:
Vision-aided Unsupervised Constituency Parsing with Multi-MLLM Debating. 6800-6810 - Bingsen Chen, Shenji Wan, Xi Ye, Chen Zhao:
Inter-Passage Verification for Multi-evidence Multi-answer QA. 6811-6829 - Alan Chi-Man Lee, Wing-Sun Cheng, Calvin Chun-Kit Chan:
PROMTEC: Fast LLM Inference Decoding using Prompt Multi-Lookup with Template Database and Common Sequences. 6830-6842 - Haoqi Zheng, Dong Wang, Silin Yang, Yunpeng Qi, Ruochun Jin, Liyang Xu:
Logical DA: Enhancing Data Augmentation for Logical Reasoning via a Multi-Agent System. 6843-6855 - Yubai Wei, Jiale Han, Yi Yang:
Adapting General-Purpose Embedding Models to Private Datasets Using Keyword-based Retrieval. 6856-6870 - Jiawei Zhao, Kejiang Chen, Weiming Zhang, Nenghai Yu:
SQL Injection Jailbreak: A Structural Disaster of Large Language Models. 6871-6891 - Jaewoo Lee, Keyang Xuan, Chanakya Ekbote, Sandeep Polisetty, Yi R. Fung, Paul Pu Liang:
TAMP: Token-Adaptive Layerwise Pruning in Multimodal Large Language Models. 6892-6908 - Zihao Wang, Jiaxing Yu, Haoxuan Liu, Zehui Zheng, Yuhang Jin, Shuyu Li, Shulei Ji, Kejun Zhang:
Generative Music Models' Alignment with Professional and Amateur Users' Expectations. 6909-6920 - Xinrui He, Yikun Ban, Jiaru Zou, Tianxin Wei, Curtiss B. Cook, Jingrui He:
LLM-Forest: Ensemble Learning of LLMs with Graph-Augmented Prompts for Data Imputation. 6921-6936 - Yingjie Li, Yun Luo, Xiaotian Xie, Yue Zhang:
Task Calibration: Calibrating Large Language Models on Inference Tasks. 6937-6951 - Duy A. Nguyen, Rishi Kesav Mohan, Shimeng Yang, Pritom Saha Akash, Kevin Chen-Chuan Chang:
MiniELM: A Lightweight and Adaptive Query Rewriting Framework for E-Commerce Search Optimization. 6952-6964 - Ivory Yang, Chunhui Zhang, Yuxin Wang, Zhongyu Ouyang, Soroush Vosoughi:
Visibility as Survival: Generalizing NLP for Native Alaskan Language Identification. 6965-6979 - Zhangchen Xu, Yang Liu, Yueqin Yin, Mingyuan Zhou, Radha Poovendran:
KodCode: A Diverse, Challenging, and Verifiable Synthetic Dataset for Coding. 6980-7008 - Xiaochuan Liu, Ruihua Song, Xiting Wang, Xu Chen:
Select, Read, and Write: A Multi-Agent Framework of Full-Text-based Related Work Generation. 7009-7028 - Pratik Rakesh Singh, Kritarth Prasad, Mohammadi Zaki, Pankaj Wasnik:
Graph-Assisted Culturally Adaptable Idiomatic Translation for Indic languages. 7029-7044 - Vincent Nguyen, Sarvnaz Karimi, Willow Hallgren, Mahesh Prakash:
Question Answering in Climate Adaptation for Agriculture: Model Development and Evaluation with Expert Feedback. 7045-7075 - Xinfeng Wang, Jin Cui, Fumiyo Fukumoto, Yoshimi Suzuki:
AGRec: Adapting Autoregressive Decoders with Graph Reasoning for LLM-based Sequential Recommendation. 7076-7090 - Jin Cui, Xinfeng Wang, Yoshimi Suzuki, Fumiyo Fukumoto:
Causal Denoising Prototypical Network for Few-Shot Multi-label Aspect Category Detection. 7091-7104 - Pengzuo Wu, Yuhang Yang, Guangcheng Zhu, Chao Ye, Hong Gu, Xu Lu, Ruixuan Xiao, Bowen Bao, Yijing He, Liangyu Zha, Wentao Ye, Junbo Zhao, Haobo Wang:
RealHiTBench: A Comprehensive Realistic Hierarchical Table Benchmark for Evaluating LLM-Based Table Analysis. 7105-7137 - Zhiyang Zhang, Yaping Zhang, Yupu Liang, Zhiyuan Chen, Lu Xiang, Yang Zhao, Yu Zhou, Chengqing Zong:
A Query-Response Framework for Whole-Page Complex-Layout Document Image Translation with Relevant Regional Concentration. 7138-7149 - Junjia Du, Yadi Liu, Hongcheng Guo, Jiawei Wang, Haojian Huang, Yunyi Ni, Zhoujun Li:
DependEval: Benchmarking LLMs for Repository Dependency Understanding. 7150-7179 - Xu Zhang, Kun Zhang, Wenxin Ma, Rongsheng Wang, Chenxu Wu, Yingtai Li, S. Kevin Zhou:
A General Knowledge Injection Framework for ICD Coding. 7180-7189 - Jiahao Huo, Yibo Yan, Xu Zheng, Yuanhuiyi Lyu, Xin Zou, Zhihua Wei, Xuming Hu:
MMUnlearner: Reformulating Multimodal Machine Unlearning in the Era of Multimodal Large Language Models. 7190-7206 - Wenjian Ding, Yao Zhang, Jun Wang, Adam Jatowt, Zhenglu Yang:
Generating Questions, Answers, and Distractors for Videos: Exploring Semantic Uncertainty of Object Motions. 7207-7220 - Xuan Luo, Weizhi Wang, Xifeng Yan:
DiffSkip: Differential Layer Skipping in Large Language Models. 7221-7231 - Zihao Jiang, Ben Liu, Miao Peng, Wenjie Xu, Yao Xiao, Zhenyan Shan, Min Peng:
Towards Explainable Temporal Reasoning in Large Language Models: A Structure-Aware Generative Framework. 7232-7251 - Jinghui Lu, Haiyang Yu, Yanjie Wang, Yongjie Ye, Jingqun Tang, Ziwei Yang, Binghong Wu, Qi Liu, Hao Feng, Han Wang, Hao Liu, Can Huang:
A Bounding Box is Worth One Token - Interleaving Layout and Text in a Large Language Model for Document Understanding. 7252-7273 - Mingzhe Li, Xin Lu, Yanyan Zhao:
Self-Foveate: Enhancing Diversity and Difficulty of Synthesized Instructions from Unsupervised Text via Multi-Level Foveation. 7274-7289 - Mingyu Zheng, Zhifan Feng, Jia Wang, Lanrui Wang, Zheng Lin, Hao Yang, Weiping Wang:
TableDreamer: Progressive and Weakness-guided Data Synthesis from Scratch for Table Instruction Tuning. 7290-7315 - Nagham Hamad, Mohammed Khalilia, Mustafa Jarrar:
Konooz: Multi-domain Multi-dialect Corpus for Named Entity Recognition. 7316-7331 - Hongji Yang, Yucheng Zhou, Wencheng Han, Jianbing Shen:
Self-Rewarding Large Vision-Language Models for Optimizing Prompts in Text-to-Image Generation. 7332-7349 - Linhao Zhang, Daoguang Zan, Quanshun Yang, Zhirong Huang, Dong Chen, Bo Shen, Tianyu Liu, Yongshun Gong, Pengjie Huang, Xudong Lu, Guangtai Liang, Lizhen Cui, Qianxiang Wang:
CodeV: Issue Resolving with Visual Data. 7350-7361 - Hongbin Na, Yining Hua, Zimu Wang, Tao Shen, Beibei Yu, Lilin Wang, Wei Wang, John B. Torous, Ling Chen:
A Survey of Large Language Models in Psychotherapy: Current Landscape and Future Directions. 7362-7376 - Tao He, Hao Li, Jingchang Chen, Runxuan Liu, Yixin Cao, Lizi Liao, Zihao Zheng, Zheng Chu, Jiafeng Liang, Ming Liu, Bing Qin:
Breaking the Reasoning Barrier: A Survey on LLM Complex Reasoning through the Lens of Self-Evolution. 7377-7417 - Zhilin Wang, Yafu Li, Xiaoye Qu, Yu Cheng:
SEE: Continual Fine-tuning with Sequential Ensemble of Experts. 7418-7432 - Chi-Min Chan, Chunpu Xu, Junqi Zhu, Jiaming Ji, Donghai Hong, Pengcheng Wen, Chunyang Jiang, Zhen Ye, Yaodong Yang, Wei Xue, Sirui Han, Yike Guo:
Boosting Policy and Process Reward Models with Monte Carlo Tree Search in Open-Domain QA. 7433-7451 - Rui Hu, Delai Qiu, Shuyu Wei, Jiaming Zhang, Yining Wang, Shengping Liu, Jitao Sang:
Investigating and Enhancing Vision-Audio Capability in Omnimodal Large Language Models. 7452-7463 - Haote Yang, Xingjian Wei, Jiang Wu, Noémi Ligeti-Nagy, Jiaxing Sun, Yinfan Wang, Zijian Gyozo Yang, Junyuan Gao, Jingchao Wang, Bowen Jiang, Shasha Wang, Nanjun Yu, Zihao Zhang, Shixin Hong, Hongwei Liu, Wei Li, Songyang Zhang, Dahua Lin, Lijun Wu, Gábor Prószéky, Conghui He:
OpenHuEval: Evaluating Large Language Model on Hungarian Specifics. 7464-7520 - Sirui Huang, Yanggan Gu, Zhonghao Li, Xuming Hu, Qing Li, Guandong Xu:
StructFact: Reasoning Factual Knowledge from Structured Data with Large Language Models. 7521-7552 - Sirui Chen, Shu Yu, Shengjie Zhao, Chaochao Lu:
From Imitation to Introspection: Probing Self-Consciousness in Language Models. 7553-7583 - Mingxu Chai, Ziyu Shen, Chong Zhang, Yue Zhang, Xiao Wang, Shihan Dou, Jihua Kang, Jiazheng Zhang, Qi Zhang:
DocFusion: A Unified Framework for Document Parsing Tasks. 7584-7599 - Yue Li, Xin Yi, Dongsheng Shi, Gerard de Melo, Xiaoling Wang, Linlin Wang:
Hierarchical Safety Realignment: Lightweight Restoration of Safety in Pruned Large Vision-Language Models. 7600-7612 - Bowen Ping, Jiali Zeng, Fandong Meng, Shuo Wang, Jie Zhou, Shanghang Zhang:
LongDPO: Unlock Better Long-form Generation Abilities for LLMs via Critique-augmented Stepwise Information. 7613-7632 - Quanyu Long, Jianda Chen, Zhengyuan Liu, Nancy F. Chen, Wenya Wang, Sinno Jialin Pan:
Reinforcing Compositional Retrieval: Retrieving Step-by-Step for Composing Informative Contexts. 7633-7651 - Bofei Gao, Yejie Wang, Yibo Miao, Ruoyu Wu, Feifan Song, Longhui Yu, Tianyu Liu, Baobao Chang:
Towards A Better Initial Policy Model For Scalable Long-CoT Reinforcement Learning. 7652-7665 - Tu Vu, Manh Do, Tung Nguyen, Ngo Van Linh, Sang Dinh, Thien Huu Nguyen:
Topic Modeling for Short Texts via Optimal Transport-Based Clustering. 7666-7680 - Colin Swaelens, Ilse De Vos, Els Lefever:
Lemmatisation & Morphological Analysis of Unedited Greek: Do Simple Tasks Need Complex Solutions? 7681-7689 - Chengzhang Yu, Yiming Zhang, Zhixin Liu, Zenghui Ding, Yining Sun, Zhanpeng Jin:
FRAME: Feedback-Refined Agent Methodology for Enhancing Medical Research Insights. 7690-7704 - Xi Li, Ruofan Mao, Yusen Zhang, Renze Lou, Chen Wu, Jiaqi Wang:
Chain-of-Scrutiny: Detecting Backdoor Attacks for Large Language Models. 7705-7727 - Pavel Posokhov, Sergei Masliukhin, Skrylnikov Stepan, Danil Tirskikh, Olesia Makhnytkina:
Relevance Scores Calibration for Ranked List Truncation via TMP Adapter. 7728-7734 - Chaona Kong, Jianyi Liu, Yifan Tang, Ru Zhang:
Neuron Activation Modulation for Text Style Transfer: Guiding Large Language Models. 7735-7747 - Jingqun Tang, Qi Liu, Yongjie Ye, Jinghui Lu, Shu Wei, An-Lan Wang, Chunhui Lin, Hao Feng, Zhen Zhao, Yanjie Wang, Yuliang Liu, Hao Liu, Xiang Bai, Can Huang:
MTVQA: Benchmarking Multilingual Text-Centric Visual Question Answering. 7748-7763 - Xinyan Jiang, Hang Ye, Yongxin Zhu, Xiaoying Zheng, Zikang Chen, Jun Gong:
HICD: Hallucination-Inducing via Attention Dispersion for Contrastive Decoding to Mitigate Hallucinations in Large Language Models. 7764-7786 - Junchi Yao, Shu Yang, Jianhua Xu, Lijie Hu, Mengdi Li, Di Wang:
Understanding the Repeat Curse in Large Language Models from a Feature Perspective. 7787-7815 - Haneul Yoo, Cheonbok Park, Sangdoo Yun, Alice Oh, Hwaran Lee:
Code-Switching Curriculum Learning for Multilingual Transfer in LLMs. 7816-7836 - Yang Yao, Xuan Tong, Ruofan Wang, Yixu Wang, Lujundong Li, Liang Liu, Yan Teng, Yingchun Wang:
A Mousetrap: Fooling Large Reasoning Models for Jailbreak with Chain of Iterative Chaos. 7837-7855 - Yixuan Wang, Shiqi Zhou, Chuanzhe Guo, Qingfu Zhu:
Tag-Evol: Achieving Efficient Instruction Evolving via Tag Injection. 7856-7869 - Yao Huang, Yitong Sun, Shouwei Ruan, Yichi Zhang, Yinpeng Dong, Xingxing Wei:
Breaking the Ceiling: Exploring the Potential of Jailbreak Attacks through Expanding Strategy Space. 7870-7888 - Enzo Doyen, Amalia Todirascu:
GeNRe: A French Gender-Neutral Rewriting System Using Collective Nouns. 7889-7909 - Christian Jaumann, Andreas Wiedholz, Annemarie Friedrich:
LGAR: Zero-Shot LLM-Guided Neural Ranking for Abstract Screening in Systematic Literature Reviews. 7910-7927 - Ehud Malul, Oriel Perets, Ziv Mor, Yigal Kassel, Elior Sulem:
LCHAIM - Investigating Long Context Reasoning in Hebrew. 7928-7939 - Jiayuan Li, Lei Cui, Sen Zhao, Yun Yang, Lun Li, Hongsong Zhu:
CLeVeR: Multi-modal Contrastive Learning for Vulnerability Code Representation. 7940-7951 - Zilu Dong, Xiangqing Shen, Rui Xia:
MEMIT-Merge: Addressing MEMIT's Key-Value Conflicts in Same-Subject Batch Editing for LLMs. 7952-7960 - Qin Chen, Yuanyi Ren, Xiaojun Ma, Yuyang Shi:
Large Language Models for Predictive Analysis: How Far Are They? 7961-7978 - Xiaoxue Cheng, Junyi Li, Xin Zhao, Ji-Rong Wen:
Think More, Hallucinate Less: Mitigating Hallucinations via Dual Process of Fast and Slow Thinking. 7979-7990 - Qitao Qin, Yucong Luo, Yihang Lu, Zhibo Chu, Xiaoman Liu, Xianwei Meng:
Towards Adaptive Memory-Based Optimization for Enhanced Retrieval-Augmented Generation. 7991-8004 - Yijie Chen, Yijin Liu, Fandong Meng, Yufeng Chen, Jinan Xu, Jie Zhou:
Enhancing Cross-Tokenizer Knowledge Distillation with Contextual Dynamical Mapping. 8005-8018 - Jian Gu, Aldeida Aleti, Chunyang Chen, Hongyu Zhang:
A Semantic-Aware Layer-Freezing Approach to Computation-Efficient Fine-Tuning of Language Models. 8019-8033 - Lingxiao Wei, He Yan, Xiangju Lu, Junmin Zhu, Jun Wang, Wei Zhang:
CNNSum: Exploring Long-Context Summarization with Large Language Models in Chinese Novels. 8034-8062 - Zhitong Wang, Cheng Gao, Chaojun Xiao, Yufei Huang, Shuzheng Si, Kangyang Luo, Yuzhuo Bai, Wenhao Li, Tangjian Duan, Chuancheng Lv, Guoshan Lu, Gang Chen, Fanchao Qi, Maosong Sun:
Document Segmentation Matters for Retrieval-Augmented Generation. 8063-8075 - Xunzhi Wang, Zhuowei Zhang, Gaonan Chen, Qiongyu Li, Bitong Luo, Zhixin Han, Haotian Wang, Zhiyu Li, Hang Gao, Mengting Hu:
UBench: Benchmarking Uncertainty in Large Language Models with Multiple Choice Questions. 8076-8107 - Yusheng Zhao, Xiao Luo, Haomin Wen, Zhiping Xiao, Wei Ju, Ming Zhang:
Embracing Large Language Models in Traffic Flow Forecasting. 8108-8123 - Mengliang He, Jiayi Zeng, Yankai Jiang, Wei Zhang, Zeming Liu, Xiaoming Shi, Aimin Zhou:
Flow2Code: Evaluating Large Language Models for Flowchart-based Code Generation Capability. 8124-8146 - Romain Storaï, Jaeseong Lee, Seung-won Hwang:
Smarter, Not Harder: Training-Free Adaptive Computation for Transformers. 8147-8155 - Zhenhe Wu, Zhongqiu Li, Jie Zhang, Zhongjiang He, Jian Yang, Yu Zhao, Ruiyu Fang, Bing Wang, Hongyan Xie, Shuangyong Song, Zhoujun Li:
UCS-SQL: Uniting Content and Structure for Enhanced Semantic Bridging In Text-to-SQL. 8156-8168 - Qingyao Li, Xinyi Dai, Xiangyang Li, Weinan Zhang, Yasheng Wang, Ruiming Tang, Yong Yu:
CodePRM: Execution Feedback-enhanced Process Reward Model for Code Generation. 8169-8182 - Jiaru Zou, Qing Wang, Pratyush Thakur, Nickvash Kani:
STEM-POM: Evaluating Language Models Math-Symbol Reasoning in Document Parsing. 8183-8199 - Jihoon Lee, Min Song:
Retrieval Visual Contrastive Decoding to Mitigate Object Hallucinations in Large Vision-Language Models. 8200-8219 - Pramit Bhattacharyya, Arnab Bhattacharya:
Leveraging LLMs for Bangla Grammar Error Correction: Error Categorization, Synthetic Data, and Model Evaluation. 8220-8239 - Yimiao Qiu, Yang Deng, Quanming Yao, Zhimeng Zhang, Zhiang Dong, Chang Yao, Jingyuan Chen:
Think Both Ways: Teacher-Student Bidirectional Reasoning Enhances MCQ Generation and Distractor Quality. 8240-8253 - Haonan Chen, Liang Wang, Nan Yang, Yutao Zhu, Ziliang Zhao, Furu Wei, Zhicheng Dou:
mmE5: Improving Multimodal Multilingual Embeddings via High-quality Synthetic Data. 8254-8275 - Jeonghwan Choi, Minjeong Ban, Minseok Kim, Hwanjun Song:
Word2Passage: Word-level Importance Re-weighting for Query Expansion. 8276-8296 - Yangbo Wei, Zhen Huang, Fangzhou Zhao, Qi Feng, Wei W. Xing:
MECoT: Markov Emotional Chain-of-Thought for Personality-Consistent Role-Playing. 8297-8314 - Yuan Sui, Yufei He, Nian Liu, Xiaoxin He, Kun Wang, Bryan Hooi:
FiDeLiS: Faithful Reasoning in Large Language Models for Knowledge Graph Question Answering. 8315-8330 - Jingwen Cheng, Kshitish Ghate, Wenyue Hua, William Yang Wang, Hong Shen, Fei Fang:
REALM: A Dataset of Real-World LLM Use Cases. 8331-8341 - Tommaso Green, Félix Gaschi, Fabian David Schmidt, Simone Paolo Ponzetto, Goran Glavaš:
BABELEDITS: A Benchmark and a Modular Approach for Robust Cross-lingual Knowledge Editing of Large Language Models. 8342-8369 - Haokun Zhao, Jinyi Han, Jiaqing Liang, Yanghua Xiao, Xiaojun Meng, Jiansheng Wei:
CDS: Data Synthesis Method Guided by Cognitive Diagnosis Theory. 8370-8393 - Xuetao Ma, Wenbin Jiang, Hua Huang:
Problem-Solving Logic Guided Curriculum In-Context Learning for LLMs Complex Reasoning. 8394-8412 - Dipankar Srirag, Aditya Joshi, Jordan Painter, Diptesh Kanojia:
BESSTIE: A Benchmark for Sentiment and Sarcasm Classification for Varieties of English. 8413-8429 - Zihan Wang, Yaohui Zhu, Gim Hee Lee, Yachun Fan:
NavRAG: Generating User Demand Instructions for Embodied Navigation through Retrieval-Augmented LLM. 8430-8440 - Yu Guo, Dong Jin, Shenghao Ye, Shuangwu Chen, Jian Yang, Xiaobin Tan:
SQLForge: Synthesizing Reliable and Diverse Data to Enhance Text-to-SQL Reasoning in LLMs. 8441-8452 - Jiachen Zhu, Congmin Zheng, Jianghao Lin, Kounianhua Du, Ying Wen, Yong Yu, Jun Wang, Weinan Zhang:
Retrieval-Augmented Process Reward Model for Generalizable Mathematical Reasoning. 8453-8468 - Maike Züfle, Jan Niehues:
Contrastive Learning for Task-Independent SpeechLLM-Pretraining. 8469-8490 - Qirui Zhou, Shaohui Peng, Weiqiang Xiong, Haixin Chen, Yuanbo Wen, Haochen Li, Ling Li, Qi Guo, Yongwei Zhao, Ke Gao, Ruizhi Chen, Yanjun Wu, Zhao Chen, Yunji Chen:
QiMeng-Attention: SOTA Attention Operator is generated by SOTA Attention Algorithm. 8491-8505 - Yuechi Zhou, Chuyue Zhou, Jianxin Zhang, Juntao Li, Min Zhang:
ALW: Adaptive Layer-Wise contrastive decoding enhancing reasoning ability in Large Language Models. 8506-8524 - Xinlong Chen, Yuanxing Zhang, Qiang Liu, Junfei Wu, Fuzheng Zhang, Tieniu Tan:
Mixture of Decoding: An Attention-Inspired Adaptive Decoding Strategy to Mitigate Hallucinations in Large Vision-Language Models. 8525-8542 - Xinlong Chen, Yuanxing Zhang, Chongling Rao, Yushuo Guan, Jiaheng Liu, Fuzheng Zhang, Chengru Song, Qiang Liu, Di Zhang, Tieniu Tan:
VidCapBench: A Comprehensive Benchmark of Video Captioning for Controllable Text-to-Video Generation. 8543-8563 - Chuan Gou, Bangwei Li, Jianhua Dai, Xiaoyang Han, Ming Cai:
Mitigating Demonstration Bias through Global Coevolutionary Reasoning. 8564-8578 - Abderrahmane Issam, Yusuf Can Semerci, Jan Scholtes, Gerasimos Spanakis:
A Representation Level Analysis of NMT Model Robustness to Grammatical Errors. 8579-8601 - Han Lin, Xiu Tang, Huan Li, Wenxue Cao, Sai Wu, Chang Yao, Lidan Shou, Gang Chen:
T²DR: A Two-Tier Deficiency-Resistant Framework for Incomplete Multimodal Learning. 8602-8616 - Shixin Jiang, Jiafeng Liang, Jiyuan Wang, Xuan Dong, Heng Chang, Weijiang Yu, Jinhua Du, Ming Liu, Bing Qin:
From Specific-MLLMs to Omni-MLLMs: A Survey on MLLMs Aligned with Multi-modalities. 8617-8652 - Verena Blaschke, Masha Fedzechkina, Maartje ter Hoeve:
Analyzing the Effect of Linguistic Similarity on Cross-Lingual Transfer: Tasks and Experimental Setups Matter. 8653-8684 - Kristina Kobrock, Xenia Ohmer, Elia Bruni, Nicole Gotzner:
Agents generalize to novel levels of abstraction by using adaptive linguistic strategies. 8685-8699 - Dan Wang, Boxi Cao, Ning Bian, Xuanang Chen, Yaojie Lu, Hongyu Lin, Jia Zheng, Le Sun, Shanshan Jiang, Bin Dong, Xianpei Han:
The Linguistic Connectivities Within Large Language Models. 8700-8714 - Zhihan Zhang, Yixin Cao, Lizi Liao:
XFinBench: Benchmarking LLMs in Complex Financial Problem Solving and Reasoning. 8715-8758 - Hongzhe Huang, Jiang Liu, Zhewen Yu, Li Cai, Dian Jiao, Wenqiao Zhang, Siliang Tang, Juncheng Li, Hao Jiang, Haoyuan Li, Yueting Zhuang:
Align²LLaVA: Cascaded Human and Large Language Model Preference Alignment for Multi-modal Instruction Curation. 8759-8781 - Siqing Song, Chuang Wang, Rui-Qi Wang, Yi Yang, Xu-Yao Zhang:
Achieving binary weight and activation for LLMs using Post-Training Quantization. 8782-8795 - Wei Sun, Tingyu Qu, Mingxiao Li, Jesse Davis, Marie-Francine Moens:
Mitigating Negative Interference in Multilingual Knowledge Editing through Null-Space Constraints. 8796-8810 - Wenjing Xie, Xiaobo Liang, Juntao Li, Wanfu Wang, Kehai Chen, Qiaoming Zhu, Min Zhang:
From Awareness to Adaptability: Enhancing Tool Utilization for Scientific Reasoning. 8811-8831 - Qi Liu, Jingqing Ruan, Hao Li, Haodong Zhao, Desheng Wang, Jiansong Chen, Guanglu Wan, Xunliang Cai, Zhi Zheng, Tong Xu:
AMoPO: Adaptive Multi-objective Preference Optimization without Reward Models and Reference Models. 8832-8866 - Junjie Zhang, Rushuai Yang, Shunyu Liu, Ting-En Lin, Fei Huang, Yi Chen, Yongbin Li, Dacheng Tao:
Supervised Optimism Correction: Be Confident When LLMs Are Sure. 8867-8880 - Huaijie Wang, Shibo Hao, Hanze Dong, Shenao Zhang, Yilin Bao, Ziran Yang, Yi Wu:
Offline Reinforcement Learning for LLM Multi-step Reasoning. 8881-8893 - Masahiro Kaneko, Youmi Ma, Yuki Wata, Naoaki Okazaki:
Sampling-based Pseudo-Likelihood for Membership Inference Attacks. 8894-8907 - Chengyou Jia, Minnan Luo, Zhuohang Dang, Qiushi Sun, Fangzhi Xu, Junlin Hu, Tianbao Xie, Zhiyong Wu:
AgentStore: Scalable Integration of Heterogeneous Agents As Specialized Generalist Computer Assistant. 8908-8934 - Xin-Cheng Wen, Yijun Yang, Cuiyun Gao, Yang Xiao, Deheng Ye:
Boosting Vulnerability Detection of LLMs via Curriculum Preference Optimization with Synthetic Reasoning Data. 8935-8949 - Yunyao Zhang, Zikai Song, Hang Zhou, Wenfeng Ren, Yi-Ping Phoebe Chen, Junqing Yu, Wei Yang:
GA-S³: Comprehensive Social Network Simulation with Group Agents. 8950-8970 - Kaijie Jiao, Quan Wang, Licheng Zhang, Zikang Guo, Zhendong Mao:
M-RangeDetector: Enhancing Generalization in Machine-Generated Text Detection through Multi-Range Attention Masks. 8971-8983 - Heeseung Kim, Che Hyun Lee, Sangkwon Park, Jiheum Yeom, Nohil Park, Sangwon Yu, Sungroh Yoon:
Does Your Voice Assistant Remember? Analyzing Conversational Context Recall and Utilization in Voice Interaction Models. 8984-9014 - Wangyun Gu, Qianghua Gao, Li-Xin Zhang, Xu Shen, Jieping Ye:
NeuronMerge: Merging Models via Functional Neuron Groups. 9015-9037 - Xiaoyuan Li, Moxin Li, Rui Men, Yichang Zhang, Keqin Bao, Wenjie Wang, Fuli Feng, Dayiheng Liu, Junyang Lin:
HellaSwag-Pro: A Large-Scale Bilingual Benchmark for Evaluating the Robustness of LLMs in Commonsense Reasoning. 9038-9072 - Hao Xiang, Bowen Yu, Hongyu Lin, Keming Lu, Yaojie Lu, Xianpei Han, Ben He, Le Sun, Jingren Zhou, Junyang Lin:
Self-Steering Optimization: Autonomous Preference Optimization for Large Language Models. 9073-9085 - King Zhu, Qianbo Zang, Shian Jia, Siwei Wu, Feiteng Fang, Yizhi Li, Shuyue Guo, Tianyu Zheng, Jiawei Guo, Bo Li, Haoning Wu, Xingwei Qu, Jian Yang, Ruibo Liu, Xiang Yue, Jiaheng Liu, Chenghua Lin, Hamid Alinejad-Rokny, Min Yang, Shiwen Ni, Wenhao Huang, Ge Zhang:
LIME: Less Is More for MLLM Evaluation. 9086-9121 - Xiaofeng Zhou, Heyan Huang, Lizi Liao:
Debate, Reflect, and Distill: Multi-Agent Feedback with Tree-Structured Preference Optimization for Efficient Language Model Enhancement. 9122-9137 - Hong Yi Lin, Chunhua Liu, Haoyu Gao, Patanamon Thongtanunam, Christoph Treude:
CodeReviewQA: The Code Review Comprehension Assessment for Large Language Models. 9138-9166 - Yulia Otmakhova, Lea Frermann:
Narrative Media Framing in Political Discourse. 9167-9196 - Yishuo Cai, Renjie Gu, Jiaxu Li, Xuancheng Huang, Junzhe Chen, Xiaotao Gu, Minlie Huang:
MHALO: Evaluating MLLMs as Fine-grained Hallucination Detectors. 9197-9222 - Barbara Scalvini, Alireza Mashaghi:
Semantic Topology: a New Perspective for Communication Style Characterization. 9223-9233 - Xiaoyu Li, Haoran Shi, Zengyi Yu, Yukun Tu, Chanjin Zheng:
Decoding LLM Personality Measurement: Forced-Choice vs. Likert. 9234-9247 - Koki Horiguchi, Tomoyuki Kajiwara, Takashi Ninomiya, Shoko Wakamiya, Eiji Aramaki:
MultiMSD: A Corpus for Multilingual Medical Text Simplification from Online Medical References. 9248-9258 - Ruyi Zhang, Songlei Jian, Yusong Tan, Heng Gao, Haifang Zhou, Kai Lu:
BadWindtunnel: Defending Backdoor in High-noise Simulated Training with Confidence Variance. 9259-9273 - Yue Gao, Jing Zhao, Shiliang Sun, Xiaosong Qiao, Tengfei Song, Hao Yang:
Multimodal Machine Translation with Text-Image In-depth Questioning. 9274-9287 - Xiaozhuang Song, Shufei Zhang, Tianshu Yu:
ReKG-MCTS: Reinforcing LLM Reasoning on Knowledge Graphs via Training-Free Monte Carlo Tree Search. 9288-9306 - Aziguli Wulamu, Lyu Zhengyu, Kaiyuan Gong, Yu Han, Zewen Wang, Zhihong Zhu, Bowen Xing:
HTML: Hierarchical Topology Multi-task Learning for Semantic Parsing in Knowledge Base Question Answering. 9307-9321 - Jinnan Li, Jinzhe Li, Yue Wang, Yi Chang, Yuan Wu:
StructFlowBench: A Structured Flow Benchmark for Multi-turn Instruction Following. 9322-9341 - Fanxiao Li, Jiaying Wu, Canyuan He, Wei Zhou:
CMIE: Combining MLLM Insights with External Evidence for Explainable Out-of-Context Misinformation Detection. 9342-9354 - Ashutosh Dwivedi, Siddhant Singh, Ashutosh Modi:
EtiCor++: Towards Understanding Etiquettical Bias in LLMs. 9355-9376 - Yuanjian Xu, Jianing Hao, Kunsheng Tang, Jingnan Chen, Anxian Liu, Peng Liu, Guang Zhang:
FinRipple: Aligning Large Language Models with Financial Market for Event Ripple Effect Awareness. 9377-9398 - Yingfeng Luo, Tong Zheng, Yongyu Mu, Bei Li, Qinghong Zhang, Yongqi Gao, Ziqiang Xu, Peinan Feng, Xiaoqian Liu, Tong Xiao, Jingbo Zhu:
Beyond Decoder-only: Large Language Models Can be Good Encoders for Machine Translation. 9399-9431 - Nopporn Lekuthai, Nattawit Pewngam, Supitcha Sokrai, Titipat Achakulvisut:
EC-RAFT: Automated Generation of Clinical Trial Eligibility Criteria through Retrieval-Augmented Fine-Tuning. 9432-9444 - Elena Stringli, Maria Lymperaiou, Giorgos Filandrianos, Athanasios Voulodimos, Giorgos Stamou:
Pitfalls of Scale: Investigating the Inverse Task of Redefinition in Large Language Models. 9445-9469 - Tianhe Lin, Jian Xie, Siyu Yuan, Deqing Yang:
Implicit Reasoning in Transformers is Reasoning through Shortcuts. 9470-9487 - Kaishuai Xu, Tiezheng Yu, Yi Cheng, Wenjun Hou, Liangyou Li, Xin Jiang, Lifeng Shang, Qun Liu, Wenjie Li:
Learning to Align Multi-Faceted Evaluation: A Unified and Robust Framework. 9488-9502 - Yiliu Sun, Zicheng Zhao, Sheng Wan, Chen Gong:
CortexDebate: Debating Sparsely and Equally for Multi-Agent Debate. 9503-9523 - Valentin Knappich, Anna Hätty, Simon Razniewski, Annemarie Friedrich:
PAP2PAT: Benchmarking Outline-Guided Long-Text Patent Generation with Patent-Paper Pairs. 9524-9554 - Xiaofeng Wang, Zhixin Zhang, Jin Guang Zheng, Yiming Ai, Rui Wang:
Debt Collection Negotiations with Large Language Models: An Evaluation System and Optimizing Decision Making with Multi-Agent. 9555-9577 - Kechi Zhang, Ge Li, Jia Li, Yihong Dong, Zhi Jin:
Focused-DPO: Enhancing Code Generation Through Focused Preference Optimization on Error-Prone Points. 9578-9591 - Elke Vandermeerschen, Miryam de Lhoneux:
Supervised and Unsupervised Probing of Shortcut Learning: Case Study on the Emergence and Evolution of Syntactic Heuristics in BERT. 9592-9604 - Florian Schneider, Carolin Holtermann, Chris Biemann, Anne Lauscher:
GIMMICK: Globally Inclusive Multimodal Multitask Cultural Knowledge Benchmarking. 9605-9668 - Joonhyung Park, Peng Tang, Sagnik Das, Srikar Appalaraju, Kunwar Yashraj Singh, R. Manmatha, Shabnam Ghadar:
R-VLM: Region-Aware Vision Language Model for Precise GUI Grounding. 9669-9685 - Xiaolong Wang, Yuanchi Zhang, Ziyue Wang, Yuzhuang Xu, Fuwen Luo, Yile Wang, Peng Li, Yang Liu:
Perspective Transition of Large Language Models for Solving Subjective Tasks. 9686-9704 - Kaimin Wang, Yuanzhe Shen, Changze Lv, Xiaoqing Zheng, Xuanjing Huang:
TripTailor: A Real-World Benchmark for Personalized Travel Planning. 9705-9723 - Florian Babl, Moritz Hennen, Jakob Murauer, Michaela Geierhos:
Random Splitting Negatively Impacts NER Evaluation: Quantifying and Eliminating the Overestimation of NER Performance. 9724-9738 - Lingwei Wei, Dou Hu, Wei Zhou, Philip S. Yu, Songlin Hu:
Structure-adaptive Adversarial Contrastive Learning for Multi-Domain Fake News Detection. 9739-9752 - Zhiting Fan, Ruizhe Chen, Zuozhu Liu:
BiasGuard: A Reasoning-Enhanced Bias Detection Tool for Large Language Models. 9753-9764 - Maiya Goloburda, Nurkhan Laiyk, Diana Turmakhan, Yuxia Wang, Mukhammed Togmanov, Jonibek Mansurov, Askhat Sametov, Nurdaulet Mukhituly, Minghan Wang, Daniil Orel, Zain Muhammad Mujahid, Fajri Koto, Timothy Baldwin, Preslav Nakov:
Qorǵau: Evaluating Safety in Kazakh-Russian Bilingual Contexts. 9765-9784 - Linjie Mu, Zhongzhen Huang, Shengqian Qin, Yakun Zhu, Shaoting Zhang, Xiaofan Zhang:
MMXU: A Multi-Modal and Multi-X-ray Understanding Dataset for Disease Progression. 9785-9803 - Ziyi Ni, Yifan Li, Ning Yang, Dou Shen, Pin Lyu, Daxiang Dong:
Tree-of-Code: A Self-Growing Tree Framework for End-to-End Code Generation and Execution in Complex Tasks. 9804-9819 - David Sasu, Zehui Wu, Ziwei Gong, Run Chen, Pengyuan Shi, Lin Ai, Julia Hirschberg, Natalie Schluter:
Akan Cinematic Emotions (ACE): A Multimodal Multi-party Dataset for Emotion Recognition in Movie Dialogues. 9820-9831 - Kaiyang Wan, Honglin Mu, Rui Hao, Haoran Luo, Tianle Gu, Xiuying Chen:
A Cognitive Writing Perspective for Constrained Long-Form Text Generation. 9832-9844 - You Li, Heyu Huang, Chi Chen, Kaiyu Huang, Chao Huang, Zonghao Guo, Zhiyuan Liu, Jinan Xu, Yuhua Li, Ruixuan Li, Maosong Sun:
Migician: Revealing the Magic of Free-Form Multi-Image Grounding in Multimodal Large Language Models. 9845-9867 - Shivam Adarsh, Kumar Shridhar, Caglar Gulcehre, Nicholas Monath, Mrinmaya Sachan:
SIKeD: Self-guided Iterative Knowledge Distillation for Mathematical Reasoning. 9868-9880 - Xikang Yang, Biyu Zhou, Xuehai Tang, Jizhong Han, Songlin Hu:
Chain of Attack: Hide Your Intention through Multi-Turn Interrogation. 9881-9901 - Yicheng Chen, Yining Li, Kai Hu, Zerun Ma, Haochen Ye, Kai Chen:
MIG: Automatic Data Selection for Instruction Tuning by Maximizing Information Gain in Semantic Space. 9902-9915 - Yongchan Chun, Minhyuk Kim, Dongjun Kim, Chanjun Park, Heuiseok Lim:
Enhancing Automatic Term Extraction with Large Language Models via Syntactic Retrieval. 9916-9926 - Linhai Zhang, Ziyang Gao, Deyu Zhou, Yulan He:
Explainable Depression Detection in Clinical Interviews with Personalized Retrieval-Augmented Generation. 9927-9944 - Zheheng Luo, Chenhan Yuan, Qianqian Xie, Sophia Ananiadou:
EMPEC: A Comprehensive Benchmark for Evaluating Large Language Models Across Diverse Healthcare Professions. 9945-9958 - Fanzeng Xia, Hao Liu, Yisong Yue, Tongxin Li:
Beyond Numeric Rewards: In-Context Dueling Bandits with LLM Agents. 9959-9988 - Hyunbin Jin, Je Won Yeom, Seunghyun Bae, Taesup Kim:
"Well, Keep Thinking": Enhancing LLM Reasoning with Adaptive Injection Decoding. 9989-10018 - Xiangyu Zhang, Hexin Liu, Qiquan Zhang, Beena Ahmed, Julien Epps:
SpeechT-RAG: Reliable Depression Detection in LLMs with Retrieval-Augmented Generation Using Speech Timing Information. 10019-10030 - Jingxuan Han, Zhendong Mao, Yi Liu, Yexuan Che, Zheren Fu, Quan Wang:
Fine-grained Knowledge Enhancement for Retrieval-Augmented Generation. 10031-10044 - Chengkun Cai, Haoliang Liu, Xu Zhao, Zhongyu Jiang, Tianfang Zhang, Zongkai Wu, John Lee, Jenq-Neng Hwang, Lei Li:
Bayesian Optimization for Controlled Image Editing via LLMs. 10045-10056 - Francesco Cazzaro, Justin Kleindienst, Sofia Márquez Gomez, Ariadna Quattoni:
SPOT: Zero-Shot Semantic Parsing Over Property Graphs. 10057-10073 - Geonhee Kim, Marco Valentino, André Freitas:
Reasoning Circuits in Language Models: A Mechanistic Interpretation of Syllogistic Inference. 10074-10095 - Maodong Li, Longyin Zhang, Fang Kong:
Multi-Hop Question Generation via Dual-Perspective Keyword Guidance. 10096-10112 - Harsh Bihany, Shubham Patel, Ashutosh Modi:
LoRMA: Low-Rank Multiplicative Adaptation for LLMs. 10113-10133 - Linghao Zhang, Junhao Wang, Shilin He, Chaoyun Zhang, Yu Kang, Bowen Li, Jiaheng Wen, Chengxing Xie, Maoquan Wang, Yufan Huang, Elsie Nallipogu, Qingwei Lin, Yingnong Dang, Saravan Rajmohan, Dongmei Zhang, Qi Zhang:
DI-BENCH: Benchmarking Large Language Models on Dependency Inference with Testable Repositories at Scale. 10134-10153 - Yunfan Xie, Lixin Zou, Dan Luo, Min Tang, Chenliang Li:
Weak-to-Strong Honesty Alignment via Learning-to-Rank Supervision. 10154-10168 - Mohammadamin Shafiei, Hamidreza Saffari, Nafise Sadat Moosavi:
MultiHoax: A Dataset of Multi-hop False-premise questions. 10169-10187 - Jinming Zhang, Yunfei Long:
Learning to Play Like Humans: A Framework for LLM Adaptation in Interactive Fiction Games. 10188-10205 - Zewen Bai, Liang Yang, Shengdi Yin, Junyu Lu, Jingjie Zeng, Haohao Zhu, Yuanyuan Sun, Hongfei Lin:
STATE ToxiCN: A Benchmark for Span-level Target-Aware Toxicity Extraction in Chinese Hate Speech Detection. 10206-10219 - Yifan Niu, Miao Peng, Nuo Chen, Yatao Bian, Tingyang Xu, Jia Li:
RelEdit: Evaluating Conceptual Knowledge Editing in Language Models via Relational Reasoning. 10220-10238 - Yonghua Hei, Yibo Yan, Shuliang Liu, Huiyu Zhou, Linfeng Zhang, Xuming Hu:
Unlocking Speech Instruction Data Potential with Query Rewriting. 10239-10260 - Tianle Gu, Kexin Huang, Ruilin Luo, Yuanqi Yao, Xiuying Chen, Yujiu Yang, Yan Teng, Yingchun Wang:
From Evasion to Concealment: Stealthy Knowledge Unlearning for LLMs. 10261-10279 - Baolong Bi, Shaohan Huang, Yiwei Wang, Tianchi Yang, Zihan Zhang, Haizhen Huang, Lingrui Mei, Junfeng Fang, Zehao Li, Furu Wei, Weiwei Deng, Feng Sun, Qi Zhang, Shenghua Liu:
Context-DPO: Aligning Language Models for Context-Faithfulness. 10280-10300 - Xiachong Feng, Longxu Dou, Lingpeng Kong:
Reasoning Does Not Necessarily Improve Role-Playing Ability. 10301-10314 - Xiaokang Zhang, Sijia Luo, Bohan Zhang, Zeyao Ma, Jing Zhang, Yang Li, Guanlin Li, Zijun Yao, Kangli Xu, Jinchang Zhou, Daniel Zhang-Li, Jifan Yu, Shu Zhao, Juanzi Li, Jie Tang:
TableLLM: Enabling Tabular Data Manipulation by LLMs in Real Office Usage Scenarios. 10315-10344 - Wenxuan Wang, Zizhan Ma, Zheng Wang, Chenghan Wu, Jiaming Ji, Wenting Chen, Xiang Li, Yixuan Yuan:
A Survey of LLM-based Agents in Medicine: How far are we from Baymax? 10345-10359 - Haewon Park, Gyubin Choi, Minjun Kim, Yohan Jo:
Context-Robust Knowledge Editing for Language Models. 10360-10385 - Zhuoyun Du, Chen Qian, Wei Liu, Zihao Xie, Yifei Wang, Rennai Qiu, Yufan Dang, Weize Chen, Cheng Yang, Ye Tian, Xuantang Xiong, Lei Han:
Multi-Agent Collaboration via Cross-Team Orchestration. 10386-10406 - William Soto Martinez, Yannick Parmentier, Claire Gardent:
Semantic Evaluation of Multilingual Data-to-Text Generation via NLI Fine-Tuning: Precision, Recall and F1 scores. 10407-10427 - Kidist Amde Mekonnen, Yosef Worku Alemneh, Maarten de Rijke:
Optimized Text Embedding Models and Benchmarks for Amharic Passage Retrieval. 10428-10445 - Yue Fang, Zhi Jin, Jie An, Hongshen Chen, Xiaohong Chen, Naijun Zhan:
Enhancing Transformation from Natural Language to Signal Temporal Logic Using LLMs with Diverse External Knowledge. 10446-10458 - Puli Chen, Cheng Yang, Xingmao Zhang, Qingbao Huang:
DAGS: A Dependency-Based Dual-Attention and Global Semantic Improvement Framework for Metaphor Recognition. 10459-10476 - Xiaofan Bai, Pingyi Hu, Xiaojing Ma, Linchen Yu, Dongmei Zhang, Qi Zhang, Bin Benjamin Zhu:
ESF: Efficient Sensitive Fingerprinting for Black-Box Tamper Detection of Large Language Models. 10477-10494 - Zhenru Zhang, Chujie Zheng, Yangzhen Wu, Beichen Zhang, Runji Lin, Bowen Yu, Dayiheng Liu, Jingren Zhou, Junyang Lin:
The Lessons of Developing Process Reward Models in Mathematical Reasoning. 10495-10516 - Yongqi Fan, Yating Wang, Guandong Wang, Jie Zhai, Jingping Liu, Qi Ye, Tong Ruan:
MinosEval: Distinguishing Factoid and Non-Factoid for Tailored Open-Ended QA Evaluation with LLMs. 10517-10548 - Osman Alperen Koras, Rabi Bahnan, Jens Kleesiek, Amin Dada:
Towards Conditioning Clinical Text Generation for User Control. 10549-10569 - Daniil Orel, Dilshod Azizov, Preslav Nakov:
CoDet-M4: Detecting Machine-Generated Code in Multi-Lingual, Multi-Generator and Multi-Domain Settings. 10570-10593 - Tianqi Chen, Yuanteng Chen, Peisong Wang, Weixiang Xu, Zeyu Zhu, Jian Cheng:
Q-Mamba: Towards more efficient Mamba models via post-training quantization. 10594-10610 - Kaiwen Wei, Jie Yao, Jiang Zhong, Yangyang Kang, Jingyuan Zhang, Changlong Sun, Xin Zhang, Fengmao Lv, Li Jin:
P²Net: Parallel Pointer-based Network for Key Information Extraction with Complex Layouts. 10611-10626 - Liyang He, Chenglong Liu, Rui Li, Zhenya Huang, Shulan Ruan, Jun Zhou, Enhong Chen:
Refining Sentence Embedding Model through Ranking Sentences Generation with Large Language Models. 10627-10643 - Tianqi Chen, Peisong Wang, Weixiang Xu, Zeyu Zhu, Jian Cheng:
RQT: Hierarchical Residual Quantization for Multi-Model Compression. 10644-10660 - Stefanie Urchs, Veronika Thurner, Matthias Aßenmacher, Christian Heumann, Stephanie Thiemichen:
taz2024full: Analysing German Newspapers for Gender Bias and Discrimination across Decades. 10661-10671 - Marta R. Costa-jussà, Pierre Andrews, Mariano Coria Meglioli, Joy Chen, Joe Chuang, David Dale, Christophe Ropers, Alexandre Mourachko, Eduardo Sánchez, Holger Schwenk, Tuan Tran, Arina Turkatenko, Carleigh Wood:
LCFO: Long Context and Long Form Output Dataset and Benchmarking. 10672-10700 - Yang Hou, Zhenghua Li:
Span-based Semantic Role Labeling as Lexicalized Constituency Tree Parsing. 10701-10713 - Chanhwi Kim, Hyunjae Kim, Sihyeon Park, Jiwoo Lee, Mujeen Sung, Jaewoo Kang:
Learning from Negative Samples in Biomedical Generative Entity Linking. 10714-10730 - Tautvydas Misiunas, Hassan Mansoor, Jasper Uijlings, Oriana Riva, Victor Carbune:
Self-play through Computational Runtimes improves Chart Reasoning. 10731-10746 - Jiachun Li, Pengfei Cao, Yubo Chen, Jiexin Xu, Huaijun Li, Xiaojian Jiang, Kang Liu, Jun Zhao:
Towards Better Chain-of-Thought: A Reflection on Effectiveness and Faithfulness. 10747-10765 - Sinan Kurtyigit, Diego Frassinelli, Carina Silberer, Sabine Schulte im Walde:
A Couch Potato is not a Potato on a Couch: Prompting Strategies, Image Generation, and Compositionality Prediction for Noun Compounds. 10766-10776 - Beiduo Chen, Siyao Peng, Anna Korhonen, Barbara Plank:
A Rose by Any Other Name: LLM-Generated Explanations Are Good Proxies for Human Explanations to Collect Label Distributions on NLI. 10777-10802 - Angelina Parfenova, Jürgen Pfeffer:
Measuring What Matters: Evaluating Ensemble LLMs with Label Refinement in Inductive Coding. 10803-10816 - Cong Gao, Bo Zhang, Linkang Yang, Minghao Hu, Zhunchen Luo, Xiaoying Bai, Guotong Geng, Jun Zhang, Yunhua Xue:
Dynamic Evil Score-Guided Decoding: An Efficient Decoding Framework For Red-Team Model. 10817-10833 - Divyaksh Shukla, Ritesh Baviskar, Dwijesh Gohil, Aniket Tiwari, Atul Shree, Ashutosh Modi:
CoMuMDR: Code-mixed Multi-modal Multi-domain corpus for Discourse paRsing in conversations. 10834-10849 - Chris W. Jenkins, Filip Miletic, Sabine Schulte im Walde:
Multi-word Measures: Modeling Semantic Change in Compound Nouns. 10850-10864 - Jipeng Zhang, Jianshu Zhang, Yuanzhe Li, Renjie Pi, Rui Pan, Runtao Liu, Ziqiang Zheng, Tong Zhang:
Bridge-Coder: Transferring Model Capabilities from High-Resource to Low-Resource Programming Language. 10865-10882 - Yan Yang, Dongxu Li, Haoning Wu, Bei Chen, Liu Liu, Liyuan Pan, Junnan Li:
ProBench: Judging Multimodal Foundation Models on Open-ended Multi-domain Expert Tasks. 10883-10892 - Marta R. Costa-jussà, Bokai Yu, Pierre Andrews, Belen Alastruey, Necati Cihan Camgöz, Joe Chuang, Jean Maillard, Christophe Ropers, Arina Turkatenko, Carleigh Wood:
2M-BELEBELE: Highly Multilingual Speech and American Sign Language Comprehension Dataset. 10893-10904 - Naomi Baes, Raphaël Merx, Nick Haslam, Ekaterina Vylomova, Haim Dubossarsky:
LSC-Eval: A General Framework to Evaluate Methods for Assessing Dimensions of Lexical Semantic Change Using LLM-Generated Synthetic Data. 10905-10939 - Wenxuan Wang, Kuiyi Gao, Youliang Yuan, Jen-tse Huang, Qiuzhi Liu, Shuai Wang, Wenxiang Jiao, Zhaopeng Tu:
Chain-of-Jailbreak Attack for Image Generation Models via Step by Step Editing. 10940-10957 - Anna Wegmann, Dong Nguyen, David Jurgens:
Tokenization is Sensitive to Language Variation. 10958-10983 - Xin Li, Mengbing Liu, Li Wei, Jiancheng An, Mérouane Abdelkader Debbah, Chau Yuen:
WirelessMathBench: A Mathematical Modeling Benchmark for LLMs in Wireless Communications. 10984-11009 - Moxin Li, Yuantao Zhang, Wenjie Wang, Wentao Shi, Zhuo Liu, Fuli Feng, Tat-Seng Chua:
Self-Improvement Towards Pareto Optimality: Mitigating Preference Conflicts in Multi-Objective Alignment. 11010-11031 - Zhijun Wang, Jiahuan Li, Hao Zhou, Rongxiang Weng, Jingang Wang, Xin Huang, Xue Han, Junlan Feng, Chao Deng, Shujian Huang:
Investigating and Scaling up Code-Switching for Multilingual Language Model Pre-Training. 11032-11046 - Sougata Saha, Monojit Choudhury:
User Behavior Prediction as a Generic, Robust, Scalable, and Low-Cost Evaluation Strategy for Estimating Generalization in LLMs. 11047-11065 - Yueqi Song, Frank F. Xu, Shuyan Zhou, Graham Neubig:
Beyond Browsing: API-Based Web Agents. 11066-11085 - Chen Zhang, Mingxu Tao, Zhiyuan Liao, Yansong Feng:
MiLiC-Eval: Benchmarking Multilingual LLMs for China's Minority Languages. 11086-11102 - Maja Stahl, Timon Ziegenbein, Joonsuk Park, Henning Wachsmuth:
ArgInstruct: Specialized Instruction Fine-Tuning for Computational Argumentation. 11103-11127 - Yuanhe Zhang, Zhenhong Zhou, Wei Zhang, Xinyue Wang, Xiaojun Jia, Yang Liu, Sen Su:
Crabs: Consuming Resource via Auto-generation for LLM-DoS Attack under Black-box Settings. 11128-11150 - Chenchen Yuan, Zheyu Zhang, Shuo Yang, Bardh Prenkaj, Gjergji Kasneci:
Probabilistic Aggregation and Targeted Embedding Optimization for Collective Moral Reasoning in Large Language Models. 11151-11168 - Haoke Zhang, Xiaobo Liang, Cunxiang Wang, Juntao Li, Min Zhang:
Unlocking Recursive Thinking of LLMs: Alignment via Refinement. 11169-11182 - Kepu Zhang, Weijie Yu, Sunhao Dai, Jun Xu:
CitaLaw: Enhancing LLM with Citations in Legal Domain. 11183-11196 - Jiyang Qiu, Xinbei Ma, Zhuosheng Zhang, Hai Zhao, Yun Li, Qianren Wang:
MEGen: Generative Backdoor into Large Language Models via Model Editing. 11197-11214 - Jiho Jin, Woosung Kang, Junho Myung, Alice Oh:
Social Bias Benchmark for Generation: A Comparison of Generation and QA-Based Evaluations. 11215-11228 - Junling Wang, Anna Rutkiewicz, April Yi Wang, Mrinmaya Sachan:
Generating Pedagogically Meaningful Visuals for Math Word Problems: A New Benchmark and Analysis of Text-to-Image Models. 11229-11257 - Baixuan Li, Yunlong Fan, Tianyi Ma, Miao Gao, Chuanqi Shi, Zhiqiang Gao:
RASPberry: Retrieval-Augmented Monte Carlo Tree Self-Play with Reasoning Consistency for Multi-Hop Question Answering. 11258-11276 - Jia Hao, Chunhong Zhang, Jiarun Liu, Haiyu Zhao, Zhiqiang Zhan, Zheng Hu:
All That Glitters is Not Gold: Improving Robust Retrieval-Augmented Language Models with Fact-Centric Preference Alignment. 11277-11292 - Yichen Li, Zhiting Fan, Ruizhe Chen, Xiaotang Gai, Luqi Gong, Yan Zhang, Zuozhu Liu:
FairSteer: Inference Time Debiasing for LLMs with Dynamic Activation Steering. 11293-11312 - Zhuofan Wen, Zheng Lian, Shun Chen, Hailiang Yao, Longjiang Yang, Bin Liu, Jianhua Tao:
Listen, Watch, and Learn to Feel: Retrieval-Augmented Emotion Reasoning for Compound Emotion Generation. 11313-11327 - Kangyang Luo, Yuzhuo Bai, Cheng Gao, Shuzheng Si, Zhu Liu, Yingli Shen, Zhitong Wang, Cunliang Kong, Wenhao Li, Yufei Huang, Ye Tian, Xuantang Xiong, Lei Han, Maosong Sun:
GLTW: Joint Improved Graph Transformer and LLM via Three-Word Language for Knowledge Graph Completion. 11328-11344 - Zheng Zhang, Shaocheng Lan, Lei Song, Jiang Bian, Yexin Li, Kan Ren:
Learning to Select In-Context Demonstration Preferred by Large Language Model. 11345-11360 - Cristiano Ciaccio, Marta Sartor, Alessio Miaschi, Felice Dell'Orletta:
Beyond the Spelling Miracle: Investigating Substring Awareness in Character-Blind Language Models. 11361-11372 - Minzheng Wang, Xinghua Zhang, Kun Chen, Nan Xu, Haiyang Yu, Fei Huang, Wenji Mao, Yongbin Li:
DEMO: Reframing Dialogue Interaction with Fine-grained Element Modeling. 11373-11401 - Bowen Cao, Deng Cai, Wai Lam:
InfiniteICL: Breaking the Limit of Context Window Size via Long Short-term Memory Transformation. 11402-11415 - Qiao Liang, Ying Shen, Tiantian Chen, Lin Zhang:
M3HG: Multimodal, Multi-scale, and Multi-type Node Heterogeneous Graph for Emotion Cause Triplet Extraction in Conversations. 11416-11431 - Pratik Kayal, Pascal Mettes, Nima Dehmamy, Minsu Park:
Large Language Models Are Natural Video Popularity Predictors. 11432-11464 - Yi Wang, Fenghua Weng, Sibei Yang, Zhan Qin, Minlie Huang, Wenjie Wang:
DELMAN: Dynamic Defense Against Large Language Model Jailbreaking with Model Editing. 11465-11481 - Frederic Kirstein, Muneeb Khan, Jan Philip Wahle, Terry Ruas, Bela Gipp:
You need to MIMIC to get FAME: Solving Meeting Transcript Scarcity with Multi-Agent Conversations. 11482-11525 - Igor Sterner, Simone Teufel:
Code-Switching and Syntax: A Large-Scale Experiment. 11526-11533 - Weize Chen, Jiarui Yuan, Chen Qian, Cheng Yang, Zhiyuan Liu, Maosong Sun:
Optima: Optimizing Effectiveness and Efficiency for LLM-Based Multi-Agent System. 11534-11557 - Marinela Parovic, Ze Li, Jinhua Du:
Generating Domain-Specific Knowledge Graphs from Large Language Models. 11558-11574 - Chengzu Li, Han Zhou, Goran Glavas, Anna Korhonen, Ivan Vulic:
Large Language Models are Miscalibrated In-Context Learners. 11575-11596 - Hanlin Wang, Jian Wang, Chak Tou Leong, Wenjie Li:
STeCa: Step-level Trajectory Calibration for LLM Agent Learning. 11597-11614 - Zhuoshi Pan, Yu Li, Honglin Lin, Qizhi Pei, Zinan Tang, Wei Wu, Chenlin Ming, H. Vicky Zhao, Conghui He, Lijun Wu:
LEMMA: Learning from Errors for MatheMatical Advancement in LLMs. 11615-11639 - Lars Benedikt Kaesberg, Jonas Becker, Jan Philip Wahle, Terry Ruas, Bela Gipp:
Voting or Consensus? Decision-Making in Multi-Agent Debate. 11640-11671 - Qingqing Hong, Dongyu Zhang, Jiayi Lin, Dapeng Yin, Shuyue Zhu, Junli Wang:
Rhetorical Device-Aware Sarcasm Detection with Counterfactual Data Augmentation. 11672-11685 - Jianfei Zhang, Bei Li, Jun Bai, Rumei Li, Yanmeng Wang, Chenghua Lin, Wenge Rong:
Selecting Demonstrations for Many-Shot In-Context Learning via Gradient Matching. 11686-11704 - Andrianos Michail, Juri Opitz, Yining Wang, Robin Meister, Rico Sennrich, Simon Clematide:
Cheap Character Noise for OCR-Robust Multilingual Embeddings. 11705-11716 - Kaiyue Feng, Yilun Zhao, Yixin Liu, Tianyu Yang, Chen Zhao, John Sous, Arman Cohan:
Physics: Benchmarking Foundation Models on University-Level Physics Problem Solving. 11717-11743 - Eliya Habba, Ofir Arviv, Itay Itzhak, Yotam Perlitz, Elron Bandel, Leshem Choshen, Michal Shmueli-Scheuer, Gabriel Stanovsky:
DOVE: A Large-Scale Multi-Dimensional Predictions Dataset Towards Meaningful LLM Evaluation. 11744-11763 - Hao Chen, Haoze Li, Zhiqing Xiao, Lirong Gao, Qi Zhang, Xiaomeng Hu, Ningtao Wang, Xing Fu, Junbo Zhao:
ALPS: Attention Localization and Pruning Strategy for Efficient Adaptation of Large Language Models. 11764-11780 - Yu Li, Han Jiang, Zhihua Wei:
DeTAM: Defending LLMs Against Jailbreak Attacks via Targeted Attention Modification. 11781-11797 - Yibo Yan, Jiamin Su, Jianxiang He, Fangteng Fu, Xu Zheng, Yuanhuiyi Lyu, Kun Wang, Shen Wang, Qingsong Wen, Xuming Hu:
A Survey of Mathematical Reasoning in the Era of Multimodal Large Language Model: Benchmark, Method & Challenges. 11798-11827 - Andrei Catalin Coman, Christos Theodoropoulos, Marie-Francine Moens, James Henderson:
Fast-and-Frugal Text-Graph Transformers are Effective Link Predictors. 11828-11841 - Max Glockner, Xiang Jiang, Leonardo F. R. Ribeiro, Iryna Gurevych, Markus Dreyer:
NeoQA: Evidence-based Question Answering with Generated News Events. 11842-11926 - Xinyi Jiang, Tianyi Hu, Yuheng Qin, Guoming Wang, Zhou Huan, Kehan Chen, Gang Huang, Rongxing Lu, Siliang Tang:
ChatMap: Mining Human Thought Processes for Customer Service Chatbots via Multi-Agent Collaboration. 11927-11947 - Xinyu Zhang, Yuanquan Hu, Fangchao Liu, Zhicheng Dou:
P3: Prompts Promote Prompting. 11948-11965 - Hugh Mee Wong, Rick Nouwen, Albert Gatt:
VAQUUM: Are Vague Quantifiers Grounded in Visual Data? 11966-11982 - William Rudman, Michal Golovanevsky, Amir Bar, Vedant Palit, Yann LeCun, Carsten Eickhoff, Ritambhara Singh:
Forgotten Polygons: Multimodal Large Language Models are Shape-Blind. 11983-11998 - Shuaike Li, Kai Zhang, Qi Liu, Enhong Chen:
MindBridge: Scalable and Cross-Model Knowledge Editing via Memory-Augmented Modality. 11999-12013 - Bowen Yan, Zhengsong Zhang, Liqiang Jing, Eftekhar Hossain, Xinya Du:
FIHA: Automated Fine-grained Hallucinations Evaluations in Large Vision Language Models with Davidson Scene Graphs. 12014-12026 - Elizabeth Spaulding, Shafiuddin Rehan Ahmed, James H. Martin:
On the Role of Semantic Proto-roles in Semantic Analysis: What do LLMs know about agency? 12027-12048 - Zhili Shen, Chenxin Diao, Pavlos Vougiouklis, Pascual Merita, Shriram Piramanayagam, Enting Chen, Damien Graux, André Melo, Ruofei Lai, Zeren Jiang, Zhongyang Li, Ye Qi, Yang Ren, Dandan Tu, Jeff Z. Pan:
GeAR: Graph-enhanced Agent for Retrieval-augmented Generation. 12049-12072 - Michael Oliverio, Pier Felice Balestrucci, Alessandro Mazzei, Valerio Basile:
WebNLG-IT: Construction of an aligned RDF-Italian corpus through Machine Translation techniques. 12073-12083 - Hanyin Wang, Chufan Gao, Bolun Liu, Qiping Xu, Guleid Hussein, Mohamad El Labban, Kingsley Iheasirim, Hariprasad Reddy Korsapati, Chuck Outcalt, Jimeng Sun:
Towards Adapting Open-Source Large Language Models for Expert-Level Clinical Note Generation. 12084-12117 - Mohammed Bouri, Adnane Saoud:
Bridging Robustness and Generalization Against Word Substitution Attacks in NLP via the Growth Bound Matrix Approach. 12118-12137 - Yuyao Zhang, Zhicheng Dou, Xiaoxi Li, Jiajie Jin, Yongkang Wu, Zhonghua Li, Ye Qi, Ji-Rong Wen:
Neuro-Symbolic Query Compiler. 12138-12155 - Wangjie You, Zecheng Tang, Juntao Li, Lili Yao, Min Zhang:
Revealing and Mitigating the Local Pattern Shortcuts of Mamba. 12156-12178 - Jiaqi Li, Chuanyi Zhang, Miaozeng Du, Hui Zhang, Yongrui Chen, Qianshan Wei, Junfeng Fang, Ruipeng Wang, Sheng Bi, Guilin Qi:
Forget the Token and Pixel: Rethinking Gradient Ascent for Concept Unlearning in Multimodal Generative Models. 12179-12200 - Gallil Maimon, Avishai Elmakies, Yossi Adi:
Slamming: Training a Speech Language Model on One GPU in a Day. 12201-12216 - Junhong Wu, Yang Zhao, Yangyifan Xu, Bing Liu, Chengqing Zong:
Boosting LLM Translation Skills without General Ability Loss via Rationale Distillation. 12217-12236 - Berfin Aktas, Michael Roth:
Clarifying Underspecified Discourse Relations in Instructional Texts. 12237-12256 - Daniel Deutsch, Eleftheria Briakou, Isaac Rayburn Caswell, Mara Finkelstein, Rebecca Galor, Juraj Juraska, Geza Kovacs, Alison Lui, Ricardo Rei, Jason Riesa, Shruti Rijhwani, Parker Riley, Elizabeth Salesky, Firas Trabelsi, Stephanie Winkler, Biao Zhang, Markus Freitag:
WMT24++: Expanding the Language Coverage of WMT24 to 55 Languages & Dialects. 12257-12284 - Michael Sullivan:
Exploring Graph Representations of Logical Forms for Language Modeling. 12285-12307 - Yosephine Susanto, Adithya Venkatadri Hulagadri, Jann Railey Montalan, Jian Gang Ngui, Xianbin Yong, Wei Qi Leong, Hamsawardhini Rengarajan, Peerat Limkonchotiwat, Yifan Mai, William-Chandra Tjhi:
SEA-HELM: Southeast Asian Holistic Evaluation of Language Models. 12308-12336 - Wei Zou, Sen Yang, Yu Bao, Shujian Huang, Jiajun Chen, Shanbo Cheng:
TRANS-ZERO: Self-Play Incentivizes Large Language Models for Multilingual Translation Without Parallel Data. 12337-12347 - Gonçalo Emanuel Cavaco Gomes, Bruno Martins, Chrysoula Zerva:
A Conformal Risk Control Framework for Granular Word Assessment and Uncertainty Calibration of CLIPScore Quality Estimates. 12348-12365 - Wenqiao Zhu, Ji Liu, Lulu Wang, Jun Wu, Yulun Zhang:
SGDPO: Self-Guided Direct Preference Optimization for Language Model Alignment. 12366-12383 - Jiangbo Pei, Peiyu Liu, Xin Zhao, Aidong Men, Yang Liu:
Socratic Style Chain-of-Thoughts Help LLMs to be a Better Reasoner. 12384-12395 - Nikhita Vedula, Dushyanta Dhyani, Laleh Jalali, Boris N. Oreshkin, Mohsen Bayati, Shervin Malmasi:
Quantile Regression with Large Language Models for Price Prediction. 12396-12415 - Jian Wang, Yinpei Dai, Yichi Zhang, Ziqiao Ma, Wenjie Li, Joyce Chai:
Training Turn-by-Turn Verifiers for Dialogue Tutoring Agents: The Curious Case of LLMs as Your Coding Tutors. 12416-12436 - Wenhua Zhang, Weicheng Li, Xuanrong Rao, Lixin Zou, Xiangyang Luo, Chubin Zhuang, Yongjie Hong, Zhen Qin, Hengyun Chang, Chenliang Li, Bo Zheng:
AIGuard: A Benchmark and Lightweight Detection for E-commerce AIGC Risks. 12437-12450 - Junhui He, Junna Xing, Nan Wang, Rui Xu, Shangyu Wu, Peng Zhou, Qiang Liu, Chun Jason Xue, Qingan Li:
A²ATS: Retrieval-Based KV Cache Reduction via Windowed Rotary Position Embedding and Query-Aware Vector Quantization. 12451-12463 - Yuheng Lu, Qian Yu, Hongru Wang, Zeming Liu, Wei Su, Yanping Liu, Yuhang Guo, Maocheng Liang, Yunhong Wang, Haifeng Wang:
TransBench: Breaking Barriers for Transferable Graphical User Interface Agents in Dynamic Digital Environments. 12464-12478 - Jie Zeng, Qianyu He, Qingyu Ren, Jiaqing Liang, Weikang Zhou, Zeye Sun, Fei Yu, Yanghua Xiao:
Order Matters: Investigate the Position Bias in Multi-constraint Instruction Following. 12479-12492 - Xikang Guan, Zheng Gu, Jing Huo, Tianyu Ding, Yang Gao:
CoT-VTM: Visual-to-Music Generation with Chain-of-Thought Reasoning. 12493-12510 - Yang Zhong, Diane J. Litman:
A Tale of Evaluating Factual Consistency: Case Study on Long Document Summarization Evaluation. 12511-12532 - Ioana Ivan, Carlos Ramisch, Alexis Nasr:
Evaluating Pretrained Causal Language Models for Synonymy. 12533-12551 - Bohan Jin, Shuhan Qi, Kehai Chen, Xinyi Guo, Xuan Wang:
MDIT-Bench: Evaluating the Dual-Implicit Toxicity in Large Multimodal Models. 12552-12574 - Haochen Zhang, Tianyi Zhang, Junze Yin, Oren Gal, Anshumali Shrivastava, Vladimir Braverman:
CoVE: Compressed Vocabulary Expansion Makes Better LLM-based Recommender Systems. 12575-12591 - Huanshuo Liu, Hao Zhang, Zhijiang Guo, Jing Wang, Kuicai Dong, Xiangyang Li, Yi Quan Lee, Cong Zhang, Yong Liu:
CtrlA: Adaptive Retrieval-Augmented Generation via Inherent Control. 12592-12618 - Bowen Dong, Yilong Fan, Yutao Sun, Zhenyu Li, Tengyu Pan, Zhou Xun, Jianyong Wang:
Maximum Score Routing For Mixture-of-Experts. 12619-12632 - Ahmad Dawar Hakimi, Ali Modarressi, Philipp Wicke, Hinrich Schütze:
Time Course MechInterp: Analyzing the Evolution of Components and Knowledge in Large Language Models. 12633-12653 - Feifan Song, Shaohang Wei, Wen Luo, Yuxuan Fan, Tianyu Liu, Guoyin Wang, Houfeng Wang:
Well Begun is Half Done: Low-resource Preference Alignment by Weak-to-Strong Decoding. 12654-12670 - Pedro Calais, Gabriel Franco, Zilu Tang, Themistoklis Nikas, Wagner Meira Jr., Evimaria Terzi, Mark Crovella:
Disentangling Text and Math in Word Problems: Evidence for the Bidimensional Structure of Large Language Models' Reasoning. 12671-12688 - Mingmeng Geng, Roberto Trotta:
Human-LLM Coevolution: Evidence from Academic Writing. 12689-12696 - Hao Dong, Ziyue Qiao, Zhiyuan Ning, Qi Hao, Yi Du, Pengyang Wang, Yuanchun Zhou:
Disentangled Multi-span Evolutionary Network against Temporal Knowledge Graph Reasoning. 12697-12707 - Cristian-George Craciun, Razvan-Alexandru Smadu, Dumitru-Clementin Cercel, Mihaela-Claudia Cercel:
GRAF: Graph Retrieval Augmented by Facts for Romanian Legal Multi-Choice Question Answering. 12708-12742 - Jiayi Kuang, Yinghui Li, Chen Wang, Haohao Luo, Ying Shen, Wenhao Jiang:
Express What You See: Can Multimodal LLMs Decode Visual Ciphers with Intuitive Semiosis Comprehension? 12743-12774 - Xiao Yu, Ruize Xu, Chengyuan Xue, Jinzhong Zhang, Xu Ma, Zhou Yu:
ConFit v2: Improving Resume-Job Matching using Hypothetical Resume Embedding and Runner-Up Hard-Negative Mining. 12775-12790 - Anum Afzal, Florian Matthes, Gal Chechik, Yftah Ziser:
Knowing Before Saying: LLM Representations Encode Information About Chain-of-Thought Success Before Completion. 12791-12806 - Gabriel Herbert Sarch, Balasaravanan Thoravi Kumaravel, Sahithya Ravi, Vibhav Vineet, Andrew D. Wilson:
Grounding Task Assistance with Multimodal Cues from a Single Demonstration. 12807-12833 - Adrian de Wynter:
Awes, Laws, and Flaws From Today's LLM Research. 12834-12854 - Siqi Liang, Sumyeong Ahn, Paramveer Dhillon, Jiayu Zhou:
Dual Debiasing for Noisy In-Context Learning for Text Generation. 12855-12868 - Zhecheng Li, Yiwei Wang, Bryan Hooi, Yujun Cai, Nanyun Peng, Kaiwei Chang:
DRS: Deep Question Reformulation With Structured Output. 12869-12882 - Happy Khairunnisa Sariyanto, Diclehan Ulucan, Oguzhan Ulucan, Marc Ebner:
Towards Explainable Hate Speech Detection. 12883-12893 - Yunsoo Kim, Yusuf Abdulle, Honghan Wu:
BioHopR: A Benchmark for Multi-Hop, Multi-Answer Reasoning in Biomedical Domain. 12894-12908 - Bradley McDanel, Sai Qian Zhang, Yunhai Hu, Zining Liu:
PipeSpec: Breaking Stage Dependencies in Hierarchical LLM Decoding. 12909-12920 - Thai Quoc Hoang, Kung-Hsiang Huang, Shirley Kokane, Jianguo Zhang, Zuxin Liu, Ming Zhu, Jake Grigsby, Tian Lan, Michael S. Ryoo, Chien-Sheng Wu, Shelby Heinecke, Huan Wang, Silvio Savarese, Caiming Xiong, Juan Carlos Niebles:
LAM SIMULATOR: Advancing Data Generation for Large Action Model Training via Online Exploration and Trajectory Feedback. 12921-12934 - Sahil Mishra, Kumar Arjun, Tanmoy Chakraborty:
Rank, Chunk and Expand: Lineage-Oriented Reasoning for Taxonomy Expansion. 12935-12953 - Gal Astrach, Yuval Pinter:
Probing Subphonemes in Morphology Models. 12954-12961 - Parishad BehnamGhader, Nicholas Meade, Siva Reddy:
Exploiting Instruction-Following Retrievers for Malicious Information Retrieval. 12962-12980 - Alicja Dobrzeniecka, Antske Fokkens, Pia Sommerauer:
Improving Causal Interventions in Amnesic Probing with Mean Projection or LEACE. 12981-12993 - Zixuan Xia, Haifeng Sun, Jingyu Wang, Qi Qi, Huazheng Wang, Xiaoyuan Fu, Jianxin Liao:
The Threat of PROMPTS in Large Language Models: A System and User Prompt Perspective. 12994-13035 - Tianci Liu, Haoxiang Jiang, Tianze Wang, Ran Xu, Yue Yu, Linjun Zhang, Tuo Zhao, Haoyu Wang:
RoseRAG: Robust Retrieval-augmented Generation with Small-scale LLMs via Margin-aware Preference Optimization. 13036-13054 - Saurabh Srivastava, Sweta Pati, Ziyu Yao:
Instruction-Tuning LLMs for Event Extraction with Annotation Guidelines. 13055-13071 - Hellina Hailu Nigatu, Min Li, Maartje ter Hoeve, Saloni Potdar, Sarah E. Chasins:
mRAKL: Multilingual Retrieval-Augmented Knowledge Graph Construction for Low-Resourced Languages. 13072-13089 - Ala N. Tak, Amin Banayeeanzade, Anahita Bolourani, Mina Kian, Robin Jia, Jonathan Gratch:
Mechanistic Interpretability of Emotion Inference in Large Language Models. 13090-13120 - Xufeng Liu, Yixuan Ding, Jingxiang Qu, Yichi Zhang, Wenhan Gao, Yi Liu:
RL-Guider: Leveraging Historical Decisions and Feedback for Drug Editing with Large Language Models. 13121-13138 - Jesse Woo, Fateme Hashemi Chaleshtori, Ana Marasovic, Kenneth Marino:
BriefMe: A Legal NLP Benchmark for Assisting with Legal Briefs. 13139-13190 - Esam Ghaleb, Bulat Khaertdinov, Asli Özyürek, Raquel Fernández:
I see what you mean: Co-Speech Gestures for Reference Resolution in Multimodal Dialogue. 13191-13206 - Katarzyna Prus, Mark Steedman, Adam Lopez:
World Knowledge Resolves Some Aspectual Ambiguity. 13207-13220 - Dren Fazlija, Arkadij Orlov, Sandipan Sikdar:
ACCESS DENIED INC: The First Benchmark Environment for Sensitivity Awareness. 13221-13240 - Chi-Jane Chen, Yuhang Chen, Sukwon Yun, Natalie Stanley, Tianlong Chen:
Spatial Coordinates as a Cell Language: A Multi-Sentence Framework for Imaging Mass Cytometry Analysis. 13241-13252 - Zhaojian Yu, Yilun Zhao, Arman Cohan, Xiaoping Zhang:
HumanEval Pro and MBPP Pro: Evaluating Large Language Models on Self-invoking Code Generation Task. 13253-13279 - Yu Zhang, Wenxiang Guo, Changhao Pan, Dongyu Yao, Zhiyuan Zhu, Ziyue Jiang, Yuhan Wang, Tao Jin, Zhou Zhao:
TCSinger 2: Customizable Multilingual Zero-shot Singing Voice Synthesis. 13280-13294 - Nicholas Roberts, Niladri S. Chatterji, Sharan Narang, Mike Lewis, Dieuwke Hupkes:
Compute Optimal Scaling of Skills: Knowledge vs Reasoning. 13295-13316 - Xinyu Wang, Yanzheng Xiang, Lin Gui, Yulan He:
PECAN: LLM-Guided Dynamic Progress Control with Attention-Guided Hierarchical Weighted Graph for Long-Document QA. 13317-13335 - Yash Kumar Atri, Ahmed M. Alaa, Thomas Hartvigsen:
Lifelong Model Editing with Graph-Based External Memory. 13336-13352 - Qitong Wang, Mohammed J. Zaki, Georgios Kollias, Vasileios Kalantzis:
Multi-Sense Embeddings for Language Models and Knowledge Distillation. 13353-13369 - Peter Jansen, Oyvind Tafjord, Marissa Radensky, Pao Siangliulue, Tom Hope, Bhavana Dalvi Mishra, Bodhisattwa Prasad Majumder, Daniel S. Weld, Peter Clark:
CodeScientist: End-to-End Semi-Automated Scientific Discovery with Code-based Experimentation. 13370-13467 - Chris Samarinas, Alexander Krubner, Alireza Salemi, Youngwoo Kim, Hamed Zamani:
Beyond Factual Accuracy: Evaluating Coverage of Diverse Factual Information in Long-form Text Generation. 13468-13482 - Jacob Nielsen, Peter Schneider-Kamp, Lukas Galke:
Continual Quantization-Aware Pre-Training: When to transition from 16-bit to 1.58-bit pre-training for BitNet language models? 13483-13493 - Hillary Dawkins, Kathleen C. Fraser, Svetlana Kiritchenko:
When Detection Fails: The Power of Fine-Tuned Models to Generate Human-Like Social Media Text. 13494-13527 - James A. Michaelov, Reeka Estacio, Zhien Zhang, Ben Bergen:
Not quite Sherlock Holmes: Language model predictions do not reliably differentiate impossible from improbable events. 13528-13551 - Ting-Rui Chiang, Dani Yogatama:
The Rotary Position Embedding May Cause Dimension Inefficiency in Attention Heads for Long-Distance Retrieval. 13552-13562 - Kaiyu He, Mian Zhang, Shuo Yan, Peilin Wu, Zhiyu Chen:
IDEA: Enhancing the Rule Learning Ability of Large Language Model Agent through Induction, Deduction, and Abduction. 13563-13597 - Hainiu Xu, Siya Qi, Jiazheng Li, Yuxiang Zhou, Jinhua Du, Caroline Catmur, Yulan He:
EnigmaToM: Improve LLMs' Theory-of-Mind Reasoning Capabilities with Neural Knowledge Base of Entity States. 13598-13622 - Jiamu Zhang, Jiayi Yuan, Andrew Wen, Hoang Anh Duy Le, Yu-Neng Chuang, Soo-Hyun Choi, Rui Chen, Xia Hu:
ReasonerRank: Redefining Language Model Evaluation with Ground-Truth-Free Ranking Frameworks. 13623-13639 - Weizhi Tang, Yixuan Li, Chris Sypherd, Elizabeth Polgreen, Vaishak Belle:
HyGenar: An LLM-Driven Hybrid Genetic Algorithm for Few-Shot Grammar Generation. 13640-13665 - Elfia Bezou-Vrakatseli, Oana Cocarascu, Sanjay Modgil:
Can Large Language Models Understand Argument Schemes? 13666-13681 - Shulin Tian, Ziniu Zhang, Liangyu Chen, Ziwei Liu:
MMInA: Benchmarking Multihop Multimodal Internet Agents. 13682-13697 - Xiaofei Wen, Wenxuan Zhou, Wenjie Jacky Mo, Muhao Chen:
ThinkGuard: Deliberative Slow Thinking Leads to Cautious Guardrails. 13698-13713 - Liang Cheng, Tianyi Li, Zhaowei Wang, Tianyang Liu, Mark Steedman:
Neutralizing Bias in LLM Reasoning using Entailment Graphs. 13714-13730 - Van Dai Do, Quan Hung Tran, Svetha Venkatesh, Hung Le:
Dynamic Steering With Episodic Memory For Large Language Models. 13731-13749 - Siyang Liu, Bianca Brie, Wenda Li, Laura Biester, Andrew Lee, James W. Pennebaker, Rada Mihalcea:
Eeyore: Realistic Depression Simulation via Expert-in-the-Loop Supervised and Preference Optimization. 13750-13770 - Gregory Price, Shaomei Wu:
Lost in Translation: Benchmarking Commercial Machine Translation Models for Dyslexic-Style Text. 13771-13782 - Xianren Zhang, Xianfeng Tang, Hui Liu, Zongyu Wu, Qi He, Dongwon Lee, Suhang Wang:
Divide-Verify-Refine: Can LLMs Self-align with Complex Instructions? 13783-13800 - Tuochao Chen, Nicholas Scott Batchelder, Alisa Liu, Noah A. Smith, Shyamnath Gollakota:
LlamaPIE: Proactive In-Ear Conversation Assistants. 13801-13824 - Jacob Daniel Devasier, Akshith Reddy Putta, Rishabh Mediratta, Chengkai Li:
Task-Oriented Automatic Fact-Checking with Frame-Semantics. 13825-13842 - Shi Yu, Zhiyuan Liu, Chenyan Xiong:
Craw4LLM: Efficient Web Crawling for LLM Pretraining. 13843-13851 - Zhenyuan Guo, Yi Shi, Wenlong Meng, Chen Gong, Chengkun Wei, Wenzhi Chen:
Be Cautious When Merging Unfamiliar LLMs: A Phishing Model Capable of Stealing Privacy. 13852-13871 - Mengqiao Liu, Tevin Wang, Cassandra A. Cohen, Sarah Li, Chenyan Xiong:
Understand User Opinions of Large Language Models via LLM-Powered In-the-Moment User Experience Interviews. 13872-13893 - Hoang Tran Vuong, Tue Le, Tu Vu, Tung Nguyen, Linh Ngo Van, Sang Dinh, Thien Huu Nguyen:
HiCOT: Improving Neural Topic Models via Optimal Transport and Contrastive Learning. 13894-13920 - Guojun Xiong, Zhiyang Deng, Keyi Wang, Yupeng Cao, Haohang Li, Yangyang Yu, Xueqing Peng, Mingquan Lin, Kaleb E. Smith, Xiao-Yang Liu, Jimin Huang, Sophia Ananiadou, Qianqian Xie:
FLAG-TRADER: Fusion LLM-Agent with Gradient-based Reinforcement Learning for Financial Trading. 13921-13934 - Hongru Song, Yu-An Liu, Ruqing Zhang, Jiafeng Guo, Jianming Lv, Maarten de Rijke, Xueqi Cheng:
The Silent Saboteur: Imperceptible Adversarial Attacks against Black-Box Retrieval-Augmented Generation Systems. 13935-13952 - Meng Lu, Yuzhang Xie, Zhenyu Bi, Shuxiang Cao, Xuan Wang:
CROSSAGENTIE: Cross-Type and Cross-Task Multi-Agent LLM Collaboration for Zero-Shot Information Extraction. 13953-13977 - Lishuai Hou, Zixiong Wang, Gaoyang Liu, Chen Wang, Wei Liu, Kai Peng:
Decoupling Memories, Muting Neurons: Towards Practical Machine Unlearning for Large Language Models. 13978-13999 - Xinyu Pang, Ruixin Hong, Hongming Zhang, Changshui Zhang:
Assimilation and Accommodation: Task-Adaptive Hierarchical Abstraction for Solving Web Tasks. 14000-14014 - Chuxue Cao, Han Zhu, Jiaming Ji, Qichao Sun, Zhenghao Zhu, Yinyu Wu, Josef Dai, Yaodong Yang, Sirui Han, Yike Guo:
SafeLawBench: Towards Safe Alignment of Large Language Models. 14015-14048 - Zhaoxi Zhang, Sanwoo Lee, Zhixiang Wang, Yunfang Wu:
3DM: Distill, Dynamic Drop, and Merge for Debiasing Multi-modal Large Language Models. 14049-14059 - Yuxi Sun, Aoqi Zuo, Wei Gao, Jing Ma:
CausalAbstain: Enhancing Multilingual LLMs with Causal Reasoning for Trustworthy Abstention. 14060-14076 - Kanzhi Cheng, Wenpo Song, Jiaxin Fan, Zheng Ma, Qiushi Sun, Fangzhi Xu, Chenyang Yan, Nuo Chen, Jianbing Zhang, Jiajun Chen:
CapArena: Benchmarking and Analyzing Detailed Image Captioning in the LLM Era. 14077-14094 - Tianyi Ma, Yiyue Qian, Zehong Wang, Zheyuan Zhang, Chuxu Zhang, Yanfang Ye:
LLM-Empowered Class Imbalanced Graph Prompt Learning for Online Drug Trafficking Detection. 14095-14114 - Yiyun Zhou, Chang Yao, Jingyuan Chen:
CoLA: Collaborative Low-Rank Adaptation. 14115-14130 - Hao Fang, Yuejie Zhang, Rui Feng, Yingwen Wang, Qing Wang, Wen He, Xiaobo Zhang, Tao Zhang, Shang Gao:
GLiM: Integrating Graph Transformer and LLM for Document-Level Biomedical Relation Extraction with Incomplete Labeling. 14131-14146 - Yang Xiao, Tianyi Peng, Rohan Kumar Das, Yuchen Hu, Huiping Zhuang:
AnalyticKWS: Towards Exemplar-Free Analytic Class Incremental Learning for Small-footprint Keyword Spotting. 14147-14158 - Taedong Yun, Eric Yang, Mustafa Safdari, Jong Ha Lee, Vaishnavi Vinod Kumar, S. Sara Mahdavi, Jonathan Amar, Derek Peyton, Reut Aharony, Andreas Michaelides, Logan Douglas Schneider, Isaac R. Galatzer-Levy, Yugang Jia, John Canny, Arthur Gretton, Maja J. Mataric:
Sleepless Nights, Sugary Days: Creating Synthetic Users with Health Conditions for Realistic Coaching Agent Interactions. 14159-14181 - Suho Yoo, Hyunjong Ok, Jaeho Lee:
Imagine to Hear: Auditory Knowledge Generation can be an Effective Assistant for Language Models. 14182-14193 - Junkai Chen, Zhijie Deng, Kening Zheng, Yibo Yan, Shuliang Liu, PeiJun Wu, Peijie Jiang, Jia Liu, Xuming Hu:
SafeEraser: Enhancing Safety in Multimodal Large Language Models through Multimodal Machine Unlearning. 14194-14224 - Chan-Yang Ju, Dong-Ho Lee:
Prediction-Augmented Generation for Automatic Diagnosis Tasks. 14225-14246 - Zongkai Zhao, Guozeng Xu, Xiuhua Li, Kaiwen Wei, Jiang Zhong:
FedLEKE: Federated Locate-then-Edit Knowledge Editing for Multi-Client Collaboration. 14247-14258 - Ting Sun, Penghan Wang, Fan Lai:
DiSCo: Device-Server Collaborative LLM-based Text Streaming Services. 14259-14277 - Keqin Bao, Ming Yan, Yang Zhang, Jizhi Zhang, Wenjie Wang, Fuli Feng, Xiangnan He:
Customizing In-context Learning for Dynamic Interest Adaption in LLM-based Recommendation. 14278-14291 - Xinyue Cui, Johnny Tian-Zheng Wei, Swabha Swayamdipta, Robin Jia:
Robust Data Watermarking in Language Models by Injecting Fictitious Knowledge. 14292-14306 - Jiale Chen, Xuelian Dong, Wenxiu Xie, Ru Peng, Kun Zeng, Tianyong Hao:
LLM-Enhanced Query Generation and Retrieval Preservation for Task-Oriented Dialogue. 14307-14321 - Quang Hieu Pham, Thuy Duong Nguyen, Tung Pham, Anh Tuan Luu, Dat Quoc Nguyen:
ClozeMath: Improving Mathematical Reasoning in Language Models by Learning to Fill Equations. 14322-14329 - Beining Huang, Du Su, Fei Sun, Qi Cao, Huawei Shen, Xueqi Cheng:
Low-Entropy Watermark Detection via Bayes' Rule Derived Detector. 14330-14344 - Junying Chen, Chi Gui, Anningzhe Gao, Ke Ji, Xidong Wang, Xiang Wan, Benyou Wang:
CoD, Towards an Interpretable Medical Agent using Chain of Diagnosis. 14345-14368 - Aoqiang Zhu, Min Hu, Xiaohua Wang, Jiaoyun Yang, Yiming Tang, Ning An:
DaNet: Dual-Aware Enhanced Alignment Network for Multimodal Aspect-Based Sentiment Analysis. 14369-14381 - Shujian Yang, Shiyao Cui, Chuanrui Hu, Haicheng Wang, Tianwei Zhang, Minlie Huang, Jialiang Lu, Han Qiu:
Exploring Multimodal Challenges in Toxic Chinese Detection: Taxonomy, Benchmark, and Findings. 14382-14396 - Yile Wang, Zhanyu Shen, Hui Huang:
LDIR: Low-Dimensional Dense and Interpretable Text Embeddings with Relative Representations. 14397-14409 - Weiqin Wang, Yile Wang, Hui Huang:
Ranked Voting based Self-Consistency of Large Language Models. 14410-14426 - Jihui Yan, Xiaocui Yang, Daling Wang, Shi Feng, Yifei Zhang, Yinzhi Zhao:
SemanticCamo: Jailbreaking Large Language Models through Semantic Camouflage. 14427-14452 - Yoonjun Cho, Soeun Kim, Dongjae Jeon, Kyelim Lee, Beomsoo Lee, Albert No:
Assigning Distinct Roles to Quantized and Low-Rank Matrices Toward Optimal Weight Decomposition. 14453-14470 - Wenxiang Chen, Wei He, Zhiheng Xi, Honglin Guo, Boyang Hong, Jiazheng Zhang, Nijun Li, Tao Gui, Yun Li, Qi Zhang, Xuanjing Huang:
Better Process Supervision with Bi-directional Rewarding Signals. 14471-14485 - Yuxin Zuo, Wenxuan Jiang, Wenxuan Liu, Zixuan Li, Long Bai, Hanbin Wang, Yutao Zeng, Xiaolong Jin, Jiafeng Guo, Xueqi Cheng:
KnowCoder-X: Boosting Multilingual Information Extraction via Code. 14486-14509 - Zhongwei Wan, Che Liu, Xin Wang, Chaofan Tao, Hui Shen, Jing Xiong, Rossella Arcucci, Huaxiu Yao, Mi Zhang:
MEIT: Multimodal Electrocardiogram Instruction Tuning on Large Language Models for Report Generation. 14510-14527 - Zhenyu Lei, Yushun Dong, Weiyu Li, Rong Ding, Qi R. Wang, Jundong Li:
Harnessing Large Language Models for Disaster Management: A Survey. 14528-14551 - Junying Chen, Zhenyang Cai, Ke Ji, Xidong Wang, Wanlong Liu, Rongsheng Wang, Benyou Wang:
Towards Medical Complex Reasoning with LLMs through Medical Verifiable Problems. 14552-14573 - Yurui Chang, Bochuan Cao, Lu Lin:
Monitoring Decoding: Mitigating Hallucination via Evaluating the Factuality of Partial Response during Generation. 14574-14587 - Bofei Gao, Zefan Cai, Runxin Xu, Peiyi Wang, Ce Zheng, Runji Lin, Keming Lu, Dayiheng Liu, Chang Zhou, Wen Xiao, Tianyu Liu, Baobao Chang:
LLM Critics Help Catch Bugs in Mathematics: Towards a Better Mathematical Verifier with Natural Language Feedback. 14588-14604 - Xiao Yu, Yi Yu, Dongrui Liu, Kejiang Chen, Weiming Zhang, Nenghai Yu, Jing Shao:
EvoBench: Towards Real-world LLM-Generated Text Detection Benchmarking for Evolving Large Language Models. 14605-14620 - Xinwu Ye, Chengfan Li, Siming Chen, Wei Wei, Robert Tang:
MMSciBench: Benchmarking Language Models on Chinese Multimodal Scientific Problems. 14621-14663 - Minjoo Son, Jonghak Jang, Misuk Kim:
Lightweight Query Checkpoint: Classifying Faulty User Queries to Mitigate Hallucinations in Large Language Model Question Answering. 14664-14677 - Seiji Shimizu, Shohei Hisada, Yutaka Uno, Shuntaro Yada, Shoko Wakamiya, Eiji Aramaki:
Exploring LLM Annotation for Adaptation of Clinical Information Extraction Models under Data-sharing Restrictions. 14678-14694 - Yifan Sun, Danding Wang, Qiang Sheng, Juan Cao, Jintao Li:
Enhancing the Comprehensibility of Text Explanations via Unsupervised Concept Discovery. 14695-14713 - Seiji Shimizu, Ibrahim Baroud, Lisa Raithel, Shuntaro Yada, Shoko Wakamiya, Eiji Aramaki:
RecordTwin: Towards Creating Safe Synthetic Clinical Corpora. 14714-14726 - Shiyu Xiang, Ansen Zhang, Yanfei Cao, Fan Yang, Ronghao Chen:
Beyond Surface-Level Patterns: An Essence-Driven Defense Framework Against Jailbreak Attacks in LLMs. 14727-14742 - Aoqiang Zhu, Min Hu, Xiaohua Wang, Jiaoyun Yang, Yiming Tang, Ning An:
Multimodal Invariant Sentiment Representation Learning. 14743-14755 - Yan Li, Caren Han, Yue Dai, Feiqi Cao:
ChuLo: Chunk-Level Key Information Representation for Long Document Understanding. 14756-14773 - Tomer Ashuach, Martin Tutek, Yonatan Belinkov:
REVS: Unlearning Sensitive Information in Language Models via Rank Editing in the Vocabulary Space. 14774-14797 - Quang Minh Nguyen, Taegyoon Kim:
Is External Information Useful for Stance Detection with LLMs? 14798-14807 - Marc E. Canby, Xinchi Chen, Xing Niu, Jifan Chen, Bonan Min, Sergül Aydöre, Vittorio Castelli:
Benchmarking Query-Conditioned Natural Language Inference. 14808-14835 - Yuuki Yamanaka, Hiroshi Takahashi, Tomoya Yamashita:
Flowchart-Based Decision Making with Large Language Models. 14836-14842 - Jun Zhong, Longwei Xu, Li Kong, Xianzhuo Li, Dandan Liang, Junsheng Zhou:
NarGINA: Towards Accurate and Interpretable Children's Narrative Ability Assessment via Narrative Graphs. 14843-14860 - Dongyang Li, Zeyang Li, Bosheng Liu, Jigang Wu:
Improving Efficiency in Large Language Models via Extendable Block Floating Point Representation. 14861-14873 - Mingxu Tao, Jie Hu, Mingchuan Yang, Yunhuai Liu, Dongyan Zhao, Yansong Feng:
EpiCoDe: Boosting Model Performance Beyond Training with Extrapolation and Contrastive Decoding. 14874-14885 - Md. Arid Hasan, Maram Hasanain, Fatema Ahmad, Sahinur Rahman Laskar, Sunaya Upadhyay, Vrunda N. Sukhadia, Mucahid Kutlu, Shammur Absar Chowdhury, Firoj Alam:
NativQA: Multilingual Culturally-Aligned Natural Query for LLMs. 14886-14909 - Xinglin Lyu, Wei Tang, Yuang Li, Xiaofeng Zhao, Ming Zhu, Junhui Li, Yunfei Lu, Min Zhang, Daimeng Wei, Hao Yang:
DoCIA: An Online Document-Level Context Incorporation Agent for Speech Translation. 14910-14924 - Bolei He, Xinran He, Mengke Chen, Xianwei Xue, Ying Zhu, Zhen-Hua Ling:
RISE: Reasoning Enhancement via Iterative Self-Exploration in Multi-hop Question Answering. 14925-14948 - Vishnu Prabhakaran, Purav Aggarwal, Vinay Kumar Verma, Gokul Swamy, Anoop Saladi:
VADE: Visual Attention Guided Hallucination Detection and Elimination. 14949-14965 - Zouying Cao, Runze Wang, Yifei Yang, Xinbei Ma, Xiaoyong Zhu, Bo Zheng, Hai Zhao:
PGPO: Enhancing Agent Reasoning via Pseudocode-style Planning Guided Preference Optimization. 14966-14985 - Cory Paik, Katharina von der Wense:
The Effectiveness of Uncased Tokenization for Clinical Notes. 14986-14992 - Janghwan Lee, Jiwoong Park, Jinseok Kim, Yongjik Kim, Jungju Oh, Jinwook Oh, Jungwook Choi:
AMXFP4: Taming Activation Outliers with Asymmetric Microscaling Floating-Point for 4-bit LLM Inference. 14993-15013 - Ruicheng Yin, Xuan Gao, Changze Lv, Xiaohua Wang, Xiaoqing Zheng, Xuanjing Huang:
Improving Continual Pre-training Through Seamless Data Packing. 15014-15032 - Mahammed Kamruzzaman, Gene Louis Kim:
The Impact of Name Age Perception on Job Recommendations in LLMs. 15033-15058 - Cho Hyeonsu, Dooyoung Kim, Youngjoong Ko:
DAPI: Domain Adaptive Toxicity Probe Vector Intervention, for Fine-Grained Detoxification. 15059-15069 - Yukun Zhao, Lingyong Yan, Zhenyang Li, Shuaiqiang Wang, Zhumin Chen, Zhaochun Ren, Dawei Yin:
Task Knowledge Injection via Interpolations and Reinstatement for Large Language Model Generalization. 15070-15080 - Wenxiang Guo, Yu Zhang, Changhao Pan, Zhiyuan Zhu, Ruiqi Li, ZheTao Chen, Wenhao Xu, Fei Wu, Zhou Zhao:
STARS: A Unified Framework for Singing Transcription, Alignment, and Refined Style Annotation. 15081-15093 - Xinghao Chen, Zhijing Sun, Wenjin Guo, Miaoran Zhang, Yanjun Chen, Yirong Sun, Hui Su, Yijie Pan, Dietrich Klakow, Wenjie Li, Xiaoyu Shen:
Unveiling the Key Factors for Distilling Chain-of-Thought Reasoning. 15094-15119 - Di Wu, Liting Jiang, Bohui Mao, Hongyan Xie, Haoxiang Su, Zhongjiang He, Ruiyu Fang, Shuangyong Song, Hao Huang, Xuelong Li:
INT: Establishing Information Transfer for Multilingual Intent Detection and Slot Filling. 15120-15142 - Dongyoon Hahm, Woogyeol Jin, June Suk Choi, Sungsoo Ahn, Kimin Lee:
Enhancing LLM Agent Safety via Causal Influence Prompting. 15143-15168 - Fabio Massimo Zanzotto, Elena Sofia Ruzzetti, Giancarlo A. Xompero, Leonardo Ranaldi, Davide Venditti, Federico Ranaldi, Cristina Giannone, Andrea Favalli, Raniero Romagnoli:
Position Paper: MeMo: Towards Language Models with Associative Memory Mechanisms. 15169-15180 - Solee Im, Wonjun Lee, Jinmyeong An, Yunsu Kim, Jungseul Ok, Gary Geunbae Lee:
DeRAGEC: Denoising Named Entity Candidates with Synthetic Rationale for ASR Error Correction. 15181-15193 - Yanyue Zhang, Yulan He, Deyu Zhou:
Rehearse With User: Personalized Opinion Summarization via Role-Playing based on Large Language Models. 15194-15211 - Soichiro Murakami, Peinan Zhang, Hidetaka Kamigaito, Hiroya Takamura, Manabu Okumura:
AdParaphrase v2.0: Generating Attractive Ad Texts Using a Preference-Annotated Paraphrase Dataset. 15212-15230 - Calogero Jerik Scozzaro, Matteo Delsanto, Daniele Paolo Radicioni:
Beyond the Average Reader: the Reader Embedding Approach. 15231-15244 - Lorenzo Pacchiardi, Konstantinos Voudouris, Ben Slater, Fernando Martínez-Plumed, José Hernández-Orallo, Lexin Zhou, Wout Schellaert:
PredictaBoard: Benchmarking LLM Score Predictability. 15245-15266 - Yaxin Du, Rui Ye, Fengting Yuchi, Wanru Zhao, Jingjing Qu, Yanfeng Wang, Siheng Chen:
FedDQC: Data Quality Control in Federated Instruction-tuning of Large Language Models. 15267-15291 - Bo Yuan, Yulin Chen, Yin Zhang:
Weed Out, Then Harvest: Dual Low-Rank Adaptation is an Effective Noisy Label Detector for Noise-Robust Learning. 15292-15311 - Esra Dönmez, Agnieszka Falenska:
"I understand your perspective": LLM Persuasion through the Lens of Communicative Action Theory. 15312-15327 - Kyuhee Kim, Sangah Lee:
Nunchi-Bench: Benchmarking Language Models on Cultural Reasoning with a Focus on Korean Superstition. 15328-15342 - Kangyang Luo, Zichen Ding, Zhenmin Weng, Lingfeng Qiao, Meng Zhao, Xiang Li, Di Yin, Jinlong Shu:
Let's Be Self-generated via Step by Step: A Curriculum Learning Approach to Automated Reasoning with Large Language Models. 15343-15420 - Zhengze Zhang, Shiqi Wang, Yiqun Shen, Simin Guo, Dahua Lin, Xiaoliang Wang, Cam-Tu Nguyen, Fei Tan:
daDPO: Distribution-Aware DPO for Distilling Conversational Abilities. 15421-15437 - Chuanghao Ding, Jiaping Wang, Ziqing Yang, Xiaoliang Wang, Dahua Lin, Nguyen Cam-Tu, Fei Tan:
Consultant Decoding: Yet Another Synergistic Mechanism. 15438-15452 - Liang Lin, Siyuan Chai, Jiahao Wu, Hongbing Hu, Xiaotao Gu, Hao Hu, Fan Zhang, Wei Wang, Dan Zhang:
IntelliCockpitBench: A Comprehensive Benchmark to Evaluate VLMs for Intelligent Cockpit. 15453-15475 - Akram Elbouanani, Evan Dufraisse, Adrian Popescu:
Analyzing Political Bias in LLMs via Target-Oriented Sentiment Classification. 15476-15505 - Maxime Louis, Hervé Déjean, Stéphane Clinchant:
PISCO: Pretty Simple Compression for Retrieval-Augmented Generation. 15506-15521 - Tianshi Ming, Xian Wu, Yingying Zhang, Zichuan Fu, Dawei Cheng:
AnchorCoT: Anchors Pave the Way for Multi-hop Reasoning. 15522-15536 - Zichen Wen, Yifeng Gao, Weijia Li, Conghui He, Linfeng Zhang:
Token Pruning in Multimodal Large Language Models: Are We Solving the Right Problem? 15537-15549 - Zhen Qin, Zhaomin Wu, Bingsheng He, Shuiguang Deng:
Federated Data-Efficient Instruction Tuning for Large Language Models. 15550-15568 - Walter Paci, Alessandro Panunzi, Sandro Pezzelle:
They want to pretend not to understand: The Limits of Current LLMs in Interpreting Implicit Content of Political Discourse. 15569-15593 - Alessio Cocchieri, Marcos Martínez Galindo, Giacomo Frisoni, Gianluca Moro, Claudio Sartori, Giuseppe Tagliavini:
ZeroNER: Fueling Zero-Shot Named Entity Recognition via Entity Type Descriptions. 15594-15616 - Jaewook Lee, Woojin Lee, Oh-Woog Kwon, Harksoo Kim:
Do Large Language Models Have "Emotion Neurons"? Investigating the Existence and Role. 15617-15639 - Qingyuan Liang, Zhao Zhang, Zeyu Sun, Zheng Lin, Qi Luo, Yueyi Xiao, Yizhou Chen, Yuqun Zhang, Haotian Zhang, Lu Zhang, Chenbin Chenbin, Yingfei Xiong:
Grammar-Based Code Representation: Is It a Worthy Pursuit for LLMs? 15640-15653 - Yujie Lin, Ante Wang, Moye Chen, Jingyao Liu, Hao Liu, Jinsong Su, Xinyan Xiao:
Investigating Inference-time Scaling for Chain of Multi-modal Thought: A Preliminary Study. 15654-15667 - Xinyi Liu, Xiaoyi Zhang, Ziyun Zhang, Yan Lu:
UI-E2I-Synth: Advancing GUI Grounding with Large-Scale Instruction Synthesis. 15668-15684 - Jonas Wallat, Abdelrahman Abdallah, Adam Jatowt, Avishek Anand:
A Study into Investigating Temporal Robustness of LLMs. 15685-15705 - Zijing Zhang, Zhanpeng Chen, He Zhu, Ziyang Chen, Nan Du, Xiaolong Li:
ToolExpNet: Optimizing Multi-Tool Selection in LLMs with Similarity and Dependency-Aware Experience Networks. 15706-15722 - I-Fan Lin, Faegheh Hasibi, Suzan Verberne:
SPILL: Domain-Adaptive Intent Clustering based on Selection and Pooling with Large Language Models. 15723-15737 - Rui Li, Heming Xia, Xinfeng Yuan, Qingxiu Dong, Lei Sha, Wenjie Li, Zhifang Sui:
How Far are LLMs from Being Our Digital Twins? A Benchmark for Persona-Based Behavior Chain Simulation. 15738-15763 - Michele Luca Contalbo, Sara Pederzoli, Francesco Del Buono, Venturelli Valeria, Francesco Guerra, Matteo Paganelli:
GRI-QA: a Comprehensive Benchmark for Table Question Answering over Environmental Data. 15764-15779 - Zhiyu Lin, Zhengda Zhou, Zhiyuan Zhao, Tianrui Wan, Yilun Ma, Junyu Gao, Xuelong Li:
WebUIBench: A Comprehensive Benchmark for Evaluating Multimodal Large Language Models in WebUI-to-Code. 15780-15797 - Linjiaen Linjiaen, Jingyu Liu, Yingbo Liu:
Optimizing Multi-Hop Document Retrieval Through Intermediate Representations. 15798-15809 - Patomporn Payoungkhamdee, Pume Tuchinda, Jinheon Baek, Samuel Cahyawijaya, Can Udomcharoenchaikit, Potsawee Manakul, Peerat Limkonchotiwat, Ekapol Chuangsuwanich, Sarana Nutanong:
Towards Better Understanding of Program-of-Thought Reasoning in Cross-Lingual and Multilingual Environments. 15810-15828 - Kseniia Petukhova, Ekaterina Kochmar:
A Fully Automated Pipeline for Conversational Discourse Annotation: Tree Scheme Generation and Labeling with Large Language Models. 15829-15852 - Xiaojing Zhang, Bochen Lyu:
Can Language Models Serve as Analogy Annotators? 15853-15883 - Tianyi Alex Qiu, Fanzhi Zeng, Jiaming Ji, Dong Yan, Kaile Wang, Jiayi Zhou, Yang Han, Josef Dai, Xuehai Pan, Yaodong Yang:
Reward Generalization in RLHF: A Topological Perspective. 15884-15930 - Tianpeng Bu, Minying Zhang, Hongtao Duan, Shurui Li, Lulu Hu, Yu Li:
Enhanced Data Synthesis for LLM through Reasoning Structures Generated by Hierarchical GFlowNet. 15931-15958 - Yanggan Gu, Junzhuo Li, Sirui Huang, Xin Zou, Zhenghua Li, Xuming Hu:
Capturing Nuanced Preferences: Preference-Aligned Distillation for Small Language Models. 15959-15973 - Zihao Li, Xuekong Xu, Ziyao Chen, Lixin Zou, Ethanhjwu Ethanhjwu, Qiang Chen, Chenliang Li:
Token-level Preference Self-Alignment Optimization for Multi-style Outline Controllable Generation. 15974-16007 - Naquee Rizwan, Seid Muhie Yimam, Daryna Dementieva, Florian Skupin, Tim Fischer, Daniil Moskovskiy, Aarushi Ajay Borkar, Robert Geislinger, Punyajoy Saha, Sarthak Roy, Martin Semmann, Alexander Panchenko, Chris Biemann, Animesh Mukherjee:
HatePRISM: Policies, Platforms, and Research Integration. Advancing NLP for Hate Speech Proactive Mitigation. 16008-16022 - Sara Rajaee, Kumar Pratik, Gabriele Cesa, Arash Behboodi:
Local Look-Ahead Guidance via Verifier-in-the-Loop for Automated Theorem Proving. 16023-16040 - Hongzhi Qi, Nan Bai, Jianqiang Li, Wei Zhai, Qing Zhao, Qi Gao, Bing Xiang Yang, Guanghui Fu:
Generalizable Cross-Lingual Cognitive Distortion Detection with Standardized Annotations and Multi-Task Learning. 16041-16051 - Constanza Fierro, Negar Foroutan, Desmond Elliott, Anders Søgaard:
How Do Multilingual Language Models Remember Facts? 16052-16106 - Ting Xu, Zhichao Huang, Jiankai Sun, Shanbo Cheng, Wai Lam:
SeqPO-SiMT: Sequential Policy Optimization for Simultaneous Machine Translation. 16107-16123 - Ayuto Tsutsumi, Yuu Jinnai:
Do Large Language Models Know Folktales? A Case Study of Yokai in Japanese Folktales. 16124-16146 - Hongzhi Luan, Changxin Tian, Zhaoxin Huan, Xiaolu Zhang, Kunlong Chen, Zhiqiang Zhang, Jun Zhou:
BOSE: A Systematic Evaluation Method Optimized for Base Models. 16147-16158 - Zhonghao Sun, Zhiliang Tian, Yiping Song, Yuyi Si, Juhua Zhang, Minlie Huang, Kai Lu, Zeyu Xiong, Xinwang Liu, Dongsheng Li:
DPGA-TextSyn: Differentially Private Genetic Algorithm for Synthetic Text Generation. 16159-16179 - Seungyoon Lee, Seongtae Hong, Hyeonseok Moon, Heuiseok Lim:
Semantic Aware Linear Transfer by Recycling Pre-trained Language Models for Cross-lingual Transfer. 16180-16193 - Kounianhua Du, Hanjing Wang, Jianxing Liu, Jizheng Chen, Xinyi Dai, Yasheng Wang, Ruiming Tang, Yong Yu, Jun Wang, Weinan Zhang:
Boost, Disentangle, and Customize: A Robust System2-to-System1 Pipeline for Code Generation. 16194-16204 - Guozheng Li, Peng Wang, Wenjun Ke, Zijie Xu, Jiajun Liu, Ziyu Shang:
On the Consistency of Commonsense in Large Language Models. 16205-16225 - Ahmed Elshabrawy, Thanh-Nhi Nguyen, Yeeun Kang, Lihan Feng, Annant Jain, Faadil Abdullah Shaikh, Jonibek Mansurov, Mohamed Fazli Mohamed Imam, Jesús-Germán Ortiz-Barajas, Rendi Chevi, Alham Fikri Aji:
Statement-Tuning Enables Efficient Cross-lingual Generalization in Encoder-only Models. 16226-16248 - Jane Arleth dela Cruz, Iris Hendrickx, Martha A. Larson:
Evaluating Large Language Models for Confidence-based Check Set Selection. 16249-16265 - Claudiu Daniel Hromei, Federico Borazio, Andrea Sensi, Elisa Passone, Danilo Croce, Roberto Basili:
Training Multi-Modal LLMs through Dialogue Planning for HRI. 16266-16284 - Fabian David Schmidt, Florian Schneider, Chris Biemann, Goran Glavas:
MVL-SIB: A Massively Multilingual Vision-Language Benchmark for Cross-Modal Topical Matching. 16285-16312 - Yihong Tang, Kehai Chen, Xuefeng Bai, Zheng-Yu Niu, Bo Wang, Jie Liu, Min Zhang:
The Rise of Darkness: Safety-Utility Trade-Offs in Role-Playing Dialogue Agents. 16313-16337 - Xin Zhang, Qiyu Wei, Yingjie Zhu, Linhai Zhang, Deyu Zhou, Sophia Ananiadou:
SynGraph: A Dynamic Graph-LLM Synthesis Framework for Sparse Streaming User Sentiment Modeling. 16338-16356 - Yue Cui, Liuyi Yao, Shuchang Tao, Weijie Shi, Yaliang Li, Bolin Ding, Xiaofang Zhou:
Enhancing Tool Learning in Large Language Models with Hierarchical Error Checklists. 16357-16375 - Khalid N. Elmadani, Nizar Habash, Hanada Taha-Thomure:
A Large and Balanced Corpus for Fine-grained Arabic Readability Assessment. 16376-16400 - Che Liu, Zhongwei Wan, Haozhe Wang, Yinda Chen, Talha Qaiser, Chen Jin, Nikolay Burlutskiy, Fariba Yousefi, Rossella Arcucci:
Can Medical Vision-Language Pre-training Succeed with Purely Synthetic Data? 16401-16421 - Jihao Gu, Yingyao Wang, Pi Bu, Chen Wang, Ziming Wang, Tengtao Song, Donglai Wei, Jiale Yuan, Yingxiu Zhao, Yancheng He, Shilong Li, Jiaheng Liu, Meng Cao, Jun Song, Yingshui Tan, Xiang Li, Wenbo Su, Xiaoyong Zhu, Bo Zheng:
See the World, Discover Knowledge: A Chinese Factuality Evaluation for Large Vision Language Models. 16422-16447 - Che Liu, Zhongwei Wan, Yuqi Wang, Hui Shen, Haozhe Wang, Kangyu Zheng, Mi Zhang, Rossella Arcucci:
Argus: Benchmarking and Enhancing Vision-Language Models for 3D Radiology Report Generation. 16448-16460 - Binquan Ji, Haibo Luo, YifeiLu YifeiLu, Lei Hei, Jiaqi Wang, Tingjing Liao, Lingyu Wang, Shichao Wang, Feiliang Ren:
Resource-Friendly Dynamic Enhancement Chain for Multi-Hop Question Answering. 16461-16479 - Siya Qi, Rui Cao, Yulan He, Zheng Yuan:
Evaluating LLMs' Assessment of Mixed-Context Hallucination Through the Lens of Summarization. 16480-16503 - Xuanli He, Jun Wang, Qiongkai Xu, Pasquale Minervini, Pontus Stenetorp, Benjamin I. P. Rubinstein, Trevor Cohn:
TUBA: Cross-Lingual Transferability of Backdoor Attacks in LLMs with Instruction Tuning. 16504-16544 - Daniela Gottesman, Mor Geva, Dana Ramati:
Eliciting Textual Descriptions from Representations of Continuous Prompts. 16545-16562 - Yuhan Fu, Ruobing Xie, Xingwu Sun, Zhanhui Kang, Xirong Li:
Mitigating Hallucination in Multimodal Large Language Model via Hallucination-targeted Direct Preference Optimization. 16563-16577 - Jiangxu Wu, Cong Wang, Tianhuang Su, Jun Yang, Haozhi Lin, Chao Zhang, Ming Peng, Kai Shi, Songpan Yang, Binqing Pan, Zixian Li:
Review-Instruct: A Review-Driven Multi-Turn Conversations Generation Method for Large Language Models. 16578-16595 - Heydar Soudani, Evangelos Kanoulas, Faegheh Hasibi:
Why Uncertainty Estimation Methods Fall Short in RAG: An Axiomatic Analysis. 16596-16616 - Daniel Russo, Fariba Sadeghi, Stefano Menini, Marco Guerini:
EuroVerdict: A Multilingual Dataset for Verdict Generation Against Misinformation. 16617-16634 - Sona Elza Simon, Soumen Kumar Mondal, Abhishek Singhania, Sayambhu Sen, Preethi Jyothi:
LoFTI: Localization and Factuality Transfer to Indian Locales. 16635-16662 - Jaeyoung Choe, Jihoon Kim, Woohwan Jung:
Hierarchical Retrieval with Evidence Curation for Open-Domain Financial Question Answering on Standardized Documents. 16663-16681 - Costas Mavromatis, George Karypis:
GNN-RAG: Graph Neural Retrieval for Efficient Large Language Model Reasoning on Knowledge Graphs. 16682-16699 - Yajie Vera He, Mohita Chowdhury, Jared Joselowitz, Aisling Higham, Ernest Lim:
ASTRID - An Automated and Scalable TRIaD for the Evaluation of RAG-based Clinical Question Answering Systems. 16700-16716 - Masaki Sakata, Benjamin Heinzerling, Sho Yokoi, Takumi Ito, Kentaro Inui:
On Entity Identification in Language Models. 16717-16741 - Hongchao Gu, Dexun Li, Kuicai Dong, Hao Zhang, Hang Lv, Hao Wang, Defu Lian, Yong Liu, Enhong Chen:
RAPID: Efficient Retrieval-Augmented Long Text Generation with Writing Planning and Information Discovery. 16742-16763 - Abbas Ghaddar, David Alfonso-Hermelo, Philippe Langlais, Boxing Chen, Prasanna Parthasarathi:
CHARPEVAL: Benchmarking Large Language Models' Contextual Reasoning in Knowledge-Grounded Dialogue. 16764-16775 - Mohammad Mahdi Abootorabi, Amirhosein Zobeiri, Mahdi Dehghani, Mohammadali Mohammadkhani, Bardia Mohammadi, Omid Ghahroodi, Mahdieh Soleymani Baghshah, Ehsaneddin Asgari:
Ask in Any Modality: A Comprehensive Survey on Multimodal Retrieval-Augmented Generation. 16776-16809 - Shaowei Zhang, Deyi Xiong:
Debate4MATH: Multi-Agent Debate for Fine-Grained Reasoning in Math. 16810-16824 - Irina Saparina, Mirella Lapata:
Disambiguate First, Parse Later: Generating Interpretations for Ambiguity Resolution in Semantic Parsing. 16825-16839 - Katharina Beckh, Elisa Studeny, Sujan Sai Gannamaneni, Dario Antweiler, Stefan Rüping:
The Anatomy of Evidence: An Investigation Into Explainable ICD Coding. 16840-16851 - Zhibin Lan, Liqiang Niu, Fandong Meng, Wenbo Li, Jie Zhou, Jinsong Su:
AVG-LLaVA: An Efficient Large Multimodal Model with Adaptive Visual Granularity. 16852-16869 - Chenxi Wang, Tianle Gu, Zhongyu Wei, Lang Gao, Zirui Song, Xiuying Chen:
Word Form Matters: LLMs' Semantic Reconstruction under Typoglycemia. 16870-16885 - Andong Chen, Kehai Chen, Yang Xiang, Xuefeng Bai, Muyun Yang, Yang Feng, Tiejun Zhao, Min Zhang:
LLM-based Translation Inference with Iterative Bilingual Understanding. 16886-16902 - Yurong Wu, Fangwen Mu, Qiuhong Zhang, Jinjing Zhao, Xinrun Xu, Lingrui Mei, Yang Wu, Lin Shi, Junjie Wang, Zhiming Ding, Yiwei Wang:
Vulnerability of Text-to-Image Models to Prompt Template Stealing: A Differential Evolution Approach. 16903-16916 - Justin Qiu, Jiacheng Zhu, Ajay Patel, Marianna Apidianaki, Chris Callison-Burch:
mStyleDistance: Multilingual Style Embeddings and their Evaluation. 16917-16931 - Shanbao Qiao, Xuebing Liu, Akshat Gupta, Seung-Hoon Na:
SeqMMR: Sequential Model Merging and LLM Routing for Enhanced Batched Sequential Knowledge Editing. 16932-16947 - Jiaqi Li, Xinyi Dong, Yang Liu, Zhizhuo Yang, Quansen Wang, Xiaobo Wang, Song-Chun Zhu, Zixia Jia, Zilong Zheng:
ReflectEvo: Improving Meta Introspection of Small LLMs by Learning Self-Reflection. 16948-16966 - Shuo Yang, Caren Han, Siwen Luo, Eduard H. Hovy:
MAGIC-VQA: Multimodal And Grounded Inference with Commonsense Knowledge for Visual Question Answering. 16967-16986 - Injae Na, Keonwoong Noh, Woohwan Jung:
Automatic Transmission for LLM Tiers: Optimizing Cost and Accuracy in Large Language Models. 16987-17004 - Yibo Zhong, Jinman Zhao, Yao Zhou:
Low-Rank Interconnected Adaptation across Layers. 17005-17029 - Ionut-Teodor Sorodoc, Leonardo F. R. Ribeiro, Rexhina Blloshmi, Christopher Davis, Adrià de Gispert:
GaRAGe: A Benchmark with Grounding Annotations for RAG Evaluation. 17030-17049 - Yi Li, Yunbin Tu, Liang Li, Li Su, Qingming Huang:
Change Entity-guided Heterogeneous Representation Disentangling for Change Captioning. 17050-17060 - Zhuoran Jin, Hongbang Yuan, Tianyi Men, Pengfei Cao, Yubo Chen, Jiexin Xu, Huaijun Li, Xiaojian Jiang, Kang Liu, Jun Zhao:
RAG-RewardBench: Benchmarking Reward Models in Retrieval Augmented Generation for Preference Alignment. 17061-17090 - Kun Li, Tianhua Zhang, Yunxiang Li, Hongyin Luo, Abdalla Mohamed Salama Sayed Moustafa, Xixin Wu, James R. Glass, Helen M. Meng:
Generate, Discriminate, Evolve: Enhancing Context Faithfulness via Fine-Grained Sentence-Level Self-Evolution. 17091-17105 - Afonso Sousa, Henrique Lopes Cardoso:
PAM: Paraphrase AMR-Centric Evaluation Metric. 17106-17121 - Hongze Mi, Jinyuan Li, Zhangxuying Zhangxuying, Haoran Cheng, Jiahao Wang, Di Sun, Gang Pan:
VP-MEL: Visual Prompts Guided Multimodal Entity Linking. 17122-17137 - Bruno Puri, Aakriti Jain, Elena Golimblevskaia, Patrick Kahardipraja, Thomas Wiegand, Wojciech Samek, Sebastian Lapuschkin:
FADE: Why Bad Descriptions Happen to Good Features. 17138-17160 - Anna Mosolova, Marie Candito, Carlos Ramisch:
In the LLM era, Word Sense Induction remains unsolved. 17161-17178 - Chadi Helwe, Oana Balalau, Davide Ceolin:
Navigating the Political Compass: Evaluating Multilingual LLMs across Languages and Nationalities. 17179-17204 - Wanqi Yang, Yanda Li, Meng Fang, Yunchao Wei, Ling Chen:
Who Can Withstand Chat-Audio Attacks? An Evaluation Benchmark for Large Audio-Language Models. 17205-17220 - Sibo Yi, Tianshuo Cong, Xinlei He, Qi Li, Jiaxing Song:
Beyond the Tip of Efficiency: Uncovering the Submerged Threats of Jailbreak Attacks in Small Language Models. 17221-17234 - Yifei Chen, Ruihui Hou, Jingping Liu, Tong Ruan:
EMRs2CSP: Mining Clinical Status Pathway from Electronic Medical Records. 17235-17251 - Jiaxin Shen, Jinan Xu, Huiqi Hu, Luyi Lin, Guoyang Ma, Fei Zheng, Fandong Meng, Jie Zhou, Wenjuan Han:
A Law Reasoning Benchmark for LLM with Tree-Organized Structures including Factum Probandum, Evidence and Experiences. 17252-17274 - Xi Zhang, Zaiqiao Meng, Jake Lever, Edmond S. L. Ho:
Libra: Leveraging Temporal Images for Biomedical Radiology Analysis. 17275-17303 - Aditya Tomar, V. Rudra Murthy, Pushpak Bhattacharyya:
Stereotype Detection as a Catalyst for Enhanced Bias Detection: A Multi-Task Learning Approach. 17304-17317 - Omar Momen, Manuel Schaaf, Alexander Mehler:
Filling the Temporal Void: Recovering Missing Publication Years in the Project Gutenberg Corpus Using LLMs. 17318-17334 - Martina Miliani, Serena Auriemma, Alessandro Bondielli, Emmanuele Chersoni, Lucia C. Passaro, Irene Sucameli, Alessandro Lenci:
ExpliCa: Evaluating Explicit Causal Reasoning in Large Language Models. 17335-17355 - Leila Moudjari, Farah Benamara:
Are Dialects Better Prompters? A Case Study on Arabic Subjective Text Classification. 17356-17371 - Jihao Shi, Xiao Ding, Kai Xiong, Hengwei Zhao, Bing Qin, Ting Liu:
Natural Logic at the Core: Dynamic Rewards for Entailment Tree Generation. 17372-17382 - Wenlong Meng, Guo Zhenyuan, Lenan Wu, Chen Gong, Wenyan Liu, Weixian Li, Chengkun Wei, Wenzhi Chen:
R.R.: Unveiling LLM Training Privacy through Recollection and Ranking. 17383-17397 - Shuhan Guo, Nan Yin, James Kwok, Quanming Yao:
Nested-Refinement Metamorphosis: Reflective Evolution for Efficient Optimization of Networking Problems. 17398-17429 - Junzhe Zhang, Huixuan Zhang, Xunjian Yin, Baizhou Huang, Xu Zhang, Xinyu Hu, Xiaojun Wan:
MC-MKE: A Fine-Grained Multimodal Knowledge Editing Benchmark Emphasizing Modality Consistency. 17430-17445 - Alessio Galatolo, Zhenbang Dai, Katie Winkle, Meriem Beloucif:
Visualising Policy-Reward Interplay to Inform Zeroth-Order Preference Optimisation of Large Language Models. 17446-17461 - Elisa Sanchez-Bayona, Rodrigo Agerri:
Metaphor and Large Language Models: When Surface Features Matter More than Deep Understanding. 17462-17477 - Dayeon Ki, Kevin Duh, Marine Carpuat:
AskQE: Question Answering as Automatic Evaluation for Machine Translation. 17478-17515 - Alireza Salemi, Julian Killingback, Hamed Zamani:
ExPerT: Effective and Explainable Evaluation of Personalized Long-Form Text Generation. 17516-17532 - Yujie Zhang, Weikang Yuan, Zhuoren Jiang:
Bridging Intuitive Associations and Deliberate Recall: Empowering LLM Personal Assistant with Graph-Structured Long-term Memory. 17533-17547 - Huachi Zhou, Jiahe Du, Chuang Zhou, Chang Yang, Yilin Xiao, Yuxuan Xie, Xiao Huang:
Each graph is a new language: Graph Learning with LLMs. 17548-17559 - Van Yang, Hongye Jin, Shaochen Zhong, Song Jiang, Qifan Wang, Vipin Chaudhary, Xiaotian Han:
100-LongBench: Are de facto Long-Context Benchmarks Literally Evaluating Long-Context Ability? 17560-17576 - Hai Yu, Chong Deng, Qinglin Zhang, Jiaqing Liu, Qian Chen, Wen Wang:
Multimodal Fusion and Coherence Modeling for Video Topic Segmentation. 17577-17593 - Junnan Liu, Hongwei Liu, Linchen Xiao, Ziyi Wang, Kuikun Liu, Songyang Gao, Wenwei Zhang, Songyang Zhang, Kai Chen:
Are Your LLMs Capable of Stable Reasoning? 17594-17632 - He Zhu, Yifan Ding, Yicheng Tao, Zhiwen Ruan, Yixia Li, Wenjia Zhang, Yun Chen, Guanhua Chen:
FANNO: Augmenting High-Quality Instruction Data with Open-Sourced LLMs Only. 17633-17653 - William Xia, Ishita Unde, Brian David Ondov, Dina Demner-Fushman:
JEBS: A Fine-grained Biomedical Lexical Simplification Task. 17654-17666 - Simon Welz, Lucie Flek, Akbar Karimi:
Multi-Hop Reasoning for Question Answering with Hyperbolic Representations. 17667-17679 - Yunsoo Kim, Jinge Wu, Su Hwan Kim, Pardeep Vasudev, Jiashu Shen, Honghan Wu:
Look & Mark: Leveraging Radiologist Eye Fixations and Bounding boxes in Multimodal Large Language Models for Chest X-ray Report Generation. 17680-17694 - He Zhu, Zhiwen Ruan, Junyou Su, Xingwei He, Yun Chen, Wenjia Zhang, Guanhua Chen:
Tag-Instruct: Controlled Instruction Complexity Enhancement through Structure-based Augmentation. 17708-17729 - Tengfei Wen, Xuanang Chen, Ben He, Le Sun:
Code-SPA: Style Preference Alignment to Large Language Models for Effective and Robust Code Debugging. 17730-17743 - Xinhao Tan, Songhua Liu, Xia Cong, Kunjun Li, Xinchao Wang:
Open-World Authorship Attribution. 17744-17758 - Sahil Manchanda, Pannaga Shivaswamy:
What is in a name? Mitigating Name Bias in Text Embedding Similarity via Anonymization. 17759-17781 - Kawsar Ahmed, Md Osama, Omar Sharif, Eftekhar Hossain, Mohammed Moshiul Hoque:
BenNumEval: A Benchmark to Assess LLMs' Numerical Reasoning Capabilities in Bengali. 17782-17799 - Harsh Jhamtani, Jacob Andreas, Benjamin Van Durme:
LLM Agents for Coordinating Multi-User Information Gathering. 17800-17826 - Xiao Chen, Changyi Ma, Wenqi Fan, Zhaoxiang Zhang, Qing Li:
C2KD: Cross-layer and Cross-head Knowledge Distillation for Small Language Model-based Recommendation. 17827-17838 - Yao Wan, Yang Wu, Zhen Li, Guobiao Zhang, Hongyu Zhang, Zhou Zhao, Hai Jin, April Wang:
Sign2Vis: Automated Data Visualization from Sign Language. 17839-17857 - Jiajun Shen, Tong Zhou, Yubo Chen, Delai Qiu, Shengping Liu, Kang Liu, Jun Zhao:
Transparentize the Internal and External Knowledge Utilization in LLMs with Trustworthy Citation. 17858-17877 - Muyao Li, Zihao Wang, Kaichen He, Xiaojian Ma, Yitao Liang:
JARVIS-VLA: Post-Training Large-Scale Vision Language Models to Play Visual Games with Keyboards and Mouse. 17878-17899 - Linli Yao, Haoning Wu, Kun Ouyang, Yuanxing Zhang, Caiming Xiong, Bei Chen, Xu Sun, Junnan Li:
Generative Frame Sampler for Long Video Understanding. 17900-17917 - Davide Bassi, Dimitar Iliyanov Dimitrov, Bernardo D'Auria, Firoj Alam, Maram Hasanain, Christian Moro, Luisa Orrù, Gian Piero Turchi, Preslav Nakov, Giovanni Da San Martino:
Annotating the Annotators: Analysis, Insights and Modelling from an Annotation Campaign on Persuasion Techniques Detection. 17918-17929 - Suhas Kamasetty Ramesh, Ayan Sengupta, Tanmoy Chakraborty:
On the Generalization vs Fidelity Paradox in Knowledge Distillation. 17930-17951 - Iqra Zahid, Youcheng Sun, Riza Batista-Navarro:
BEDAA: Bayesian Enhanced DeBERTa for Uncertainty-Aware Authorship Attribution. 17952-17966 - Tom Calamai, Oana Balalau, Fabian M. Suchanek:
Benchmarking the Benchmarks: Reproducing Climate-Related NLP Tasks. 17967-18009 - Matthew Shardlow, Ashley Williams, Charlie Roadhouse, Filippos Ventirozos, Piotr Przybyla:
Exploring Supervised Approaches to the Detection of Anthropomorphic Language in the Reporting of NLP Venues. 18010-18022 - Zheng Zhao, Clara Vania, Subhradeep Kayal, Naila Khan, Shay B. Cohen, Emine Yilmaz:
PersonaLens: A Benchmark for Personalization Evaluation in Conversational AI Assistants. 18023-18055 - Wujiang Xu, Yunxiao Shi, Zujie Liang, Xuying Ning, Kai Mei, Kun Wang, Xi Zhu, Min Xu, Yongfeng Zhang:
iAgent: LLM Agent as a Shield between User and Recommender Systems. 18056-18084 - Kushan Mitra, Dan Zhang, Sajjadur Rahman, Estevam Hruschka:
FactLens: Benchmarking Fine-Grained Fact Verification. 18085-18096 - Shimao Zhang, Xiao Liu, Xin Zhang, Junxiao Liu, Zheheng Luo, Shujian Huang, Yeyun Gong:
Process-based Self-Rewarding Language Models. 18097-18110 - Benedikt Ebing, Goran Glavas:
The Devil Is in the Word Alignment Details: On Translation-Based Cross-Lingual Transfer for Token Classification Tasks. 18111-18128 - Zitao Xuan, Xiaofeng Mao, Da Chen, Xin Zhang, Yuhan Dong, Jun Zhou:
ShieldHead: Decoding-time Safeguard for Large Language Models. 18129-18143 - Shuliang Liu, Hongyi Liu, Aiwei Liu, Bingchen Duan, Zheng Qi, Yibo Yan, He Geng, Peijie Jiang, Jia Liu, Xuming Hu:
A Survey on Proactive Defense Strategies Against Misinformation in Large Language Models. 18144-18155 - Alexey Tikhonov, Sergei Shteiner, Anna Bykova, Ivan P. Yamshchikov:
Smotrom tvoja på ander drogoj verden! Resurrecting Dead Pidgin with Generative Models: Russenorsk Case Study. 18156-18166 - Xueliang Zhao, Wei Wu, Jian Guan, Lingpeng Kong:
PromptCoT: Synthesizing Olympiad-level Problems for Mathematical Reasoning in Large Language Models. 18167-18188 - Szymon Kobus, Deniz Gündüz:
Speculative Sampling via Exponential Races. 18189-18204 - Jorge Iranzo-Sánchez, Javier Iranzo-Sánchez, Adrià Giménez, Jorge Civera:
Going Beyond Your Expectations in Latency Metrics for Simultaneous Speech Translation. 18205-18228 - Chaoran Chen, Bingsheng Yao, Ruishi Zou, Wenyue Hua, Weimin Lyu, Toby Jia-Jun Li, Dakuo Wang:
Towards a Design Guideline for RPA Evaluation: A Survey of Large Language Model-Based Role-Playing Agents. 18229-18268 - Philipp Christmann, Gerhard Weikum:
Recursive Question Understanding for Complex Question Answering over Heterogeneous Personal Data. 18269-18288 - Steven Koniaev, Ori Ernst, Jackie CK Cheung:
PreSumm: Predicting Summarization Performance Without Summarizing. 18289-18305 - Yongjia Lei, Haoyu Han, Ryan A. Rossi, Franck Dernoncourt, Nedim Lipka, Mahantesh M. Halappanavar, Jiliang Tang, Yu Wang:
Mixture of Structural-and-Textual Retrieval over Text-rich Graph Knowledge Bases. 18306-18321 - Denitsa Saynova, Lovisa Hagström, Moa Johansson, Richard Johansson, Marco Kuhlmann:
Fact Recall, Heuristics or Pure Guesswork? Precise Interpretations of Language Models for Fact Completion. 18322-18349 - Ke Yi, Jianwei Zhang, Zhiying Xu, Xinlong Yang, Yang Zhou, Minmin Sun, Zengke Liu, Tong Zhang, Junyang Lin, Jingren Zhou:
FPE2M2: Approaching Lossless and Efficient Quantization with Native Floating Point. 18350-18361 - Tong Zheng, Yan Wen, Huiwen Bao, Junfeng Guo, Heng Huang:
Asymmetric Conflict and Synergy in Post-training for LLM-based Multilingual Machine Translation. 18362-18383 - Zhaoyang Xia, Somdeb Sarkhel, Md. Mehrab Tanjim, Stefano Petrangeli, Ishita Dasgupta, Yuxiao Chen, Jinxuan Xu, Di Liu, Saayan Mitra, Dimitris N. Metaxas:
VISIAR: Empower MLLM for Visual Story Ideation. 18384-18402 - Ding Yu, Zhuo Liu, Hangfeng He:
Same Company, Same Signal: The Role of Identity in Earnings Call Transcripts. 18403-18422 - Emma Harvey, Emily Sheng, Su Lin Blodgett, Alexandra Chouldechova, Jean Garcia-Gathright, Alexandra Olteanu, Hanna M. Wallach:
Understanding and Meeting Practitioner Needs When Measuring Representational Harms Caused by LLM-Based Systems. 18423-18440 - Angana Borah, Marwa Houalla, Rada Mihalcea:
Mind the (Belief) Gap: Group Identity in the World of LLMs. 18441-18463 - Jie Ren, Zhenwei Dai, Xianfeng Tang, Hui Liu, Jingying Zeng, Zhen Li, Rahul Goutam, Suhang Wang, Yue Xing, Qi He:
A General Framework to Enhance Fine-tuning-based LLM Unlearning. 18464-18476 - Francesco Maria Molfese, Luca Moroni, Luca Gioffrè, Alessandro Scirè, Simone Conia, Roberto Navigli:
Right Answer, Wrong Score: Uncovering the Inconsistencies of LLM Evaluation in Multiple-Choice Question Answering. 18477-18494 - Adil Soubki, Owen Rambow:
Machine Theory of Mind Needs Machine Validation. 18495-18505 - Akshat Sharma, Hangliang Ding, Jianping Li, Neel Dani, Minjia Zhang:
MiniKV: Pushing the Limits of 2-Bit KV Cache via Compression and System Co-Design for Efficient Long Context Inference. 18506-18523 - Ming Cheng, Jiaying Gong, Hoda Eldardiry:
Sci-LoRA: Mixture of Scientific LoRAs for Cross-Domain Lay Paraphrasing. 18524-18541 - Antonia Karamolegkou, Oliver Eberle, Phillip Rust, Carina Kauf, Anders Søgaard:
Trick or Neat: Adversarial Ambiguity and Language Model Evaluation. 18542-18561 - Kshitish Ghate, Tessa Charlesworth, Mona T. Diab, Aylin Caliskan:
Biases Propagate in Encoder-based Vision-Language Models: A Systematic Analysis From Intrinsic Measures to Zero-shot Retrieval Outcomes. 18562-18580 - Yingqian Cui, Pengfei He, Jingying Zeng, Hui Liu, Xianfeng Tang, Zhenwei Dai, Yan Han, Chen Luo, Jing Huang, Zhen Li, Suhang Wang, Yue Xing, Jiliang Tang, Qi He:
Stepwise Perplexity-Guided Refinement for Efficient Chain-of-Thought Reasoning in Large Language Models. 18581-18597 - Yilun Zhao, Chengye Wang, Chuhan Li, Arman Cohan:
Can Multimodal Foundation Models Understand Schematic Diagrams? An Empirical Study on Information-Seeking QA over Scientific Papers. 18598-18631 - Kaustubh Deshpande, Ved Sirdeshmukh, Johannes Baptist Mols, Lifeng Jin, Ed-Yeremai Hernandez-Cardona, Dean Lee, Jeremy Kritz, Willow E. Primack, Summer Yue, Chen Xing:
MultiChallenge: A Realistic Multi-Turn Conversation Evaluation Benchmark Challenging to Frontier LLMs. 18632-18702 - Jaydeep Borkar, Matthew Jagielski, Katherine Lee, Niloofar Mireshghallah, David A. Smith, Christopher A. Choquette-Choo:
Privacy Ripple Effects from Adding or Removing Personal Information in Language Model Training. 18703-18726 - Yuyou Zhang, Miao Li, William Han, Yihang Yao, Zhepeng Cen, Ding Zhao:
Safety is Not Only About Refusal: Reasoning-Enhanced Fine-tuning for Interpretable LLM Safety. 18727-18746 - Jaïr A Waal, Giovanni Cassani:
Is a cute puyfred cute? Context-dependent form-meaning systematicity in LLMs. 18747-18769 - Haris Riaz, Sourav Sanjukta Bhabesh, Vinayak Arannil, Miguel Ballesteros, Graham Horwood:
MetaSynth: Meta-Prompting-Driven Agentic Scaffolds for Diverse Synthetic Data Generation. 18770-18803 - Amit Agarwal, Srikant Panda, Angeline Charles, Hitesh Laxmichand Patel, Bhargava Kumar, Priyaranjan Pattnayak, Taki Hasan Rafi, Tejaswini Kumar, Hansa Meghwani, Karan Gupta, Dong-Kyu Chae:
MVTamperBench: Evaluating Robustness of Vision-Language Models. 18804-18828 - Qianqi Yan, Yue Fan, Hongquan Li, Shan Jiang, Yang Zhao, Xinze Guan, Ching-Chen Kuo, Xin Eric Wang:
Multimodal Inconsistency Reasoning (MMIR): A New Benchmark for Multimodal Reasoning Models. 18829-18845 - Iñigo Alonso, Gorka Azkune, Ander Salaberria, Jeremy Barnes, Oier Lopez de Lacalle:
Vision-Language Models Struggle to Align Entities across Modalities. 18846-18862 - Lucky Susanto, Musa Izzanardi Wijanarko, Prasetia Anugrah Pratama, Zilu Tang, Fariz Akyas, Traci Hong, Ika Karlina Idris, Alham Fikri Aji, Derry Tanti Wijaya:
A Multi-Labeled Dataset for Indonesian Discourse: Examining Toxicity, Polarization, and Demographics Information. 18863-18890 - Xiao Wang, Mengjue Tan, Qiao Jin, Guangzhi Xiong, Yu Hu, Aidong Zhang, Zhiyong Lu, Minjia Zhang:
MedCite: Can Language Models Generate Verifiable Text for Medicine? 18891-18913 - Sadaf Md. Halim, Chen Zhao, Xintao Wu, Latifur Khan, Christan Grant, Fariha Ishrat Rahman, Feng Chen:
Let The Jury Decide: Fair Demonstration Selection for In-Context Learning through Incremental Greedy Evaluation. 18914-18931 - Portia Cooper, Eduardo Blanco, Mihai Surdeanu:
The Lies Characters Tell: Utilizing Large Language Models to Normalize Adversarial Unicode Perturbations. 18932-18944 - Ahmad Aljanaideh:
Speech Act Patterns for Improving Generalizability of Explainable Politeness Detection Models. 18945-18954 - Khushboo Singh, Vasudha Varadarajan, Adithya V. Ganesan, August Håkan Nilsson, Nikita Soni, Syeda Mahwish, Pranav Chitale, Ryan L. Boyd, Lyle H. Ungar, Richard N. Rosenthal, H. Andrew Schwartz:
Systematic Evaluation of Auto-Encoding and Large Language Model Representations for Capturing Author States and Traits. 18955-18973 - Yubin Ge, Salvatore Romeo, Jason Cai, Raphael Shu, Yassine Benajiba, Monica Sunkara, Yi Zhang:
TReMu: Towards Neuro-Symbolic Temporal Reasoning for LLM-Agents with Memory in Multi-Session Dialogues. 18974-18988 - Toyin Aguda, Erik Wilson, Allan Anzagira, Simerjot Kaur, Charese Smiley:
Conservative Bias in Large Language Models: Measuring Relation Predictions. 18989-18998 - Taeyoun Kim, Jacob Mitchell Springer, Aditi Raghunathan, Maarten Sap:
Mitigating Bias in RAG: Controlling the Embedder. 18999-19024 - Zongyu Lin, Zhikun Xu, Xiaohan Song, Yixin Wan, Xingcheng Yao, Tsung-Han Lin, Selina Song, Pranav Subbaraman, Ben Zhou, Kai-Wei Chang, Yizhou Sun:
V-ALPHASOCIAL: Benchmark and Self-Reflective Chain-of-Thought Generation for Visual Social Commonsense Reasoning. 19025-19047 - Jessica Ojo, Odunayo Ogundepo, Akintunde Oladipo, Kelechi Ogueji, Jimmy Lin, Pontus Stenetorp, David Ifeoluwa Adelani:
AfroBench: How Good are Large Language Models on African Languages? 19048-19095 - Skyler Seto, Maartje ter Hoeve, Richard He Bai, Natalie Schluter, David Grangier:
Training Bilingual LMs with Data Constraints in the Targeted Language. 19096-19122 - Ahmed Masry, Mohammed Saidul Islam, Mahir Ahmed, Aayush Bajaj, Firoz Kabir, Aaryaman Kartha, Md. Tahmid Rahman Laskar, Mizanur Rahman, Shadikur Rahman, Mehrad Shahmohammadi, Megh Thakkar, Md. Rizwan Parvez, Enamul Hoque, Shafiq Joty:
ChartQAPro: A More Diverse and Challenging Benchmark for Chart Question Answering. 19123-19151 - Shenshen Li, Wenxin Meng, Lei Wang, Hao Yang, Chong Peng, Peng Yan, Fumin Shen, Jingkuan Song, Heng Tao Shen, Xing Xu:
From Observation to Understanding: Front-Door Adjustments with Uncertainty Calibration for Enhancing Egocentric Reasoning in LVLMs. 19152-19169 - Yejun Yoon, Jaeyoon Jung, Seunghyun Yoon, Kunwoo Park:
Hypothetical Documents or Knowledge Leakage? Rethinking LLM-based Query Expansion. 19170-19187 - Qianqi Yan, Xuehai He, Xiang Yue, Xin Eric Wang:
Worse than Random? An Embarrassingly Simple Probing Evaluation of Large Multimodal Models in Medical VQA. 19188-19205 - Bohan Zhai, Canwen Xu, Yuxiong He, Zhewei Yao:
Optimizing Reasoning for Text-to-SQL with Execution Feedback. 19206-19218 - Wenyue Hua, Kaijie Zhu, Lingyao Li, Lizhou Fan, Mingyu Jin, Shuhang Lin, Haochen Xue, Zelong Li, Jindong Wang, Yongfeng Zhang:
Disentangling Logic: The Role of Context in Large Language Model Reasoning Capabilities. 19219-19242 - Shuqi Liu, Han Wu, Bowei He, Xiongwei Han, Mingxuan Yuan, Linqi Song:
Sens-Merging: Sensitivity-Guided Parameter Balancing for Merging Large Language Models. 19243-19255 - MohammadHossein Rezaei, Yicheng Fu, Phil Cuvin, Caleb Ziems, Yanzhe Zhang, Hao Zhu, Diyi Yang:
EgoNormia: Benchmarking Physical-Social Norm Understanding. 19256-19283 - Linyang He, Ercong Nie, Helmut Schmid, Hinrich Schütze, Nima Mesgarani, Jonathan Brennan:
Large Language Models as Neurolinguistic Subjects: Discrepancy between Performance and Competence. 19284-19302 - Mingmeng Geng, Caixi Chen, Yanru Wu, Yao Wan, Pan Zhou, Dongping Chen:
The Impact of Large Language Models in Academia: from Writing to Speaking. 19303-19319 - Peng Wang, Ruihan Tao, Qiguang Chen, Mengkang Hu, Libo Qin:
X-WebAgentBench: A Multilingual Interactive Web Benchmark for Evaluating Global Agentic System. 19320-19335 - Haoran Tan, Zeyu Zhang, Chen Ma, Xu Chen, Quanyu Dai, Zhenhua Dong:
MemBench: Towards More Comprehensive Evaluation on the Memory of LLM-based Agents. 19336-19352 - Ryota Miyano, Yuki Arase:
Adaptive LoRA Merge with Parameter Pruning for Low-Resource Generation. 19353-19366 - Longyun Wu, Dawei Zhu, Guangxiang Zhao, Zhuocheng Yu, Junfeng Ran, Xiangyu Wong, Lin Sun, Sujian Li:
LongAttn: Selecting Long-context Training Data via Token-level Attention. 19367-19380 - Sai P. Vallurupalli, Francis Ferraro:
CoRE: Condition-based Reasoning for Identifying Outcome Variance in Complex Events. 19381-19401 - Jihyuk Kim, Sungjin Lee, Seung-won Hwang, Yang Liu:
FaVe: Factored and Verified Search Rationale for Long-form Answer. 19402-19416 - SongTang SongTang, Kaiyong Zhao, Lei Wang, Yuliang Li, Xuebo Liu, Junyi Zou, Qiang Wang, Xiaowen Chu:
UnrealLLM: Towards Highly Controllable and Interactable 3D Scene Generation by LLM-powered Procedural Content Generation. 19417-19435 - Jihyuk Kim, Shubham Garg, Lahari Poddar, Seung-won Hwang, Chris Hench:
Tree-of-Prompts: Abstracting Control-Flow for Prompt Optimization. 19436-19459 - Pengxiang Li, Lu Yin, Xiaowei Gao, Shiwei Liu:
Outlier-weighed Layerwise Sampling for LLM Fine-tuning. 19460-19473 - Chaoyi Jiang, Lei Gao, Hossein Entezari Zarch, Murali Annavaram:
KVPR: Efficient LLM Inference with I/O-Aware KV Cache Partial Recomputation. 19474-19488 - Hongming Yang, Shi Lin, Jun Shao, Changting Lin, Donghai Zhu, Meng Han, Qinglei Kong:
Direct Behavior Optimization: Unlocking the Potential of Lightweight LLMs. 19489-19515 - Bo Lv, Nayu Liu, Yang Shen, Xin Liu, Ping Luo, Yue Yu:
Whether LLMs Know If They Know: Identifying Knowledge Boundaries via Debiased Historical In-Context Learning. 19516-19528 - Yunhao Wei, Kai Shuang, Zhiyi Li, Chenrui Mao:
How do LLMs' Preferences Affect Event Argument Extraction? CAT: Addressing Preference Traps in Unsupervised EAE. 19529-19543 - Xiangwei Lv, Mengze Li, Jingyuan Chen, Zhiang Dong, Sirui Han, Beishui Liao:
Out-of-Distribution Detection via LLM-Guided Outlier Generation for Text-attributed Graph. 19544-19555 - Fu Zhang, Yi Yan, Jingwei Cheng:
Document-Level Relation Extraction with Global Relations and Entity Pair Reasoning. 19556-19567 - Yubo Ma, Jinsong Li, Yuhang Zang, Xiaobao Wu, Xiaoyi Dong, Pan Zhang, Yuhang Cao, Haodong Duan, Jiaqi Wang, Yixin Cao, Aixin Sun:
Towards Storage-Efficient Visual Document Retrieval: An Empirical Study on Reducing Patch-Level Embeddings. 19568-19580 - Qingyu Ren, Jie Zeng, Qianyu He, Jiaqing Liang, Yanghua Xiao, Weikang Zhou, Zeye Sun, Fei Yu:
Step-by-Step Mastery: Enhancing Soft Constraint Following Ability of Large Language Models. 19581-19596 - Hwiyeol Jo, Hyunwoo Lee, Kang Min Yoo, Taiwoo Park:
ZeroDL: Zero-shot Distribution Learning for Text Clustering via Large Language Models. 19597-19607 - Chunyang Li, Weiqi Wang, Tianshi Zheng, Yangqiu Song:
Patterns Over Principles: The Fragility of Inductive Reasoning in LLMs under Noisy Observations. 19608-19626 - Haiqi Zhang, Zhengyuan Zhu, Zeyu Zhang, Chengkai Li:
LLMTaxo: Leveraging Large Language Models for Constructing Taxonomy of Factual Claims from Social Media. 19627-19641 - Haibo Sun, Jayeol Chun, Nianwen Xue:
AnCast++: Document-Level Evaluation of Graph-based Meaning Representations. 19642-19654 - Run Luo, Haonan Zhang, Longze Chen, Ting-En Lin, Xiong Liu, Yuchuan Wu, Min Yang, Yongbin Li, Minzheng Wang, Pengpeng Zeng, Lianli Gao, Heng Tao Shen, Yunshui Li, Hamid Alinejad-Rokny, Xiaobo Xia, Jingkuan Song, Fei Huang:
MMEvol: Empowering Multimodal Large Language Models with Evol-Instruct. 19655-19682 - Ziyu Guo, Renrui Zhang, Hao Chen, Jialin Gao, Dongzhi Jiang, Jiaze Wang, Pheng-Ann Heng:
SciVerse: Unveiling the Knowledge Comprehension and Visual Reasoning of LMMs on Multi-modal Scientific Problems. 19683-19704 - Matthew King-Hang Ma, Chenwei Xie, Wenbo Wang, William Shi-Yuan Wang:
Exploring Layer-wise Representations of English and Chinese Homonymy in Pre-trained Language Models. 19705-19724 - Li Zeng, Zeming Liu, Chong Feng, Heyan Huang, Yuhang Guo:
DocMEdit: Towards Document-Level Model Editing. 19725-19743 - Yifan Lu, Jing Li, Yigeng Zhou, Yihui Zhang, Wenya Wang, Xiucheng Li, Meishan Zhang, Fangming Liu, Jun Yu, Min Zhang:
Adaptive Detoxification: Safeguarding General Capabilities of LLMs through Toxicity-Aware Knowledge Editing. 19744-19758 - Zixi Jia, Qinghua Liu, Hexiao Li, Yuyan Chen, Jiqiang Liu:
Evaluating the Long-Term Memory of Large Language Models. 19759-19777 - Russell Scheinberg, Ameeta Agrawal, Amber Shore, So Young Lee:
Explain-then-Process: Using Grammar Prompting to Enhance Grammatical Acceptability Judgments. 19778-19795 - Sirui Hong, Yizhang Lin, Bang Liu, Bangbang Liu, Binhao Wu, Ceyao Zhang, Danyang Li, Jiaqi Chen, Jiayi Zhang, Jinlin Wang, Li Zhang, Lingyao Zhang, Min Yang, Mingchen Zhuge, Taicheng Guo, Tuo Zhou, Wei Tao, Robert Tang, Xiangtao Lu, Xiawu Zheng, Xinbing Liang, Yaying Fei, Yuheng Cheng, Yongxin Ni, Zhibin Gou, Zongze Xu, Yuyu Luo, Chenglin Wu:
Data Interpreter: An LLM Agent for Data Science. 19796-19821 - Milan Gritta, Huiyin Xue, Gerasimos Lampouras:
DReSD: Dense Retrieval for Speculative Decoding. 19822-19832 - Zhengping Jiang, Jingyu Zhang, Nathaniel Weir, Seth Ebner, Miriam Wanner, Kate Sanders, Daniel Khashabi, Anqi Liu, Benjamin Van Durme:
Core: Robust Factual Precision with Informative Sub-Claim Identification. 19833-19856 - Feng Luo, Rui Yang, Hao Sun, Chunyuan Deng, Jiarui Yao, Jingyan Shen, Huan Zhang, Hanjie Chen:
Rethinking Diverse Human Preference Learning through Principal Component Analysis. 19857-19870 - Zhongtao Miao, Qiyu Wu, Masaaki Nagata, Yoshimasa Tsuruoka:
Improving Word Alignment Using Semi-Supervised Learning. 19871-19888 - Yixin Ou, Yunzhi Yao, Ningyu Zhang, Hui Jin, Jiacheng Sun, Shumin Deng, Zhenguo Li, Huajun Chen:
How Do LLMs Acquire New Knowledge? A Knowledge Circuits Perspective on Continual Pre-Training. 19889-19913 - Atharv Kulkarni, Kushagra Dixit, Vivek Srikumar, Dan Roth, Vivek Gupta:
LLM-Symbolic Integration for Robust Temporal Tabular Reasoning. 19914-19940 - Pei Fu, Tongkun Guan, Zining Wang, Zhentao Guo, Chen Duan, Hao Sun, Boming Chen, Qianyi Jiang, Jiayao Ma, Kai Zhou, Junfeng Luo:
Multimodal Large Language Models for Text-rich Image Understanding: A Comprehensive Review. 19941-19958 - Xiaohu Huang, Hao Zhou, Kai Han:
PruneVid: Visual Token Pruning for Efficient Video Large Language Models. 19959-19973 - Eshaan Agarwal, Raghav Magazine, Joykirat Singh, Vivek Dani, Tanuja Ganu, Akshay Uttama Nambi:
PromptWizard: Optimizing Prompts via Task-Aware, Feedback-Driven Self-Evolution. 19974-20003 - Haoyang Li, Xuejia Chen, Zhanchao Xu, Darian Li, Nicole Hu, Fei Teng, Yiming Li, Luyu Qiu, Chen Jason Zhang, Qing Li, Lei Chen:
Exposing Numeracy Gaps: A Benchmark to Evaluate Fundamental Numerical Abilities in Large Language Models. 20004-20026 - Liancheng Fang, Aiwei Liu, Hengrui Zhang, Henry Peng Zou, Weizhi Zhang, Philip S. Yu:
TABGEN-ICL: Residual-Aware In-Context Example Selection for Tabular Data Generation. 20027-20041 - Chengyi Ju, Weijie Shi, Chengzhong Liu, Jiaming Ji, Jipeng Zhang, Ruiyuan Zhang, Jiajie Xu, Yaodong Yang, Sirui Han, Yike Guo:
Benchmarking Multi-National Value Alignment for Large Language Models. 20042-20058 - Xixian Yong, Jianxun Lian, Xiaoyuan Yi, Xiao Zhou, Xing Xie:
MotiveBench: How Far Are We From Human-Like Motivational Reasoning in Large Language Models? 20059-20089 - Amir Taubenfeld, Tom Sheffer, Eran Ofek, Amir Feder, Ariel Goldstein, Zorik Gekhman, Gal Yona:
Confidence Improves Self-Consistency in LLMs. 20090-20111 - Zhi Rui Tam, Cheng-Kuang Wu, Chieh-Yen Lin, Yun-Nung Chen:
None of the Above, Less of the Right: Parallel Patterns in Human and LLM Performance on Multi-Choice Questions Answering. 20112-20134 - Jon Z. Cai, Brendan King, Peyton Cameron, Susan Windisch Brown, Miriam Eckert, Dananjay Srinivas, George Arthur Baker, V. Kate Everson, Martha Palmer, James H. Martin, Jeffrey Flanigan:
In Search of the Lost Arch in Dialogue: A Dependency Dialogue Acts Corpus for Multi-Party Dialogues. 20135-20149 - Xinzhe Zheng, Sijie Ji, Jiawei Sun, Renqi Chen, Wei Gao, Mani Srivastava:
ProMind-LLM: Proactive Mental Health Care via Causal Reasoning with Sensor Data. 20150-20171 - Dongyoung Kim, Jinsung Yoon, Jinwoo Shin, Jaehyung Kim:
Debiasing Online Preference Learning via Preference Feature Preservation. 20172-20191 - Xin Men, Mingyu Xu, Qingyu Zhang, Qianhao Yuan, Bingning Wang, Hongyu Lin, Yaojie Lu, Xianpei Han, Weipeng Chen:
ShortGPT: Layers in Large Language Models are More Redundant Than You Expect. 20192-20204 - Kaiyuan Liu, Youcheng Pan, Yang Xiang, Daojing He, Jing Li, Yexing Du, Tianrun Gao:
ProjectEval: A Benchmark for Programming Agents Automated Evaluation on Project-Level Code Generation. 20205-20221 - Zhiyuan Fan, Yumeng Wang, Sandeep Polisetty, Yi R. Fung:
Unveiling the Lack of LVLM Robustness to Fundamental Visual Variations: Why and Path Forward. 20222-20242 - Juhua Zhang, Zhiliang Tian, Minghang Zhu, Yiping Song, Taishu Sheng, Siyi Yang, Qiunan Du, Xinwang Liu, Minlie Huang, Dongsheng Li:
DYNTEXT: Semantic-Aware Dynamic Text Sanitization for Privacy-Preserving LLM Inference. 20243-20255 - Fei Zuo, Kehai Chen, Yu Zhang, Zhengshan Xue, Min Zhang:
InImageTrans: Multimodal LLM-based Text Image Machine Translation. 20256-20277 - Xuemiao Zhang, Feiyu Duan, Liangyu Xu, Yongwei Zhou, Sirui Wang, Rongxiang Weng, Jingang Wang, Xunliang Cai:
FRAME: Boosting LLMs with A Four-Quadrant Multi-Stage Pretraining Strategy. 20278-20297 - Zhengdong Yang, Shuichiro Shimizu, Yahan Yu, Chenhui Chu:
When Large Language Models Meet Speech: A Survey on Integration Approaches. 20298-20315 - Arianna Graciotti, Leonardo Piano, Nicolas Lazzari, Enrico Daga, Rocco Tripodi, Valentina Presutti, Livio Pompianu:
KE-MHISTO: Towards a Multilingual Historical Knowledge Extraction Benchmark for Addressing the Long-Tail Problem. 20316-20339 - Dingyu Yao, Bowen Shen, Zheng Lin, Wei Liu, Jian Luan, Bin Wang, Weiping Wang:
TailorKV: A Hybrid Framework for Long-Context Inference via Tailored KV Cache Optimization. 20340-20359 - Xinwei Guo, Jiashi Gao, Junlei Zhou, Jiaxin Zhang, Guanhua Chen, Xiangyu Zhao, Quanying Liu, Haiyan Wu, Xin Yao, Xuetao Wei:
The Elephant in the Room: Exploring the Role of Neutral Words in Language Model Group-Agnostic Debiasing. 20360-20371 - Biao Fu, Minpeng Liao, Kai Fan, Chengxi Li, Liang Zhang, Yidong Chen, Xiaodong Shi:
LLMs Can Achieve High-quality Simultaneous Machine Translation as Efficiently as Offline. 20372-20395 - Yin Hua, Zhiqiang Liu, Mingyang Chen, Zheng Fang, Chi Man Wong, Lingxiao Li, Chi-Man Vong, Huajun Chen, Wen Zhang:
Beyond Completion: A Foundation Model for General Knowledge Graph Reasoning. 20396-20412 - Zhengdong Yang, Sheng Li, Chenhui Chu:
Generative Error Correction for Emotion-aware Speech-to-text Translation. 20413-20421 - Yuki Hou, Haruki Tamoto, Qinghua Zhao, Homei Miyashita:
SynapticRAG: Enhancing Temporal Memory Retrieval in Large Language Models through Synaptic Mechanisms. 20422-20436 - Rachneet Singh Sachdeva, Yixiao Song, Mohit Iyyer, Iryna Gurevych:
Localizing and Mitigating Errors in Long-form Question Answering. 20437-20469 - Zefei Long, Zhenbiao Cao, Wei Chen, Zhongyu Wei:
EMGLLM: Data-to-Text Alignment for Electromyogram Diagnosis Generation with Medical Numerical Data Encoding. 20470-20480 - Sambal Shikhar, Mohammed Irfan Kurpath, Sahal Shaji Mullappilly, Jean Lahoud, Fahad Shahbaz Khan, Rao Muhammad Anwer, Salman H. Khan, Hisham Cholakkal:
LLMVoX: Autoregressive Streaming Text-to-Speech Model for Any LLM. 20481-20493 - Zhangwenbo Zhangwenbo, Wang Yuhan:
Act2P: LLM-Driven Online Dialogue Act Classification for Power Analysis. 20494-20504 - Kurt Micallef, Claudia Borg:
MELABenchv1: Benchmarking Large Language Models against Smaller Fine-Tuned Models for Low-Resource Maltese NLP. 20505-20527 - Sohaila Eltanbouly, Salam Albatarni, Tamer Elsayed:
TRATES: Trait-Specific Rubric-Assisted Cross-Prompt Essay Scoring. 20528-20543 - Shaoshen Chen, Yangning Li, Zishan Xu, Yongqin Zeng, Shunlong Wu, Xinshuo Hu, Zifei Shan, Xin Su, Jiwei Tang, Yinghui Li, Hai-Tao Zheng:
DAST: Context-Aware Compression in LLMs via Dynamic Allocation of Soft Tokens. 20544-20552 - Yimin Deng, Yuxia Wu, Yejing Wang, Guoshuai Zhao, Li Zhu, Qidong Liu, Derong Xu, Zichuan Fu, Xian Wu, Yefeng Zheng, Xiangyu Zhao, Xueming Qian:
A Multi-Expert Structural-Semantic Hybrid Framework for Unveiling Historical Patterns in Temporal Knowledge Graphs. 20553-20565 - Shiyue Xu, Fu Zhang, Jingwei Cheng, Linfeng Zhou:
MWPO: Enhancing LLMs Performance through Multi-Weight Preference Strength and Length Optimization. 20566-20581 - Alexey Dontsov, Dmitrii Korzh, Alexey Zhavoronkin, Boris Mikheev, Denis Bobkov, Aibek Alanov, Oleg Rogov, Ivan V. Oseledets, Elena Tutubalina:
CLEAR: Character Unlearning in Textual and Visual Modalities. 20582-20603 - John Dougrez-Lewis, Mahmud Elahi Akhter, Federico Ruggeri, Sebastian Löbbers, Yulan He, Maria Liakata:
Assessing the Reasoning Capabilities of LLMs in the context of Evidence-based Claim Verification. 20604-20628 - Stella Verkijk, Piek Vossen, Pia Sommerauer:
Language Models Lack Temporal Generalization and Bigger is Not Better. 20629-20637 - Ying Zhou, Xinyao Wang, Yulei Niu, Yaojie Shen, Lexin Tang, Fan Chen, Ben He, Le Sun, Longyin Wen:
DiffLM: Controllable Synthetic Data Generation via Diffusion Language Models. 20638-20658 - Yifei Wang, Yu Sheng, Linjing Li, Daniel Dajun Zeng:
Uncertainty Unveiled: Can Exposure to More In-context Examples Mitigate Uncertainty for Large Language Models? 20659-20678 - Zihao Cheng, Hongru Wang, Zeming Liu, Yuhang Guo, Yuanfang Guo, Yunhong Wang, Haifeng Wang:
ToolSpectrum: Towards Personalized Tool Utilization for Large Language Models. 20679-20699 - Xiang Huang, Ting-En Lin, Feiteng Fang, Yuchuan Wu, Hangyu Li, Yuzhong Qu, Fei Huang, Yongbin Li:
Reverse Preference Optimization for Complex Instruction Following. 20700-20723 - Jeong Hun Yeo, Hyeongseop Rha, Se Jin Park, Yong Man Ro:
MMS-LLaMA: Efficient LLM-based Audio-Visual Speech Recognition with Minimal Multimodal Speech Tokens. 20724-20735 - Seungmin Lee, Yongsang Yoo, Minhwa Jung, Min Song:
Def-DTS: Deductive Reasoning for Open-domain Dialogue Topic Segmentation. 20736-20753 - Tiehan Cui, Yanxu Mao, Peipei Liu, Congying Liu, Datao You:
Exploring Jailbreak Attacks on LLMs through Intent Concealment and Diversion. 20754-20768 - Joonwon Jang, Jaehee Kim, Wonbin Kweon, Seonghyeon Lee, Hwanjo Yu:
Verbosity-Aware Rationale Reduction: Sentence-Level Rationale Reduction for Efficient and Effective Reasoning. 20769-20784 - Thushari Atapattu, Menasha Thilakaratne, Duc Nhan Do, Mahen Herath, Katrina E. Falkner:
Exploring the Role of Mental Health Conversational Agents in Training Medical Students and Professionals: A Systematic Literature Review. 20785-20798 - Rin Ashizawa, Yoichi Hirose, Nozomu Yoshinari, Kento Uchida, Shinichi Shirakawa:
Bandit-Based Prompt Design Strategy Selection Improves Prompt Optimizers. 20799-20817 - Jiaming Li, Yukun Chen, Ziqiang Liu, Minghuan Tan, Lei Zhang, Yunshui Li, Run Luo, Longze Chen, Jing Luo, Ahmadreza Argha, Hamid Alinejad-Rokny, Wei Zhou, Min Yang:
STORYTELLER: An Enhanced Plot-Planning Framework for Coherent and Cohesive Story Generation. 20818-20846 - Kaushal Kumar Maurya, KV Aditya Srivatsa, Ekaterina Kochmar:
SelectLLM: Query-Aware Efficient Selection Algorithm for Large Language Models. 20847-20863 - Heng Zhao, Yifei Zhu:
SkyLLM: Cross-LLM-APIs Federation for Cost-effective Query Processing. 20864-20873 - Sara Bourbour Hosseinbeigi, Mohammad Ali Seif Kashani, Javad Seraj, Fatemeh Taherinezhad, Ali Nafisi, Fatemeh Nadi, Iman Barati, Hosein Hasani, Mostafa Amiri, Mostafa Masoudi:
Matina: A Culturally-Aligned Persian Language Model Using Multiple LoRA Experts. 20874-20889 - Birgit Kirsch, Héctor Allende-Cid, Stefan Rüping:
PM3-KIE: A Probabilistic Multi-Task Meta-Model for Document Key Information Extraction. 20890-20912 - Ahmed Lekssays, Utsav Shukla, Husrev Taha Sencar, Md. Rizwan Parvez:
TechniqueRAG: Retrieval Augmented Generation for Adversarial Technique Annotation in Cyber Threat Intelligence Text. 20913-20926 - Long Bai, Zixuan Li, Xiaolong Jin, Jiafeng Guo, Xueqi Cheng, Tat-Seng Chua:
G2S: A General-to-Specific Learning Framework for Temporal Knowledge Graph Forecasting with Large Language Models. 20927-20938 - Ziang Ye, Zhenru Zhang, Yang Zhang, Jianxin Ma, Junyang Lin, Fuli Feng:
Disentangling Reasoning Tokens and Boilerplate Tokens For Language Model Fine-tuning. 20939-20957 - Jun Rao, Zepeng Lin, Xuebo Liu, Xiaopeng Ke, Lian Lian, Dong Jin, Shengjun Cheng, Jun Yu, Min Zhang:
APT: Improving Specialist LLM Performance with Weakness Case Acquisition and Iterative Preference Training. 20958-20980 - Jingwei Cheng, Chenglong Lu, Linyan Yang, Guoqing Chen, Fu Zhang:
EasyEA: Large Language Model is All You Need in Entity Alignment Between Knowledge Graphs. 20981-20995 - Huangming Xu, Fu Zhang, Jingwei Cheng:
An Adaptive Multi-Threshold Loss and a General Framework for Collaborating Losses in Document-Level Relation Extraction. 20996-21007 - Junru Lu, Jiazheng Li, Guodong Shen, Lin Gui, Siyu An, Yulan He, Di Yin, Xing Sun:
RoleMRC: A Fine-Grained Composite Benchmark for Role-Playing and Instruction-Following. 21008-21030 - Junru Wu, Tianhao Shen, Linxi Su, Deyi Xiong:
C²RBench: A Chinese Complex Reasoning Benchmark for Large Language Models. 21031-21050 - Ke Ji, Junying Chen, Anningzhe Gao, Wenya Xie, Xiang Wan, Benyou Wang:
Unlocking LLMs' Self-Improvement Capacity with Autonomous Learning for Domain Adaptation. 21051-21067 - John Hartley, Conor Brian Hamill, Dale Seddon, Devesh Batra, Ramin Okhrati, Raad Khraishi:
How Personality Traits Shape LLM Risk-Taking Behaviour. 21068-21092 - Karin Niederreiter, Dagmar Gromann:
Word-Level Detection of Code-Mixed Hate Speech with Multilingual Domain Transfer. 21093-21104 - Amin Abolghasemi, Leif Azzopardi, Seyyed Hadi Hashemi, Maarten de Rijke, Suzan Verberne:
Evaluation of Attribution Bias in Generator-Aware Retrieval-Augmented Large Language Models. 21105-21124 - Wen Yang, Junhong Wu, Chen Wang, Chengqing Zong, Jiajun Zhang:
Implicit Cross-Lingual Rewarding for Efficient Multilingual Preference Alignment. 21125-21147 - Zishan Xu, Shuyi Xie, Qingsong Lv, Shupei Xiao, Linlin Song, Sui Wenjuan, Fan Lin:
Diagnosing Failures in Large Language Models' Answers: Integrating Error Attribution into Evaluation Framework. 21148-21165 - Guangyue Peng, Wei Li, Wen Luo, Houfeng Wang:
Encode Errors: Representational Retrieval of In-Context Demonstrations for Multilingual Grammatical Error Correction. 21166-21180 - Xuemiao Zhang, Liangyu Xu, Feiyu Duan, Yongwei Zhou, Sirui Wang, Rongxiang Weng, Jingang Wang, Xunliang Cai:
Preference Curriculum: LLMs Should Always Be Pretrained on Their Preferred Data. 21181-21198 - Mengyu Ye, Tatsuki Kuribayashi, Goro Kobayashi, Jun Suzuki:
Can Input Attributions Explain Inductive Reasoning in In-Context Learning? 21199-21225 - Jayeol Chun, Nianwen Xue:
Modal Dependency Parsing via Biaffine Attention with Self-Loop. 21226-21238 - Zixiao Wang, Duzhen Zhang, Ishita Agrawal, Shen Gao, Le Song, Xiuying Chen:
Beyond Profile: From Surface-Level Facts to Deep Persona Simulation in LLMs. 21239-21257 - Yilun Qiu, Xiaoyan Zhao, Yang Zhang, Yimeng Bai, Wenjie Wang, Hong Cheng, Fuli Feng, Tat-Seng Chua:
Measuring What Makes You Unique: Difference-Aware User Modeling for Enhancing LLM Personalization. 21258-21277 - Soyeong Jeong, Kangsan Kim, Jinheon Baek, Sung Ju Hwang:
VideoRAG: Retrieval-Augmented Generation over Video Corpus. 21278-21298 - Weizhen Li, Junbao Huang, Peijie Huang, Yuhong Xu, Jiekun Fan:
Synergistic Augmentation: Enhancing Cross-Domain Zero-Shot Slot Filling with Small Model-Assisted Large Language Models. 21299-21312 - Iglika Nikolova-Stoupak, Maxime Amblard, Sophie Robert-Hayek, Frédérique Rey:
A Classifier of Word-Level Variants in Witnesses of Biblical Hebrew Manuscripts. 21313-21329 - Xiang Hu, Hongyu Fu, Jinge Wang, Yifeng Wang, Zhikun Li, Renjun Xu, Yu Lu, Yaochu Jin, Lili Pan, Zhenzhong Lan:
NOVA: An Iterative Planning Framework for Enhancing Scientific Innovation with Large Language Models. 21330-21359 - Chenyang Bu, Guojie Chang, Zihao Chen, CunYuan Dang, Zhize Wu, Yi He, Xindong Wu:
Query-Driven Multimodal GraphRAG: Dynamic Local Knowledge Graph Construction for Online Reasoning. 21360-21380 - Zhiqiu Xia, Jinxuan Xu, Yuqian Zhang, Hang Liu:
A Survey of Uncertainty Estimation Methods on Large Language Models. 21381-21396 - Yicheng Lang, Kehan Guo, Yue Huang, Yujun Zhou, Haomin Zhuang, Tianyu Yang, Yao Su, Xiangliang Zhang:
Beyond Single-Value Metrics: Evaluating and Enhancing LLM Unlearning with Cognitive Diagnosis. 21397-21420 - Zihan Xu, Haotian Ma, Yihao Ding, Gongbo Zhang, Chunhua Weng, Yifan Peng:
Natural Language Processing in Support of Evidence-based Medicine: A Scoping Review. 21421-21443 - Aishik Nagar, Ishaan Singh Rawal, Mansi Dhanania, Cheston Tan:
How do Transformer Embeddings Represent Compositions? A Functional Analysis. 21444-21461 - Yucheng Cai, Ke Li, Yi Huang, Junlan Feng, Zhijian Ou:
Entriever: Energy-based Retriever for Knowledge-Grounded Dialog Systems. 21462-21474 - Shanshan Liu, Menglong Lu, Zhen Huang, Zejiang He, Liu Liu, Zhigang Sun, Dongsheng Li:
MONTROSE: LLM-driven Monte Carlo Tree Search Self-Refinement for Cross-Domain Rumor Detection. 21475-21487 - Qiancheng Xu, Yongqi Li, Heming Xia, Fan Liu, Min Yang, Wenjie Li:
PEToolLLM: Towards Personalized Tool Learning in Large Language Models. 21488-21503 - Quanwei Tang, Sophia Yat Mei Lee, Junshuang Wu, Dong Zhang, Shoushan Li, Erik Cambria, Guodong Zhou:
A Comprehensive Graph Framework for Question Answering with Mode-Seeking Preference Alignment. 21504-21523 - Firoz Shaik, Mobashir Sadat, Nikita Gautam, Doina Caragea, Cornelia Caragea:
A MISMATCHED Benchmark for Scientific Natural Language Inference. 21524-21538 - Zhou Chen, Zhiqiang Wei, Yuqi Bai, Xue Xiong, Jianmin Wu:
TagRouter: Learning Route to LLMs through Tags for Open-Domain Text Generation Tasks. 21539-21564 - Yihuai Hong, Meng Cao, Dian Zhou, Lei Yu, Zhijing Jin:
The Reasoning-Memorization Interplay in Language Models Is Mediated by a Single Direction. 21565-21585 - Xu Zhao Pan, Pengfei Zhou, Jiaxin Ai, Wangbo Zhao, Kai Wang, Xiaojiang Peng, Wenqi Shao, Hongxun Yao, Kaipeng Zhang:
MPBench: A Comprehensive Multimodal Reasoning Benchmark for Process Errors Identification. 21586-21606 - Tianqi Xu, Linyao Chen, Dai-Jie Wu, Yanjun Chen, Zecheng Zhang, Xiang Yao, Zhiqiang Xie, Yongchao Chen, Shilong Liu, Bochen Qian, Anjie Yang, Zhaoxuan Jin, Jianbo Deng, Philip Torr, Bernard Ghanem, Guohao Li:
CRAB: Cross-environment Agent Benchmark for Multimodal Language Model Agents. 21607-21647 - Wenqing Wang, Mingqi Gao, Xinyu Hu, Xiaojun Wan:
Towards A "Novel" Benchmark: Evaluating Literary Fiction with Large Language Models. 21648-21673 - Binghui Li, Minghui Zou, Xiaowang Zhang, Shizhan Chen, Zhiyong Feng:
A Reinforcement Learning Framework for Cross-Lingual Stance Detection Using Chain-of-Thought Alignment. 21674-21688 - Zhiliang Li, Bo Tang, Yijun Niu, Beihong Jin, Qiwen Shi, Yuchen Feng, Zhiyu Li, Jie Hu, Mingchuan Yang, Feiyu Xiong:
CARE-STaR: Constraint-aware Self-taught Reasoner. 21689-21703 - William Berkeley Sheffield, Kanishka Misra, Valentina Pyatkin, Ashwini Deo, Kyle Mahowald, Junyi Jessy Li:
Is It JUST Semantics? A Case Study of Discourse Particle Understanding in LLMs. 21704-21715 - Yibin Chen, Jinyi Liu, Yan Zheng, Yifu Yuan, Jianye Hao:
War of Thoughts: Competition Stimulates Stronger Reasoning in Large Language Models. 21716-21737 - Hoyun Song, Huije Lee, Jisu Shin, Sukmin Cho, Changgeon Ko, Jong C. Park:
Does Rationale Quality Matter? Enhancing Mental Disorder Detection via Selective Reasoning Distillation. 21738-21756 - Naihao Deng, Rada Mihalcea:
Rethinking Table Instruction Tuning. 21757-21780 - Naihao Deng, Kapotaksha Das, Rada Mihalcea, Vitaliy Popov, Mohamed Abouelenien:
CliniDial: A Naturally Occurring Multimodal Dialogue Dataset for Team Reflection in Action During Clinical Operation. 21781-21798 - Ruiqi He, Yushu He, Longju Bai, Jiarui Liu, Zhenjie Sun, Zenghao Tang, He Wang, Hanchen Xia, Rada Mihalcea, Naihao Deng:
Chumor 2.0: Towards Better Benchmarking Chinese Humor Understanding from 弱智吧 (Ruo Zhi Ba). 21799-21818 - Raymond Li, Chuyuan Li, Gabriel Murray, Giuseppe Carenini:
Explicit Bayesian Inference to Uncover the Latent Themes of Large Language Models. 21819-21833 - Ann-Sophie Gnehm, Simon Clematide:
Improving Occupational ISCO Classification of Multilingual Swiss Job Postings with LLM-Refined Training Data. 21834-21847 - Soham Poddar, Paramita Koley, Janardan Misra, Niloy Ganguly, Saptarshi Ghosh:
Brevity is the soul of sustainability: Characterizing LLM response lengths. 21848-21864 - Yuanfu Wang, Pengyu Wang, Chenyang Xi, Bo Tang, Junyi Zhu, Wenqiang Wei, Chen Chen, Chao Yang, Jingfeng Zhang, Chaochao Lu, Yijun Niu, Keming Mao, Zhiyu Li, Feiyu Xiong, Jie Hu, Mingchuan Yang:
Adversarial Preference Learning for Robust LLM Alignment. 21865-21881 - Youjeong Noh, Joon-Young Paik, Jingun Kwon, Eun-Sun Cho:
gMBA: Expression Semantic Guided Mixed Boolean-Arithmetic Deobfuscation Using Transformer Architectures. 21882-21888 - Zichao Li, Aizier Abulaiti, Yaojie Lu, Xuanang Chen, Jia Zheng, Hongyu Lin, Xianpei Han, Shanshan Jiang, Bin Dong, Le Sun:
READoc: A Unified Benchmark for Realistic Document Structured Extraction. 21889-21905 - Han Ren, Minna Peng:
TicTac: Time-aware Supervised Fine-tuning for Automatic Text Dating. 21906-21918 - Hao Feng, Shu Wei, Xiang Fei, Wei Shi, Yingdong Han, Lei Liao, Jinghui Lu, Binghong Wu, Qi Liu, Chunhui Lin, Jingqun Tang, Hao Liu, Can Huang:
Dolphin: Document Image Parsing via Heterogeneous Anchor Prompting. 21919-21936 - Yilun Zheng, Sha Li, Fangkun Wu, Yang Ziyi, Lin Hongchao, Zhichao Hu, Cai Xinjun, Ziming Wang, Jinxuan Chen, Sitao Luan, Jiahao Xu, Lihui Chen:
FanChuan: A Multilingual and Graph-Structured Benchmark For Parody Detection and Analysis. 21937-21957 - Dongjun Jang, Youngchae Ahn, Hyopil Shin:
P-CoT: A Pedagogically-motivated Participatory Chain-of-Thought Prompting for Phonological Reasoning in LLMs. 21958-21979 - Wenhao Hu, Jinhao Duan, Chunchen Wei, Li Zhang, Yue Zhang, Kaidi Xu:
DynaCode: A Dynamic Complexity-Aware Code Benchmark for Evaluating Large Language Models in Code Generation. 21980-21997 - Istabrak Abbes, Gabriele Prato, Quentin Fournier, Fernando Rodriguez, Alaa Boukhary, Adam Elwood, Sarath Chandar:
Small Encoders Can Rival Large Decoders in Detecting Groundedness. 21998-22005 - Ahmed Heakl, Muhammad Abdullah Sohail, Mukul Ranjan, Rania Elbadry, Ghazi Shazan Ahmad, Mohamed El-Geish, Omar Maher, Zhiqiang Shen, Fahad Shahbaz Khan, Salman H. Khan:
KITAB-Bench: A Comprehensive Multi-Domain Benchmark for Arabic OCR and Document Understanding. 22006-22024 - Shayan Alipour, Indira Sen, Mattia Samory, Tanushree Mitra:
Robustness and Confounders in the Demographic Alignment of LLMs with Human Perceptions of Offensiveness. 22025-22047 - Nathaniel Romney Robinson, Shahd Abdelmoneim, Kelly Marchisio, Sebastian Ruder:
AL-QASIDA: Analyzing LLM Quality and Accuracy Systematically in Dialectal Arabic. 22048-22065 - Seok Hwan Song, Mohna Chakraborty, Qi Li, Wallapak Tavanapong:
Is Large Language Model Performance on Reasoning Tasks Impacted by Different Ways Questions Are Asked? 22066-22081 - Arijit Nag, Animesh Mukherjee, Niloy Ganguly, Soumen Chakrabarti:
MutantPrompt: Prompt Optimization via Mutation Under a Budget on Modest-sized LMs. 22082-22092 - Wendi Cui, Jiaxin Zhang, Zhuohang Li, Hao Sun, Damien Lopez, Kamalika Das, Bradley A. Malin, Kumar Sricharan:
Heuristic-based Search Algorithm in Automatic Instruction-focused Prompt Optimization: A Survey. 22093-22111 - Priya Pitre, Naren Ramakrishnan, Xuan Wang:
CONSENSAGENT: Towards Efficient and Effective Consensus in Multi-Agent LLM Interactions Through Sycophancy Mitigation. 22112-22133 - Julius Broomfield, Tom Gibbs, George Ingebretsen, Ethan Kosak-Hine, Tia Nasir, Jason Zhang, Reihaneh Iranmanesh, Sara Pieri, Reihaneh Rabbany, Kellin Pelrine:
The Structural Safety Generalization Problem. 22134-22173 - Amitava Das, Suranjana Trivedy, Danush Khanna, Yaswanth Narsupalli, Basab Ghosh, Rajarshi Roy, Gurpreet Singh, Vinija Jain, Vasu Sharma, Aishwarya Naresh Reganti, Aman Chadha:
DPO Kernels: A Semantically-Aware, Kernel-Enhanced, and Divergence-Rich Paradigm for Direct Preference Optimization. 22174-22270 - Neil Fasching, Yphtach Lelkes:
Model-Dependent Moderation: Inconsistencies in Hate Speech Detection Across LLM-based Systems. 22271-22285 - Subhendu Khatuya, Shashwat Naidu, Saptarshi Ghosh, Pawan Goyal, Niloy Ganguly:
Label-semantics Aware Generative Approach for Domain-Agnostic Multilabel Classification. 22286-22298 - Qingyang Zhu, Xiang Hu, Pengyu Ji, Wei Wu, Kewei Tu:
Unsupervised Morphological Tree Tokenizer. 22299-22312 - Jinyue Feng, Frank Rudzicz:
CausalLink: An Interactive Evaluation Framework for Causal Reasoning. 22313-22326 - Jiarui Liu, Iman Ouzzani, Wenkai Li, Lechen Zhang, Tianyue Ou, Houda Bouamor, Zhijing Jin, Mona T. Diab:
Toward Global AI Inclusivity: A Large-Scale Multilingual Terminology Dataset (GIST). 22327-22360 - Bin Wu, Edgar Meij, Emine Yilmaz:
A Joint Optimization Framework for Enhancing Efficiency of Tool Utilization in LLM Agents. 22361-22373 - Jabez Magomere, Emanuele La Malfa, Manuel Tonneau, Ashkan Kazemi, Scott A. Hale:
When Claims Evolve: Evaluating and Enhancing the Robustness of Embedding Models Against Misinformation Edits. 22374-22404 - Bar Gazit, Shaltiel Shmidman, Avi Shmidman, Yuval Pinter:
Splintering Nonconcatenative Languages for Better Tokenization. 22405-22417 - Yuhao Yang, Yue Wang, Dongxu Li, Ziyang Luo, Bei Chen, Chao Huang, Junnan Li:
Aria-UI: Visual Grounding for GUI Instructions. 22418-22433 - Neemesh Yadav, Jiarui Liu, Francesco Ortu, Roya Ensafi, Zhijing Jin, Rada Mihalcea:
Revealing Hidden Mechanisms of Cross-Country Content Moderation with Natural Language Processing. 22434-22452 - Stefan Vasilev, Christian Herold, Baohao Liao, Seyyed Hadi Hashemi, Shahram Khadivi, Christof Monz:
Unilogit: Robust Machine Unlearning for LLMs Using Uniform-Target Self-Distillation. 22453-22472 - Tuo Zhang, Tiantian Feng, Yibin Ni, Mengqin Cao, Ruying Liu, Kiana Avestimehr, Katharine Butler, Yanjun Weng, Mi Zhang, Shrikanth Narayanan, Salman Avestimehr:
Creating a Lens of Chinese Culture: A Multimodal Dataset for Chinese Pun Rebus Art Understanding. 22473-22487 - Ofir Zafrir, Igor Margulis, Dorin Shteyman, Shira Guskin, Guy Boudoukh:
FastDraft: How to Train Your Draft. 22488-22505 - Shester Gueuwou, Xiaodan Du, Greg Shakhnarovich, Karen Livescu:
SignMusketeers: An Efficient Multi-Stream Approach for Sign Language Translation at Scale. 22506-22521 - Dang Nguyen, Jian Chen, Yu Wang, Gang Wu, Namyong Park, Zhengmian Hu, Hanjia Lyu, Junda Wu, Ryan Aponte, Yu Xia, Xintong Li, Jing Shi, Hongjie Chen, Viet Dac Lai, Zhouhang Xie, Sungchul Kim, Ruiyi Zhang, Tong Yu, Md. Mehrab Tanjim, Nesreen K. Ahmed, Puneet Mathur, Seunghyun Yoon, Lina Yao, Branislav Kveton, Jihyung Kil, Thien Huu Nguyen, Trung Bui, Tianyi Zhou, Ryan A. Rossi, Franck Dernoncourt:
GUI Agents: A Survey. 22522-22538 - Asma Ben Abacha, Wen-wai Yim, Yujuan Fu, Zhaoyi Sun, Meliha Yetisgen, Fei Xia, Thomas Lin:
MEDEC: A Benchmark for Medical Error Detection and Correction in Clinical Notes. 22539-22550 - Jacob Mitchell Springer, Vaibhav Adlakha, Siva Reddy, Aditi Raghunathan, Marius Mosbach:
Understanding the Influence of Synthetic Data for Text Embedders. 22551-22567 - Anar Yeginbergen, Maite Oronoz, Rodrigo Agerri:
Dynamic Knowledge Integration for Evidence-Driven Counter-Argument Generation with Large Language Models. 22568-22584 - Li Lucy, Camilla Griffiths, Sarah Levine, Jennifer L. Eberhardt, Dorottya Demszky, David Bamman:
Tell, Don't Show: Leveraging Language Models' Abstractive Retellings to Model Literary Themes. 22585-22610 - Eunjeong Hwang, Peter West, Vered Shwartz:
BottleHumor: Self-Informed Humor Explanation using the Information Bottleneck Principle. 22611-22632 - Glenn Matlin, Mika Okamoto, Huzaifa Pardawala, Yang Yang, Sudheer Chava:
Financial Language Model Evaluation (FLaME). 22633-22679 - Nengbo Wang, Xiaotian Han, Jagdip Singh, Jing Ma, Vipin Chaudhary:
CausalRAG: Integrating Causal Graphs into Retrieval-Augmented Generation. 22680-22693 - Tharindu Kumarage, Ninareh Mehrabi, Anil Ramakrishna, Xinyan Zhao, Richard S. Zemel, Kai-Wei Chang, Aram Galstyan, Rahul Gupta, Charith Peris:
Towards Safety Reasoning in LLMs: AI-agentic Deliberation for Policy-embedded CoT Data Creation. 22694-22715 - Puxuan Yu, Daniel Cohen, Hemank Lamba, Joel R. Tetreault, Alejandro Jaimes:
Explain then Rank: Scale Calibration of Neural Rankers Using Natural Language Explanations from LLMs. 22716-22730 - Miguel Romero Calvo, Shuoyang Ding, Corey D. Barrett, Georgiana Dinu, George Karypis:
Beyond instruction-conditioning, MoTE: Mixture of Task Experts for Multi-task Embedding Models. 22731-22746 - Yanfang Zhou, Yuntao Liu, Xiaodong Li, Yongqiang Zhao, Xintong Wang, Jinlong Tian, Zhenyu Li, Xinhai Xu:
Metagent-P: A Neuro-Symbolic Planning Agent with Metacognition for Open Worlds. 22747-22764 - George-Kirollos Saad, Scott Sanner:
Q-STRUM Debate: Query-Driven Contrastive Summarization for Recommendation Comparison. 22765-22782 - Raghav Ramji, Keshav Ramji:
Inductive Linguistic Reasoning with Large Language Models. 22783-22810 - Pengfei Hong, Navonil Majumder, Deepanway Ghosal, Somak Aditya, Rada Mihalcea, Soujanya Poria:
Evaluating LLMs' Mathematical and Coding Competency through Ontology-guided Interventions. 22811-22849 - Junyi Xiang, Maofu Liu:
Exploiting Phonetics and Glyph Representation at Radical-level for Classical Chinese Understanding. 22850-22871 - Toan Tran, Ruixuan Liu, Li Xiong:
Tokens for Learning, Tokens for Unlearning: Mitigating Membership Inference Attacks in Large Language Models via Dual-Purpose Training. 22872-22888 - Ameya Godbole, Robin Jia:
Verify with Caution: The Pitfalls of Relying on Imperfect Factuality Metrics. 22889-22912 - Vihang Pancholi, Jainit Sushil Bafna, Tejas Anvekar, Manish Shrivastava, Vivek Gupta:
TabXEval: Why this is a Bad Table? An eXhaustive Rubric for Table Evaluation. 22913-22934 - Shantanu Ghosh, Rayan Syed, Chenyu Wang, Vaibhav Choudhary, Binxu Li, Clare B. Poynton, Shyam Visweswaran, Kayhan Batmanghelich:
LADDER: Language-Driven Slice Discovery and Error Rectification in Vision Classifiers. 22935-22970 - Sifan Zhou, Shuo Wang, Zhihang Yuan, Mingjia Shi, Yuzhang Shang, Dawei Yang:
GSQ-Tuning: Group-Shared Exponents Integer in Fully Quantized Training for LLMs On-Device Fine-tuning. 22971-22988 - Gunjan Balde, Soumyadeep Roy, Mainack Mondal, Niloy Ganguly:
Evaluation of LLMs in Medical Text Summarization: The Role of Vocabulary Adaptation in High OOV Settings. 22989-23004 - Fangping Lan, Abdullah Aljebreen, Eduard C. Dragut:
UniT: One Document, Many Revisions, Too Many Edit Intention Taxonomies. 23005-23024 - Xianbing Zhao, Yiqing Lyu, Di Wang, Buzhou Tang:
Predicting Depression in Screening Interviews from Interactive Multi-Theme Collaboration. 23025-23035 - Yihang Yao, Zhepeng Cen, Miao Li, William Han, Yuyou Zhang, Emerson Liu, Zuxin Liu, Chuang Gan, Ding Zhao:
Your Language Model May Think Too Rigidly: Achieving Reasoning Consistency with Symmetry-Enhanced Training. 23036-23052 - Jianling Li, Shangzhan Li, Zhenye Gao, Qi Shi, Yuxuan Li, Zefan Wang, Jiacheng Huang, WangHaojie WangHaojie, Jianrong Wang, Xu Han, Zhiyuan Liu, Maosong Sun:
TritonBench: Benchmarking Large Language Model Capabilities for Generating Triton Operators. 23053-23066 - Rahul Garg, Trilok Padhi, Hemang Jain, Ugur Kursuncu, Ponnurangam Kumaraguru:
Just KIDDIN': Knowledge Infusion and Distillation for Detection of INdecent Memes. 23067-23086 - Weiqi Zeng, Bo Wang, Dongming Zhao, Zongfeng Qu, Ruifang He, Yuexian Hou, Qinghua Hu:
Dynamic Personality in LLM Agents: A Framework for Evolutionary Modeling and Behavioral Analysis in the Prisoner's Dilemma. 23087-23100 - Dylan Zhang, Justin Wang, Tianran Sun:
Building A Proof-Oriented Programmer That Is 64% Better Than GPT-4o Under Data Scarcity. 23101-23118 - Abdul Waheed, Hanin Atwany, Rita Singh, Bhiksha Raj:
On the Robust Approximation of ASR Metrics. 23119-23146 - Yipeng Kang, Junqi Wang, Yexin Li, Mengmeng Wang, Wenming Tu, Quansen Wang, Hengli Li, Tingjun Wu, Xue Feng, Fangwei Zhong, Zilong Zheng:
Are the Values of LLMs Structurally Aligned with Humans? A Causal Perspective. 23147-23161 - Xinxin Li, Huiyao Chen, Chengjun Liu, Jing Li, Meishan Zhang, Jun Yu, Min Zhang:
LLMs Can Also Do Well! Breaking Barriers in Semantic Role Labeling via Large Language Models. 23162-23180 - Hanin Atwany, Abdul Waheed, Rita Singh, Monojit Choudhury, Bhiksha Raj:
Lost in Transcription, Found in Distribution Shift: Demystifying Hallucination in Speech Foundation Models. 23181-23203 - Yanfang Zhou, Xiaodong Li, Yuntao Liu, Yongqiang Zhao, Xintong Wang, Zhenyu Li, Jinlong Tian, Xinhai Xu:
M2PA: A Multi-Memory Planning Agent for Open Worlds Inspired by Cognitive Theory. 23204-23220 - Ming Wang, Peidong Wang, Lin Wu, Xiaocui Yang, Daling Wang, Shi Feng, Yuxin Chen, Bixuan Wang, Yifei Zhang:
AnnaAgent: Dynamic Evolution Agent System with Multi-Session Memory for Realistic Seeker Simulation. 23221-23235 - Dylan Zhang, Justin Wang, François Charton:
Diversification Catalyzes Language Models' Instruction Generalization To Unseen Semantics. 23236-23249 - Zeyu Gao, Yuxin Cui, Hao Wang, Siliang Qin, Yuanda Wang, Bolun Zhang, Chao Zhang:
DecompileBench: A Comprehensive Benchmark for Evaluating Decompilers in Real-World Scenarios. 23250-23267 - Xiaoqing Zhang, Yuhan Liu, Flood Sung, Xiuying Chen, Shuo Shang, Rui Yan:
Thinking Before Running! Efficient Code Generation with Thorough Exploration and Optimal Refinement. 23268-23281 - Yuchen Wu, Liang Ding, Li Shen, Dacheng Tao:
Edit Once, Update Everywhere: A Simple Framework for Cross-Lingual Knowledge Synchronization in LLMs. 23282-23302 - Fengqing Jiang, Zhangchen Xu, Yuetai Li, Luyao Niu, Zhen Xiang, Bo Li, Bill Yuchen Lin, Radha Poovendran:
SafeChain: Safety of Language Models with Long Chain-of-Thought Reasoning Capabilities. 23303-23320 - Sigang Luo, Yinan Liu, Dongying Lin, Yingying Zhai, Bin Wang, Xiaochun Yang, Junpeng Liu:
ETRQA: A Comprehensive Benchmark for Evaluating Event Temporal Reasoning Abilities of Large Language Models. 23321-23339 - Yuji Zhang, Sha Li, Cheng Qian, Jiateng Liu, Pengfei Yu, Chi Han, Yi R. Fung, Kathleen McKeown, ChengXiang Zhai, Manling Li, Heng Ji:
The Law of Knowledge Overshadowing: Towards Understanding, Predicting and Preventing LLM Hallucination. 23340-23358 - Fei Yuan, Yinquan Lu, Lei Li, Jingjing Xu:
LegoMT2: Selective Asynchronous Sharded Data Parallel Training for Massive Neural Machine Translation. 23359-23376 - Yiran Zhao, Guizhen Chen, Kenji Kawaguchi, Lidong Bing, Wenxuan Zhang:
Pruning General Large Language Models into Customized Expert Models. 23377-23391 - Xiaoxin Lu, Ranran Haoran Zhang, Yusen Zhang, Rui Zhang:
Enhance Multimodal Consistency and Coherence for Text-Image Plan Generation. 23392-23409 - Metehan Oguz, Yavuz Faruk Bakman, Duygu Nur Yaldiz:
Un-considering Contextual Information: Assessing LLMs' Understanding of Indexical Elements. 23410-23427 - Jan Trienes, Jörg Schlötterer, Junyi Jessy Li, Christin Seifert:
Behavioral Analysis of Information Salience in Large Language Models. 23428-23454 - Avinash Baidya, Kamalika Das, Xiang Gao:
The Behavior Gap: Evaluating Zero-shot LLM Agents in Complex Task-Oriented Dialogs. 23455-23472 - Gurusha Juneja, Gautam Jajoo, Hua Li, Jian Jiao, Nagarajan Natarajan, Amit Sharma:
Task Facet Learning: A Structured Approach To Prompt Optimization. 23473-23496 - Junlong Tong, Jinlan Fu, Zixuan Lin, Yingqi Fan, Anhao Zhao, Hui Su, Xiaoyu Shen:
LLM as Effective Streaming Processor: Bridging Streaming-Batch Mismatches with Group Position Encoding. 23497-23517 - Amitava Das, Yaswanth Narsupalli, Gurpreet Singh, Vinija Jain, Vasu Sharma, Suranjana Trivedy, Aman Chadha, Amit P. Sheth:
YinYang-Align: A new Benchmark for Competing Objectives and Introducing Multi-Objective Preference based Text-to-Image Alignment. 23518-23598 - Divya Jyoti Bajpai, Manjesh Kumar Hanawal:
FREE: Fast and Robust Vision Language Models with Early Exits. 23599-23615 - Chuxuan Hu, Liyun Zhang, Yeji Lim, Aum Wadhwani, Austin Peters, Daniel Kang:
REPRO-Bench: Can Agentic AI Systems Assess the Reproducibility of Social Science Research? 23616-23626 - Sara Ghaboura, Ketan Pravin More, Ritesh Thawkar, Wafa Al Ghallabi, Omkar Thawakar, Fahad Shahbaz Khan, Hisham Cholakkal, Salman H. Khan, Rao Muhammad Anwer:
Time Travel: A Comprehensive Benchmark to Evaluate LMMs on Historical and Cultural Artifacts. 23627-23641 - Huashan Sun, Yizhe Yang, Yinghao Li, Jiawei Li, Yang Gao:
Unveiling and Addressing Pseudo Forgetting in Large Language Models. 23642-23658 - Yupu Liang, Yaping Zhang, Zhiyang Zhang, Zhiyuan Chen, Yang Zhao, Lu Xiang, Chengqing Zong, Yu Zhou:
Improving MLLM's Document Image Machine Translation via Synchronously Self-reviewing Its OCR Proficiency. 23659-23678 - Supriya Bajpai, Athira Gopal, Chandrakant Harjpal, Niraj Kumar:
HG-InsightLog: Context Prioritization and Reduction for Question Answering with Non-Natural Language Construct Log Data. 23679-23695 - Antonios Dimakis, John Pavlopoulos, Antonios Anastasopoulos:
Dialect Normalization using Large Language Models and Morphological Rules. 23696-23714 - Mounika Marreddy, Subba Reddy Oota, Venkata Charan Chinni, Manish Gupta, Lucie Flek:
USDC: A Dataset of User Stance and Dogmatism in Long Conversations. 23715-23759 - Eunki Kim, Sangryul Kim, James Thorne:
Learning to Insert [PAUSE] Tokens for Better Reasoning. 23760-23777 - Settaluri Lakshmi Sravanthi, Kishan Maharaj, Sravani Gunnu, Abhijit Mishra, Pushpak Bhattacharyya:
Understand the Implication: Learning to Think for Pragmatic Understanding. 23778-23790 - Xinyang Lu, Jingtan Wang, Zitong Zhao, Zhongxiang Dai, Chuan-Sheng Foo, See-Kiong Ng, Bryan Kian Hsiang Low:
WASA: WAtermark-based Source Attribution for Large Language Model-Generated Data. 23791-23824 - Prayas Agrawal, Nandeesh Kumar, Muthusamy Chelliah, Surender Kumar, Soumen Chakrabarti:
Dense Retrieval with Quantity Comparison Intent. 23825-23839 - Yigeng Zhou, Wu Li, Yifan Lu, Jing Li, Fangming Liu, Meishan Zhang, Yequan Wang, Daojing He, Honghai Liu, Min Zhang:
Reflection on Knowledge Graph for Large Language Models Reasoning. 23840-23857 - Jiahe Jin, Yanheng He, Mingyan Yang:
Revisiting 3D LLM Benchmarks: Are We Really Testing 3D Capabilities? 23858-23869 - Ben Ganon, Alon Zolfi, Omer Hofman, Inderjeet Singh, Hisashi Kojima, Yuval Elovici, Asaf Shabtai:
DIESEL: A Lightweight Inference-Time Safety Enhancement for Language Models. 23870-23890 - Jiawei Gu, Ziting Xian, Yuanzhen Xie, Ye Liu, Enjie Liu, Ruichao Zhong, Mochi Gao, Yunzhi Tan, Bo Hu, Zang Li:
Toward Structured Knowledge Reasoning: Contrastive Retrieval-Augmented Generation on Experience. 23891-23910 - Hieu Trung Nguyen, Bao Nguyen, Viet Anh Nguyen:
Structured Pruning for Diverse Best-of-N Reasoning Optimization. 23911-23922 - Yujia Xiao, Lei He, Haohan Guo, Fenglong Xie, Tan Lee:
PodAgent: A Comprehensive Framework for Podcast Generation. 23923-23937 - Wenhao Liu, Zhenyi Lu, Xinyu Hu, Jerry Zhang, Dailin Li, Jiacheng Cen, Huilin Cao, Haiteng Wang, Yuhan Li, Kun Xie, Dandan Li, Pei Zhang, Chengbo Zhang, Yuxiang Ren, Xiaohong Huang, Yan Ma:
STORM-BORN: A Challenging Mathematical Derivations Dataset Curated via a Human-in-the-Loop Multi-Agent Framework. 23938-23958 - Jiaze Li, Yaya Shi, Zongyang Ma, Haoran Xu, Yandong Bai, Huihui Xiao, Ruiwen Kang, Fan Yang, Tingting Gao, Di Zhang:
iMOVE: Instance-Motion-Aware Video Understanding. 23959-23975 - Simeon Junker, Sina Zarrieß:
SceneGram: Conceptualizing and Describing Tangrams in Scene Context. 23976-23992 - Chengwei Qin, Wenhan Xia, Tan Wang, Fangkai Jiao, Yuchen Hu, Bosheng Ding, Ruirui Chen, Shafiq Joty:
Relevant or Random: Can LLMs Truly Perform Analogical Reasoning? 23993-24010 - Shu Zhou, Yunyang Xuan, Yuxuan Ao, Xin Wang, Tao Fan, Hao Wang:
MERIT: Multi-Agent Collaboration for Unsupervised Time Series Representation Learning. 24011-24028 - Chang Gao, Wenxuan Zhang, Guizhen Chen, Wai Lam:
JsonTuning: Towards Generalizable, Robust, and Controllable Instruction Tuning. 24029-24055 - Hongliang Li, Jiaxin Zhang, Wenhui Liao, Dezhi Peng, Kai Ding, Lianwen Jin:
RedundancyLens: Revealing and Exploiting Visual Token Processing Redundancy for Efficient Decoder-Only MLLMs. 24056-24067 - Mufan Xu, Gewen Liang, Kehai Chen, Wei Wang, Xun Zhou, Muyun Yang, Tiejun Zhao, Min Zhang:
Memory-augmented Query Reconstruction for LLM-based Knowledge Graph Reasoning. 24068-24084 - Qihuang Zhong, Liang Ding, Xiantao Cai, Juhua Liu, Bo Du, Dacheng Tao:
KaFT: Knowledge-aware Fine-tuning for Boosting LLMs' Domain-specific Question-Answering Performance. 24085-24100 - Simeon Junker, Manar Ali, Larissa Koch, Sina Zarrieß, Hendrik Buschmeier:
Are Multimodal Large Language Models Pragmatically Competent Listeners in Simple Reference Resolution Tasks? 24101-24109 - Chaojie Wang, Haonan Shi, Long Tian, Bo An, Shuicheng Yan:
Removing Prompt-template Bias in Reinforcement Learning from Human Feedback. 24110-24122 - Jingwang Huang, Jiang Zhong, Qin Lei, Gaojinpeng Gaojinpeng, Ymyang Ymyang, Sirui Wang, PeiguangLi PeiguangLi, Kaiwen Wei:
Latent Distribution Decouple for Uncertain-Aware Multimodal Multi-label Emotion Recognition. 24123-24138 - Yuhang Zhou, Yuchen Ni, Zhiheng Xi, Zhangyue Yin, Yu He, Gan Yunhui, Xiang Liu, Zhang Jian, Sen Liu, Xipeng Qiu, Yixin Cao, Guangnan Ye, Hongfeng Chai:
Are LLMs Rational Investors? A Study on the Financial Bias in LLMs. 24139-24173 - Dan Oneata, Desmond Elliott, Stella Frank:
Seeing What Tastes Good: Revisiting Multimodal Distributional Semantics in the Billion Parameter Era. 24174-24191 - Sajjad Ghiasvand, Yifan Yang, Zhiyu Xue, Mahnoosh Alizadeh, Zheng Zhang, Ramtin Pedarsani:
Communication-Efficient and Tensorized Federated Fine-Tuning of Large Language Models. 24192-24207 - Zak Hussain, Rui Mata, Dirk U. Wulff:
A rebuttal of two common deflationary stances against LLM cognition. 24208-24213 - Giovanni Sullutrone, Riccardo Amerigo Vigliermo, Sonia Bergamaschi, Luca Sala:
COVER: Context-Driven Over-Refusal Verification in LLMs. 24214-24229 - Matthieu Dubois, François Yvon, Pablo Piantanida:
MOSAIC: Multiple Observers Spotting AI Content. 24230-24247 - Neil De La Fuente, Oscar Sainz, Iker García-Ferrero, Eneko Agirre:
GUIDEX: Guided Synthetic Data Generation for Zero-Shot Information Extraction. 24248-24262 - Indira Sen, Marlene Lutz, Elisa Rogers, David García, Markus Strohmaier:
Missing the Margins: A Systematic Literature Review on the Demographic Representativeness of LLMs. 24263-24289 - Omkar Thawakar, Dinura Dissanayake, Ketan Pravin More, Ritesh Thawkar, Ahmed Heakl, Noor Ahsan, Yuhao Li, Mohammed Zumri, Jean Lahoud, Rao Muhammad Anwer, Hisham Cholakkal, Ivan Laptev, Mubarak Shah, Fahad Shahbaz Khan, Salman H. Khan:
LlamaV-o1: Rethinking Step-by-step Visual Reasoning in LLMs. 24290-24315 - Yingjin Song, Yupei Du, Denis Paperno, Albert Gatt:
Burn After Reading: Do Multimodal Large Language Models Truly Capture Order of Events in Image Sequences? 24316-24342 - Huimin Xu, Xin Mao, Feng-Lin Li, Xiaobao Wu, Wang Chen, Wei Zhang, Anh Tuan Luu:
Full-Step-DPO: Self-Supervised Preference Optimization with Step-wise Rewards for Mathematical Reasoning. 24343-24356 - Yanran Chen, Steffen Eger:
Do Emotions Really Affect Argument Convincingness? A Dynamic Approach with LLM-based Manipulation Checks. 24357-24381 - Huimin Xu, Xin Mao, Feng-Lin Li, Xiaobao Wu, Wang Chen, Wei Zhang, Anh Tuan Luu:
SCOPE: Compress Mathematical Reasoning Steps for Efficient Automated Process Annotation. 24382-24394 - Wenxi Li, Xihao Wang, Weiwei Sun:
Compositional Syntactico-SemBanking for English as a Second or Foreign Language. 24395-24406 - Minal Nitin Dani, Aishwarya Maheswaran, Maunendra Sankar Desarkar:
Semantics-aware prompting for translating NOtices To AirMen. 24407-24417 - Anjali Kantharuban, Jeremiah Milbauer, Maarten Sap, Emma Strubell, Graham Neubig:
Stereotype or Personalization? User Identity Biases Chatbot Recommendations. 24418-24436 - Ankita Gupta, Marisa Hudspeth, Polly Stokes, Jacquie Kurland, Brendan T. O'Connor:
Automated main concept generation for narrative discourse assessment in aphasia. 24437-24451 - Mong Yuan Sim, Wei Emma Zhang, Xiang Dai, Biaoyan Fang:
Can VLMs Actually See and Read? A Survey on Modality Collapse in Vision-Language Models. 24452-24470 - Narjis Asad, Nihar Ranjan Sahoo, Rudra Murthy, Swaprava Nath, Pushpak Bhattacharyya:
"You are Beautiful, Body Image Stereotypes are Ugly!" BIStereo: A Benchmark to Measure Body Image Stereotypes in Language Models. 24471-24496 - Zhengliang Shi, Yuhan Wang, Lingyong Yan, Pengjie Ren, Shuaiqiang Wang, Dawei Yin, Zhaochun Ren:
Retrieval Models Aren't Tool-Savvy: Benchmarking Tool Retrieval for Large Language Models. 24497-24524 - Lasse M. Jantsch, Dong-Jae Koh, Seonghwan Yoon, Jisu Lee, Anne Lauscher, Young-Kyoon Suh:
FineCite: A Novel Approach For Fine-Grained Citation Context Analysis. 24525-24542 - Changyue Wang, Weihang Su, Qingyao Ai, Yujia Zhou, Yiqun Liu:
Decoupling Reasoning and Knowledge Injection for In-Context Knowledge Editing. 24543-24562 - Tianqiang Yan, Ziqiao Lin, Lin Zhang, Zhenglong Sun, Yuan Gao:
Entrospect: Information-Theoretic Self-Reflection Elicits Better Response Refinement of Small Language Models. 24563-24577 - Riya Sawhney, Samrat Yadav, Indrajit Bhattacharya, Mausam:
Iterative Repair with Weak Verifiers for Few-shot Transfer in KBQA with Unanswerability. 24578-24596 - San Kim, Jonghwi Kim, Yejin Jeon, Gary Geunbae Lee:
Safeguarding RAG Pipelines with GMTP: A Gradient-based Masked Token Probability Method for Poisoned Document Detection. 24597-24614 - Heejae Suh, Yejin Jeon, Deokhyung Kang, Taehee Park, Yejin Min, Gary Geunbae Lee:
EnSToM: Enhancing Dialogue Systems with Entropy-Scaled Steering Vectors for Topic Maintenance. 24615-24631 - Zhiqian Qin, Yuanfeng Song, Jinwei Lu, Yuanwei Song, Shuaimin Li, Chen Jason Zhang:
MultiTEND: A Multilingual Benchmark for Natural Language to NoSQL Query Translation. 24632-24657 - Xiaobo Liang, Wenjin Xie, Juntao Li, Wanfu Wang, Yibin Chen, Kehai Chen, Min Zhang:
Tool learning via Inference-time Scaling and Cycle Verifier. 24658-24671 - Jane Pan, Ryan Shar, Jacob Pfau, Ameet Talwalkar, He He, Valerie Chen:
When Benchmarks Talk: Re-Evaluating Code LLMs with Interactive Feedback. 24672-24700 - Narutatsu Ri, Nicholas Deas, Kathleen McKeown:
Reranking-based Generation for Unbiased Perspective Summarization. 24701-24723 - Siyuan Fang, Kaijing Ma, Tianyu Zheng, Xeron Du, Ningxuan Lu, Ge Zhang, Qingkun Tang:
KARPA: A Training-free Method of Adapting Knowledge Graph as References for Large Language Model's Reasoning Path Aggregation. 24724-24746 - Yibo Zhao, Jiapeng Zhu, Can Xu, Yao Liu, Xiang Li:
Enhancing LLM-based Hatred and Toxicity Detection with Meta-Toxic Knowledge Graph. 24747-24760 - Ngoc Bui, Hieu Trung Nguyen, Shantanu Kumar, Julian Theodore, Weikang Qiu, Viet Anh Nguyen, Rex Ying:
Mixture-of-Personas Language Models for Population Simulation. 24761-24778 - Baohao Liao, Christian Herold, Seyyed Hadi Hashemi, Stefan Vasilev, Shahram Khadivi, Christof Monz:
ClusComp: A Simple Paradigm for Model Compression and Efficient Finetuning. 24779-24804 - Miao Li, Jey Han Lau, Eduard H. Hovy, Mirella Lapata:
Decomposed Opinion Summarization with Verified Aspect-Aware Modules. 24805-24841 - Tingxu Han, Zhenting Wang, Chunrong Fang, Shiyu Zhao, Shiqing Ma, Zhenyu Chen:
Token-Budget-Aware LLM Reasoning. 24842-24855 - Ping Gong, Jiawei Yi, Shengnan Wang, Juncheng Zhang, Zewen Jin, Ouxiang Zhou, Ruibo Liu, Guanbin Xu, Youhui Bai, Bowen Ye, Kun Yuan, Tong Yang, Gong Zhang, Renhai Chen, Feng Wu, Cheng Li:
HATA: Trainable and Hardware-Efficient Hash-Aware Top-k Attention for Scalable Large Model Inference. 24856-24871 - Shota Takashiro, Takeshi Kojima, Andrew Gambardella, Qi Cao, Yusuke Iwasawa, Yutaka Matsuo:
Answer When Needed, Forget When Not: Language Models Pretend to Forget via In-Context Knowledge Unlearning. 24872-24885 - Kaiyuan Guan, Ruoxin Li, Xudong Guo, Zhenning Huang, Xudong Weng, Hehuan Liu, Zheng Wei, Zang Li:
LIST: Linearly Incremental SQL Translator for Single-Hop Reasoning, Generation and Verification. 24886-24897 - Guanqun Bi, Zhuang Chen, Zhoufu Liu, Hongkai Wang, Xiyao Xiao, Yuqiang Xie, Wen Zhang, Yongkang Huang, Yuxuan Chen, Libiao Peng, Minlie Huang:
MAGI: Multi-Agent Guided Interview for Psychiatric Assessment. 24898-24921 - Shahriar Kabir Nahin, Rabindra Nath Nandi, Sagor Sarker, Quazi Sarwar Muhtaseem, Md. Kowsher, Apu Chandraw Shill, Md Ibrahim, Mehadi Hasan Menon, Tareq Al Muntasir, Firoj Alam:
TituLLMs: A Family of Bangla LLMs with Comprehensive Benchmarking. 24922-24940 - Negar Foroutan, Angelika Romanou, Matin Ansaripour, Julian Martin Eisenschlos, Karl Aberer, Rémi Lebret:
WikiMixQA: A Multimodal Benchmark for Question Answering over Tables and Charts. 24941-24958 - Chan-Jan Hsu, Yi-Chang Chen, Feng-Ting Liao, Pei-Chen Ho, Yu-Hsiang Wang, Po-Chun Hsu, Da-shan Shiu:
Let's Fuse Step by Step: A Generative Fusion Decoding Algorithm with LLMs for Robust and Instruction-Aware ASR and OCR. 24959-24973 - Bosi Wen, Pei Ke, Yufei Sun, Cunxiang Wang, Xiaotao Gu, Jinfeng Zhou, Jie Tang, Hongning Wang, Minlie Huang:
HPSS: Heuristic Prompting Strategy Search for LLM Evaluators. 24974-25007 - Zafarullah Mahmood, Soliman Ali, Jiading Zhu, Mohamed Abdelwahab, Michelle Yu Collins, Sihan Chen, Yi Cheng Zhao, Jodi Wolff, Osnat C. Melamed, Nadia Minian, Marta Maslej, Carolynne Cooper, Matt Ratto, Peter Selby, Jonathan Rose:
A Fully Generative Motivational Interviewing Counsellor Chatbot for Moving Smokers Towards the Decision to Quit. 25008-25043 - Kangda Wei, Xi Shi, Jonathan Tong, Sai Ramana Reddy, Anandhavelu Natarajan, Rajiv Jain, Aparna Garimella, Ruihong Huang:
LegalCore: A Dataset for Event Coreference Resolution in Legal Documents. 25044-25059 - Ayana Niwa, Masahiro Kaneko, Kentaro Inui:
Rectifying Belief Space via Unlearning to Harness LLMs' Reasoning. 25060-25075 - Gitanjali Kumari, Jitendra Solanki, Asif Ekbal:
MemeDetoxNet: Balancing Toxicity Reduction and Context Preservation. 25076-25098 - Wichayaporn Wongkamjan, Yanze Wang, Feng Gu, Denis Peskoff, Jonathan K. Kummerfeld, Jonathan May, Jordan Lee Boyd-Graber:
Should I Trust You? Detecting Deception in Negotiations using Counterfactual RL. 25099-25113 - Jingcheng Hu, Houyi Li, Yinmin Zhang, Zili Wang, Shuigeng Zhou, Xiangyu Zhang, Heung-Yeung Shum:
Multi-matrix Factorization Attention. 25114-25126 - Tergel Munkhbat, Namgyu Ho, Seo Hyun Kim, Yongjin Yang, Yujin Kim, Se-Young Yun:
Self-Training Elicits Concise Reasoning in Large Language Models. 25127-25152 - Yinlong Xu, Yanzhao Zheng, Shuoshuo Sun, Shuaihan Huang, Baohua Dong, Hangcheng Zhu, Ruohui Huang, Gang Yu, Hongxia Xu, Jian Wu:
Reason from Future: Reverse Thought Chain Enhances LLM Reasoning. 25153-25166 - Marcus Tantakoun, Christian Muise, Xiaodan Zhu:
LLMs as Planning Formalizers: A Survey for Leveraging Large Language Models to Construct Automated Planning Models. 25167-25188 - Elham Aghakhani, Lu Wang, Karla T. Washington, George Demiris, Jina Huh-Yoo, Rezvaneh Rezapour:
From Conversation to Automation: Leveraging LLMs for Problem-Solving Therapy Analysis. 25189-25207 - Yiwei Li, Ji Zhang, Shaoxiong Feng, Peiwen Yuan, Xinglin Wang, Jiayi Shi, Yueqi Zhang, Chuyi Tan, Boyuan Pan, Yao Hu, Kan Li:
Revisiting Self-Consistency from Dynamic Distributional Alignment Perspective on Answer Aggregation. 25208-25223 - Yukai Zhou, Jian Lou, Zhijie Huang, Zhan Qin, Sibei Yang, Wenjie Wang:
Don't Say No: Jailbreaking LLM by Suppressing Refusal. 25224-25249 - Settaluri Lakshmi Sravanthi, Ankit Mishra, Debjyoti Mondal, Subhadarshi Panda, Rituraj Singh, Pushpak Bhattacharyya:
From Perception to Reasoning: Enhancing Vision-Language Models for Mobile UI Understanding. 25250-25269 - Sarah Ruth Brogden Payne, Jordan Kodner:
Lemmas Matter, But Not Like That: Predictors of Lemma-Based Generalization in Morphological Inflection. 25270-25286 - Ming Li, Pei Chen, Chenguang Wang, Hongyu Zhao, Yijun Liang, Yupeng Hou, Fuxiao Liu, Tianyi Zhou:
Mosaic-IT: Cost-Free Compositional Data Synthesis for Instruction Tuning. 25287-25318 - Yucheng Zhou, Lingran Song, Jianbing Shen:
MAM: Modular Multi-Agent Framework for Multi-Modal Medical Diagnosis via Role-Specialized Collaboration. 25319-25333 - Zhixun Chen, Ming Li, Yuxuan Huang, Yali Du, Meng Fang, Tianyi Zhou:
ATLAS: Agent Tuning via Learning Critical Steps. 25334-25349 - Vicky Xefteri, Tim Vieira, Ryan Cotterell, Afra Amini:
Syntactic Control of Language Models by Posterior Inference. 25350-25365 - Yuetai Li, Xiang Yue, Zhangchen Xu, Fengqing Jiang, Luyao Niu, Bill Yuchen Lin, Bhaskar Ramasubramanian, Radha Poovendran:
Small Models Struggle to Learn from Strong Reasoners. 25366-25394 - Barrett Martin Lattimer, Varun Prashant Gangal, Ryan McDonald, Yi Yang:
Sparse Rewards Can Self-Train Dialogue Agents. 25395-25413 - Shoumik Saha, Soheil Feizi:
Almost AI, Almost Human: The Challenge of Detecting AI-Polished Writing. 25414-25431 - Guillermo Marco, Julio Gonzalo, Víctor Fresno:
The Reader is the Metric: How Textual Features and Reader Profiles Explain Conflicting Evaluations of AI Creative Writing. 25432-25449 - Anguo Li, Lei Yu:
Summary Factual Inconsistency Detection Based on LLMs Enhanced by Universal Information Extraction. 25450-25465 - Brihi Joshi, Keyu He, Sahana Ramnath, Sadra Sabouri, Kaitlyn Zhou, Souti Chattopadhyay, Swabha Swayamdipta, Xiang Ren:
ELI-Why: Evaluating the Pedagogical Utility of Language Model Explanations. 25466-25499 - Xiaoyue Wang, Xin Liu:
Beyond Generation: Leveraging LLM Creativity to Overcome Label Bias in Classification. 25500-25506 - Xintong Wang, Jingheng Pan, Liang Ding, Longyue Wang, Longqin Jiang, Xingshan Li, Chris Biemann:
CogSteer: Cognition-Inspired Selective Layer Intervention for Efficiently Steering Large Language Models. 25507-25522 - Aaditya Bodke, Avinoor Singh Kohli, Hemant Subhash Pardeshi, Prathamesh Bhosale:
PASTEL : Polarity-Aware Sentiment Triplet Extraction with LLM-as-a-Judge. 25523-25533 - Vincent Siu, Nicholas Crispino, Zihao Yu, Sam Pan, Zhun Wang, Yang Liu, Dawn Song, Chenguang Wang:
COSMIC: Generalized Refusal Direction Identification in LLM Activations. 25534-25553 - Yifan Jiang, Kriti Aggarwal, Tanmay Laud, Kashif Munir, Jay Pujara, Subhabrata Mukherjee:
Red Queen: Exposing Latent Multi-Turn Risks in Large Language Models. 25554-25591 - Joseph J. Peper, Wenzhao Qiu, Ali Payani, Lu Wang:
MDBench: A Synthetic Multi-Document Reasoning Benchmark Generated with Knowledge Guidance. 25592-25621 - Weijieying Ren, Tianxiang Zhao, Lei Wang, Tianchun Wang, Vasant G. Honavar:
DiaLLMs: EHR-Enhanced Clinical Conversational System for Clinical Test Recommendation and Diagnosis Prediction. 25622-25635 - Lingjun Zhao, Mingyang Xie, Paola Cascante-Bonilla, Hal Daumé III, Kwonjoon Lee:
Can Hallucination Correction Improve Video-Language Alignment? 25636-25646 - Yusuke Sakai, Takumi Goto, Taro Watanabe:
IMPARA-GED: Grammatical Error Detection is Boosting Reference-free Grammatical Error Quality Estimator. 25647-25654 - Chenjun Xu, Bingbing Wen, Bin Han, Robert Wolfe, Lucy Lu Wang, Bill Howe:
Do Language Models Mirror Human Confidence? Exploring Psychological Insights to Address Overconfidence in LLMs. 25655-25672 - Yongsen Zheng, Zongxuan Xie, Guohua Wang, Ziyao Liu, Liang Lin, Kwok-Yan Lam:
Why Multi-Interest Fairness Matters: Hypergraph Contrastive Multi-Interest Learning for Fair Conversational Recommender System. 25673-25684 - Yizhou Wang, Lingzhi Zhang, Yue Bai, Mang Tik Chiu, Zhengmian Hu, Mingyuan Zhang, Qihua Dong, Yu Yin, Sohrab Amirghodsi, Yun Fu:
Cautious Next Token Prediction. 25685-25697 - Haoyu Han, Yaochen Xie, Hui Liu, Xianfeng Tang, Sreyashi Nag, William Headden, Yang Li, Chen Luo, Shuiwang Ji, Qi He, Jiliang Tang:
Reasoning with Graphs: Structuring Implicit Knowledge to Enhance LLMs Reasoning. 25698-25714 - Hongda Sun, Jiaren Peng, Wenzhong Yang, Liang He, Bo Du, Rui Yan:
Enhancing Medical Dialogue Generation through Knowledge Refinement and Dynamic Prompt Adjustment. 25715-25726 - Kristian Kuznetsov, Laida Kushnareva, Anton Razzhigaev, Polina Druzhinina, Anastasia Voznyuk, Irina Piontkovskaya, Evgeny Burnaev, Serguei Barannikov:
Feature-Level Insights into Artificial Text Detection with Sparse Autoencoders. 25727-25748 - Frank Palma Gomez, Alla Rozovskaya:
Low-Resource Grammatical Error Correction: Selective Data Augmentation with Round-Trip Machine Translation. 25749-25770 - Hope Schroeder, Deb Roy, Jad Kabbara:
Just Put a Human in the Loop? Investigating LLM-Assisted Annotation for Subjective Tasks. 25771-25795 - Bertram Højer, Terne Sasha Thorn Jakobsen, Anna Rogers, Stefan Heinrich:
Research Community Perspectives on "Intelligence" and Large Language Models. 25796-25812 - Sina J. Semnani, Pingyue Zhang, Wanyue Zhai, Haozhuo Li, Ryan Beauchamp, Trey Billing, Katayoun Kishi, Manling Li, Monica S. Lam:
LEMONADE: A Large Multilingual Expert-Annotated Abstractive Event Dataset for the Real World. 25813-25852 - Aochong Oliver Li, Tanya Goyal:
Memorization vs. Reasoning: Updating LLMs with New Knowledge. 25853-25874 - Sandeep Kumar, Abhijit A. Nargund, Vivek Sridhar:
CourtEval: A Courtroom-Based Multi-Agent Evaluation Framework. 25875-25887 - Edison Marrese-Taylor, Erica K. Shimomoto, Alfredo Solano, Enrique Reid:
Multilingual Definition Modeling. 25888-25906 - Tiffany Zhu, Iain Weissburg, Kexun Zhang, William Yang Wang:
Human Bias in the Face of AI: Examining Human Judgment Against Text Labeled as AI Generated. 25907-25914 - Hayato Tsukagoshi, Ryohei Sasano:
Redundancy, Isotropy, and Intrinsic Dimensionality of Prompt-based Text Embeddings. 25915-25930 - Samuel S. Sohn, Sten Knutsen, Karin Stromswold:
Harnessing Whisper for Prosodic Stress Analysis. 25931-25942 - Minju Kim, Dongje Yoo, Yeonjun Hwang, Minseok Kang, Namyoung Kim, Minju Gwak, Beong-woo Kwak, Hyungjoo Chae, Harim Kim, Yunjoong Lee, Min Hee Kim, Dayi Jung, Kyong-Mee Chung, Jinyoung Yeo:
Can You Share Your Story? Modeling Clients' Metacognition and Openness for LLM Therapist Evaluation. 25943-25962 - Haruki Sakajo, Yusuke Ide, Justin Vasselli, Yusuke Sakai, Yingtao Tian, Hidetaka Kamigaito, Taro Watanabe:
Dictionaries to the Rescue: Cross-Lingual Vocabulary Transfer for Low-Resource Languages Using Bilingual Dictionaries. 25963-25976 - Dayoon Ko, Jinyoung Kim, Sohyeon Kim, Jinhyuk Kim, Jaehoon Lee, Seonghak Song, Minyoung Lee, Gunhee Kim:
When Should Dense Retrievers Be Updated in Evolving Corpora? Detecting Out-of-Distribution Corpora Using GradNormIR. 25977-25996 - Abraham Israeli, Shuai Liu, Jonathan May, David Jurgens:
The Million Authors Corpus: A Cross-Lingual and Cross-Domain Wikipedia Dataset for Authorship Verification. 25997-26017 - Seungwoo Choi, Gahyun Yoo, Jay-Yoon Lee:
BridG MT: Enhancing LLMs' Machine Translation Capabilities with Sentence Bridging and Gradual MT. 26018-26042 - Mengkang Hu, Tianxing Chen, Yude Zou, Yuheng Lei, Qiguang Chen, Ming Li, Yao Mu, Hongyuan Zhang, Wenqi Shao, Ping Luo:
Text2World: Benchmarking Large Language Models for Symbolic World Model Generation. 26043-26066 - Kyusik Kim, Jeongwoo Ryu, Hyeonseok Jeon, Bongwon Suh:
Blinded by Context: Unveiling the Halo Effect of MLLM in AI Hiring. 26067-26113 - Boxuan Zhang, Ruqi Zhang:
CoT-UQ: Improving Response-wise Uncertainty Quantification in LLMs with Chain-of-Thought. 26114-26133 - Sam Lin, Wenyue Hua, Lingyao Li, Zhenting Wang, Yongfeng Zhang:
ADO: Automatic Data Optimization for Inputs in LLM Prompts. 26134-26146 - Wonje Jeung, Dongjae Jeon, Ashkan Yousefpour, Jonghyun Choi:
Large Language Models Still Exhibit Bias in Long Text. 26147-26169 - Qiyue Gao, Xinyu Pi, Kevin Liu, Junrong Chen, Ruolan Yang, Xinqi Huang, Xinyu Fang, Lu Sun, Gautham Kishore, Bo Ai, Stone Tao, Mengyang Liu, Jiaxi Yang, Chao-Jung Lai, Chuanyang Jin, Jiannan Xiang, Benhao Huang, Zeming Chen, David Danks, Hao Su, Tianmin Shu, Ziqiao Ma, Lianhui Qin, Zhiting Hu:
Do Vision-Language Models Have Internal World Models? Towards an Atomic Evaluation. 26170-26195 - Ivoline C. Ngong, Swanand Ravindra Kadhe, Hao Wang, Keerthiram Murugesan, Justin D. Weisz, Amit Dhurandhar, Karthikeyan Natesan Ramamurthy:
Protecting Users From Themselves: Safeguarding Contextual Privacy in Interactions with Conversational Agents. 26196-26220 - Ke Ji, Yixin Lian, Linxu Li, Jingsheng Gao, Weiyuan Li, Bin Dai:
Enhancing Persona Consistency for LLMs' Role-Playing using Persona-Aware Contrastive Learning. 26221-26238 - Mingyang Zhou, Lingyu Zhang, Sophia Horng, Maximillian Chen, Kung-Hsiang Huang, Shih-Fu Chang:
M²-TabFact: Multi-Document Multi-Modal Fact Verification with Visual and Textual Representations of Tabular Data. 26239-26256 - Maximilian Holsman, Yukun Huang, Bhuwan Dhingra:
Fuzzy Speculative Decoding for a Tunable Accuracy-Runtime Tradeoff. 26257-26273 - Wei Fang, Yang Zhang, Kaizhi Qian, James R. Glass, Yada Zhu:
PLAY2PROMPT: Zero-shot Tool Instruction Optimization for LLM Agents via Tool Play. 26274-26290 - Romain Puech, Jakub Macina, Julia Chatain, Mrinmaya Sachan, Manu Kapur:
Towards the Pedagogical Steering of Large Language Models for Tutoring: A Case Study with Modeling Productive Failure. 26291-26311 - Jisu Shin, Juhyun Oh, Eunsu Kim, Hoyun Song, Alice Oh:
Spotting Out-of-Character Behavior: Atomic-Level Evaluation of Persona Fidelity in Open-Ended Generation. 26312-26332 - Chengzhi Zhong, Qianying Liu, Fei Cheng, Junfeng Jiang, Zhen Wan, Chenhui Chu, Yugo Murawaki, Sadao Kurohashi:
What Language Do Non-English-Centric Large Language Models Think in? 26333-26346 - Itamar Trainin, Omri Abend:
T⁵Score: A Methodology for Automatically Assessing the Quality of LLM Generated Multi-Document Topic Sets. 26347-26375 - Hakyung Lee, Subeen Park, Joowang Kim, Sungjun Lim, Kyungwoo Song:
Uncertainty-Aware Contrastive Decoding. 26376-26391 - Run Lin, Yao Liu, Yanglei Gan, Yuxiang Cai, Tian Lan, Qiao Liu:
GEMS: Generation-Based Event Argument Extraction via Multi-perspective Prompts and Ontology Steering. 26392-26409 - Alan Saji, Jaavid Aktar Husain, Thanmay Jayakumar, Raj Dabre, Anoop Kunchukuttan, Ratish Puduppully:
RomanLens: The Role Of Latent Romanization In Multilinguality In LLMs. 26410-26429 - Qianying Liu, Katrina Qiyao Wang, Fei Cheng, Sadao Kurohashi:
7 Points to Tsinghua but 10 Points to 清华? Assessing Large Language Models in Agentic Multilingual National Bias. 26430-26442 - Jiabei Chen, Guang Liu, Shizhu He, Kun Luo, Yao Xu, Jun Zhao, Kang Liu:
Search-in-Context: Efficient Multi-Hop QA over Long Contexts via Monte Carlo Tree Search with Dynamic KV Retrieval. 26443-26455 - Eunsu Kim, Juyoung Suk, Seungone Kim, Niklas Muennighoff, Dongkwan Kim, Alice Oh:
LLM-as-an-Interviewer: Beyond Static Testing Through Dynamic LLM Evaluation. 26456-26493 - Xinjie Zhang, Wenxuan Wang, Qin Jin:
IntentionESC: An Intention-Centered Framework for Enhancing Emotional Support in Dialogue Systems. 26494-26516 - Gerard Christopher Yeo, Kokil Jaidka:
Beyond Context to Cognitive Appraisal: Emotion Reasoning as a Theory of Mind Benchmark for Large Language Models. 26517-26525 - Mst. Fahmida Sultana Naznin, Adnan Ibney Faruq, Mostafa Rifat Tazwar, Md Jobayer, Md. Mehedi Hasan Shawon, Md. Rakibul Hasan:
CSTRL: Context-Driven Sequential Transfer Learning for Abstractive Radiology Report Summarization. 26526-26537 - Xinyi Yang, Runzhe Zhan, Shu Yang, Junchao Wu, Lidia S. Chao, Derek F. Wong:
Rethinking Prompt-based Debiasing in Large Language Model. 26538-26553 - Dohyun Lee, Seungil Chad Lee, Chanwoo Yang, Yujin Baek, Jaegul Choo:
Exploring In-context Example Generation for Machine Translation. 26554-26568 - Jinheon Baek, Horst Samulowitz, Oktie Hassanzadeh, Dharmashankar Subramanian, Sola Shirai, Alfio Gliozzo, Debarun Bhattacharjya:
Knowledge Base Construction for Knowledge-Augmented Text-to-SQL. 26569-26583 - Xuye Liu, Tengfei Ma, Yimu Wang, Fengjie Wang, Jian Zhao:
NBDESCRIB: A Dataset for Text Description Generation from Tables and Code in Jupyter Notebooks with Guidelines. 26584-26606 - Yeonseok Jeong, Jinsu Kim, Dohyeon Lee, Seung-won Hwang:
ECoRAG: Evidentiality-guided Compression for Long Context RAG. 26607-26628 - Jivitesh Jain, Nivedhitha Dhanasekaran, Mona T. Diab:
From Complexity to Clarity: AI/NLP's Role in Regulatory Compliance. 26629-26641 - Hyunjong Kim, Sangyeop Kim, Jongheon Jeong, Yeongjae Cho, Sungzoon Cho:
EXPERT: An Explainable Image Captioning Evaluation Metric with Structured Explanations. 26642-26657 - Eitan Wagner, Nitay Alon, Joseph M. Barnby, Omri Abend:
Mind Your Theory: Theory of Mind Goes Deeper Than Reasoning. 26658-26668 - Yen-Shan Chen, Jing Jin, Peng-Ting Kuo, Chao-Wei Huang, Yun-Nung Chen:
LLMs are Biased Evaluators But Not Biased for Fact-Centric Retrieval Augmented Generation. 26669-26684 - Anya Belz, Simon Mille, Craig Thomson:
Standard Quality Criteria Derived from Current NLP Evaluations for Guiding Evaluation Design and Grounding Comparability and AI Compliance Assessments. 26685-26715 - Marek Suppa, Andrej Ridzik, Daniel Hládek, Tomas Javurek, Viktoria Ondrejova, Kristína Sásiková, Martin Tamajka, Marián Simko:
skLEP: A Slovak General Language Understanding Benchmark. 26716-26743 - Hyundong Justin Cho, Spencer Lin, Tejas Srinivasan, Michael Saxon, Deuksin Kwon, Natali T. Chavez, Jonathan May:
Can Vision Language Models Understand Mimed Actions? 26744-26759 - Tianshu Yu, Chao Xiang, Mingchuan Yang, Pei Ke, Bosi Wen, Cunxiang Wang, Jiale Cheng, Li Zhang, Xinyu Mu, Chuxiong Sun, Minlie Huang:
Training Language Model to Critique for Better Refinement. 26760-26804 - Peiyi Zhang, Richong Zhang, Zhijie Nie, Ziqiao Wang:
Dynamic Task Vector Grouping for Efficient Multi-Task Prompt Tuning. 26805-26821 - Kyochul Jang, Donghyeon Lee, Kyusik Kim, Dongseok Heo, Taewhoo Lee, Woojeong Kim, Bongwon Suh:
DICE-BENCH: Evaluating the Tool-Use Capabilities of Large Language Models in Multi-Round, Multi-Party Dialogues. 26822-26846 - Jinyu Guo, Xunlei Chen, Qiyang Xia, Zhaokun Wang, Jie Ou, Libo Qin, Shunyu Yao, Wenhong Tian:
HASH-RAG: Bridging Deep Hashing with Retriever for Efficient, Fine Retrieval and Augmented Generation. 26847-26858 - Hannan Cao, Hwee Tou Ng:
A Constrained Text Revision Agent via Iterative Planning and Searching. 26859-26882 - Gio Paik, Geewook Kim, Jinbae Im:
MMRefine: Unveiling the Obstacles to Robust Refinement in Multimodal Large Language Models. 26883-26904 - Amir Hossein Kargaran, Yihong Liu, François Yvon, Hinrich Schütze:
How Programming Concepts and Neurons Are Shared in Code Language Models. 26905-26917 - Qian Lin, Junyi Li, Hwee Tou Ng:
DynaQuest: A Dynamic Question Answering Dataset Reflecting Real-World Knowledge Updates. 26918-26936 - Ekaterina Grishina, Mikhail Gorbunov, Maxim Rakhuba:
ProcrustesGPT: Compressing LLMs with Structured Matrices and Orthogonal Transformations. 26937-26949 - Jinheon Baek, Sun Jae Lee, Prakhar Gupta, Geunseob Oh, Siddharth Dalmia, Prateek Kolhar:
Revisiting In-Context Learning with Long Context Language Models. 26950-26966 - Hannan Cao, Hai Ye, Hwee Tou Ng:
Rationalize and Align: Enhancing Writing Assistance with Rationale via Self-Training for Improved Alignment. 26967-26982 - Jie Ou, Jinyu Guo, Shuaihong Jiang, Zhaokun Wang, Libo Qin, Shunyu Yao, Wenhong Tian:
Accelerating Adaptive Retrieval Augmented Generation via Instruction-Driven Representation Reduction of Retrieval Overlaps. 26983-27000 - Amir Hossein Kargaran, Ali Modarressi, Nafiseh Nikeghbal, Jana Diesner, François Yvon, Hinrich Schütze:
MEXA: Multilingual Evaluation of English-Centric LLMs via Cross-Lingual Alignment. 27001-27023 - Zhanhao Xie, Yuexiao Ma, Xiawu Zheng, Fei Chao, Wanchen Sui, Yong Li, Shen Li, Rongrong Ji:
Automated Fine-Grained Mixture-of-Experts Quantization. 27024-27037 - Hongjun Jeong, Minji Kim, Heesoo Jung, Ko Keun Kim, Hogun Park:
Enhancing Complex Reasoning in Knowledge Graph Question Answering through Query Graph Approximation. 27038-27056
