Paul Röttger

2020 – today

2024
[j1]
  persistent URL:
  - https://dblp.org/rec/journals/natmi/KirkVRH24
Hannah Rose Kirk, Bertie Vidgen, Paul Röttger, Scott A. Hale:
The benefits, risks and bounds of personalizing the alignment of large language models to individuals. Nat. Mac. Intell. 6(4): 383-392 (2024)
[c12]
  persistent URL:
  - https://dblp.org/rec/conf/acl/HoltermannRDL24
Carolin Holtermann, Paul Röttger, Timm Dill, Anne Lauscher:
Evaluating the Elementary Multilingual Capabilities of Large Language Models with MultiQ. ACL (Findings) 2024: 4476-4494
[c11]
  persistent URL:
  - https://dblp.org/rec/conf/acl/0003MHWRKHP24
Xinpeng Wang, Bolei Ma, Chengzhi Hu, Leon Weber-Genzel, Paul Röttger, Frauke Kreuter, Dirk Hovy, Barbara Plank:
"My Answer is C": First-Token Probabilities Do Not Match Text Answers in Instruction-Tuned Language Models. ACL (Findings) 2024: 7407-7416
[c10]
  persistent URL:
  - https://dblp.org/rec/conf/iclr/0001SARJH024
Federico Bianchi, Mirac Suzgun, Giuseppe Attanasio, Paul Röttger, Dan Jurafsky, Tatsunori Hashimoto, James Zou:
Safety-Tuned LLaMAs: Lessons From Improving the Safety of Large Language Models that Follow Instructions. ICLR 2024
[i26]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2402-14499
Xinpeng Wang, Bolei Ma, Chengzhi Hu, Leon Weber-Genzel, Paul Röttger, Frauke Kreuter, Dirk Hovy, Barbara Plank:
"My Answer is C": First-Token Probabilities Do Not Match Text Answers in Instruction-Tuned Language Models. CoRR abs/2402.14499 (2024)
[i25]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2402-16786
Paul Röttger, Valentin Hofmann, Valentina Pyatkin, Musashi Hinck, Hannah Rose Kirk, Hinrich Schütze, Dirk Hovy:
Political Compass or Spinning Arrow? Towards More Meaningful Evaluations for Values and Opinions in Large Language Models. CoRR abs/2402.16786 (2024)
[i24]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2403-03814
Carolin Holtermann, Paul Röttger, Timm Dill, Anne Lauscher:
Evaluating the Elementary Multilingual Capabilities of Large Language Models with MultiQ. CoRR abs/2403.03814 (2024)
[i23]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2403-19559
Janis Goldzycher, Paul Röttger, Gerold Schneider:
Improving Adversarial Data Collection by Supporting Annotators: Lessons from GAHD, a German Hate Speech Dataset. CoRR abs/2403.19559 (2024)
[i22]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2404-05399
Paul Röttger, Fabio Pernisi, Bertie Vidgen, Dirk Hovy:
SafetyPrompts: a Systematic Review of Open Datasets for Evaluating and Improving Large Language Model Safety. CoRR abs/2404.05399 (2024)
[i21]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2404-08382
Xinpeng Wang, Chengzhi Hu, Bolei Ma, Paul Röttger, Barbara Plank:
Look at the Text: Instruction-Tuned Language Models are More Robust Multiple Choice Selectors than You Think. CoRR abs/2404.08382 (2024)
[i20]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2404-12241
Bertie Vidgen, Adarsh Agrawal, Ahmed M. Ahmed, Victor Akinwande, Namir Al-Nuaimi, Najla Alfaraj, Elie Alhajjar, Lora Aroyo, Trupti Bavalatti, Borhane Blili-Hamelin, Kurt D. Bollacker, Rishi Bomassani, Marisa Ferrara Boston, Siméon Campos, Kal Chakra, Canyu Chen, Cody Coleman, Zacharie Delpierre Coudert, Leon Derczynski, Debojyoti Dutta, Ian Eisenberg, James Ezick, Heather Frase, Brian Fuller, Ram Gandikota, Agasthya Gangavarapu, Ananya Gangavarapu, James Gealy, Rajat Ghosh, James Goel, Usman Gohar, Subhra S. Goswami, Scott A. Hale, Wiebke Hutiri, Joseph Marvin Imperial, Surgan Jandial, Nick Judd, Felix Juefei-Xu, Foutse Khomh, Bhavya Kailkhura, Hannah Rose Kirk, Kevin Klyman, Chris Knotz, Michael Kuchnik, Shachi H. Kumar, Chris Lengerich, Bo Li, Zeyi Liao, Eileen Peters Long, Victor Lu, Yifan Mai, Priyanka Mary Mammen, Kelvin Manyeki, Sean McGregor, Virendra Mehta, Shafee Mohammed, Emanuel Moss, Lama Nachman, Dinesh Jinenhally Naganna, Amin Nikanjam, Besmira Nushi, Luis Oala, Iftach Orr, Alicia Parrish, Cigdem Patlak, William Pietri, Forough Poursabzi-Sangdeh, Eleonora Presani, Fabrizio Puletti, Paul Röttger, Saurav Sahay, Tim Santos, Nino Scherrer, Alice Schoenauer Sebag, Patrick Schramowski, Abolfazl Shahbazi, Vin Sharma, Xudong Shen, Vamsi Sistla, Leonard Tang, Davide Testuggine, Vithursan Thangarasa, Elizabeth Anne Watkins, Rebecca Weiss, Chris Welty, Tyler Wilbers, Adina Williams, Carole-Jean Wu, Poonam Yadav, Xianjun Yang, Yi Zeng, Wenhui Zhang, Fedor Zhdanov, Jiacheng Zhu, Percy Liang, Peter Mattson, Joaquin Vanschoren:
Introducing v0.5 of the AI Safety Benchmark from MLCommons. CoRR abs/2404.12241 (2024)
[i19]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2404-16019
Hannah Rose Kirk, Alexander Whitefield, Paul Röttger, Andrew M. Bean, Katerina Margatina, Juan Ciro, Rafael Mosquera, Max Bartolo, Adina Williams, He He, Bertie Vidgen, Scott A. Hale:
The PRISM Alignment Project: What Participatory, Representative and Individualised Human Feedback Reveals About the Subjective and Multicultural Alignment of Large Language Models. CoRR abs/2404.16019 (2024)
[i18]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2404-17047
Francisco Eiras, Aleksandar Petrov, Bertie Vidgen, Christian Schröder de Witt, Fabio Pizzati, Katherine Elkins, Supratik Mukhopadhyay, Adel Bibi, Botos Csaba, Fabro Steibel, Fazl Barez, Genevieve Smith, Gianluca Guadagni, Jon Chun, Jordi Cabot, Joseph Marvin Imperial, Juan A. Nolazco-Flores, Lori Landay, Matthew Jackson, Paul Röttger, Philip H. S. Torr, Trevor Darrell, Yong Suk Lee, Jakob N. Foerster:
Near to Mid-term Risks and Opportunities of Open Source Generative AI. CoRR abs/2404.17047 (2024)
[i17]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2404-17874
Manuel Tonneau, Diyi Liu, Samuel Fraiberger, Ralph Schroeder, Scott A. Hale, Paul Röttger:
From Languages to Geographies: Towards Evaluating Cultural Bias in Hate Speech Datasets. CoRR abs/2404.17874 (2024)
[i16]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2405-09482
Donya Rooein, Paul Röttger, Anastassia Shaitarova, Dirk Hovy:
Beyond Flesch-Kincaid: Prompt-based Metrics Improve Difficulty Classification of Educational Texts. CoRR abs/2405.09482 (2024)
[i15]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2406-14508
Kobi Hackenburg, Ben M. Tappin, Paul Röttger, Scott Hale, Jonathan Bright, Helen Z. Margetts:
Evidence of a log scaling law for political persuasion with large language models. CoRR abs/2406.14508 (2024)
2023
[c9]
  persistent URL:
  - https://dblp.org/rec/conf/acl/OrlikowskiRCH23
Matthias Orlikowski, Paul Röttger, Philipp Cimiano, Dirk Hovy:
The Ecological Fallacy in Annotation: Modeling Human Label Variation goes beyond Sociodemographics. ACL (2) 2023: 1017-1029
[c8]
  persistent URL:
  - https://dblp.org/rec/conf/acl/HaberVCALYR23
Janosch Haber, Bertie Vidgen, Matthew Chapman, Vibhor Agarwal, Roy Ka-Wei Lee, Yong Keong Yap, Paul Röttger:
Improving the Detection of Multilingual Online Attacks with Rich Social Media Data from Singapore. ACL (1) 2023: 12705-12721
[c7]
  persistent URL:
  - https://dblp.org/rec/conf/emnlp/KirkBVRH23
Hannah Kirk, Andrew M. Bean, Bertie Vidgen, Paul Röttger, Scott A. Hale:
The Past, Present and Better Future of Feedback Learning in Large Language Models for Subjective Human Preferences and Values. EMNLP 2023: 2409-2430
[c6]
  persistent URL:
  - https://dblp.org/rec/conf/semeval/KirkYVR23
Hannah Kirk, Wenjie Yin, Bertie Vidgen, Paul Röttger:
SemEval-2023 Task 10: Explainable Detection of Online Sexism. SemEval@ACL 2023: 2193-2210
[i14]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2303-04222
Hannah Rose Kirk, Wenjie Yin, Bertie Vidgen, Paul Röttger:
SemEval-2023 Task 10: Explainable Detection of Online Sexism. CoRR abs/2303.04222 (2023)
[i13]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2303-05453
Hannah Rose Kirk, Bertie Vidgen, Paul Röttger, Scott A. Hale:
Personalisation within bounds: A risk taxonomy and policy framework for the alignment of large language models with personalised feedback. CoRR abs/2303.05453 (2023)
[i12]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2306-11559
Matthias Orlikowski, Paul Röttger, Philipp Cimiano, Dirk Hovy:
The Ecological Fallacy in Annotation: Modelling Human Label Variation goes beyond Sociodemographics. CoRR abs/2306.11559 (2023)
[i11]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2308-01263
Paul Röttger, Hannah Rose Kirk, Bertie Vidgen, Giuseppe Attanasio, Federico Bianchi, Dirk Hovy:
XSTest: A Test Suite for Identifying Exaggerated Safety Behaviours in Large Language Models. CoRR abs/2308.01263 (2023)
[i10]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2309-07875
Federico Bianchi, Mirac Suzgun, Giuseppe Attanasio, Paul Röttger, Dan Jurafsky, Tatsunori Hashimoto, James Zou:
Safety-Tuned LLaMAs: Lessons From Improving the Safety of Large Language Models that Follow Instructions. CoRR abs/2309.07875 (2023)
[i9]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2310-02457
Hannah Rose Kirk, Bertie Vidgen, Paul Röttger, Scott A. Hale:
The Empty Signifier Problem: Towards Clearer Paradigms for Operationalising "Alignment" in Large Language Models. CoRR abs/2310.02457 (2023)
[i8]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2310-07629
Hannah Rose Kirk, Andrew M. Bean, Bertie Vidgen, Paul Röttger, Scott A. Hale:
The Past, Present and Better Future of Feedback Learning in Large Language Models for Subjective Human Preferences and Values. CoRR abs/2310.07629 (2023)
[i7]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2311-08370
Bertie Vidgen, Hannah Rose Kirk, Rebecca Qian, Nino Scherrer, Anand Kannappan, Scott A. Hale, Paul Röttger:
SimpleSafetyTests: a Test Suite for Identifying Critical Safety Risks in Large Language Models. CoRR abs/2311.08370 (2023)
2022
[c5]
  persistent URL:
  - https://dblp.org/rec/conf/emnlp/RottgerNBH22
Paul Röttger, Debora Nozza, Federico Bianchi, Dirk Hovy:
Data-Efficient Strategies for Expanding Hate Speech Detection into Under-Resourced Languages. EMNLP 2022: 5674-5691
[c4]
  persistent URL:
  - https://dblp.org/rec/conf/naacl/RottgerVHP22
Paul Röttger, Bertie Vidgen, Dirk Hovy, Janet B. Pierrehumbert:
Two Contrasting Data Annotation Paradigms for Subjective NLP Tasks. NAACL-HLT 2022: 175-190
[c3]
  persistent URL:
  - https://dblp.org/rec/conf/naacl/KirkVRTH22
Hannah Kirk, Bertie Vidgen, Paul Röttger, Tristan Thrush, Scott A. Hale:
Hatemoji: A Test Suite and Adversarially-Generated Dataset for Benchmarking and Detecting Emoji-Based Hate. NAACL-HLT 2022: 1352-1368
[i6]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2206-09917
Paul Röttger, Haitham Seelawi, Debora Nozza, Zeerak Talat, Bertie Vidgen:
Multilingual HateCheck: Functional Tests for Multilingual Hate Speech Detection Models. CoRR abs/2206.09917 (2022)
[i5]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2210-11359
Paul Röttger, Debora Nozza, Federico Bianchi, Dirk Hovy:
Data-Efficient Strategies for Expanding Hate Speech Detection into Under-Resourced Languages. CoRR abs/2210.11359 (2022)
2021
[c2]
  persistent URL:
  - https://dblp.org/rec/conf/acl/RottgerV0WMP20
Paul Röttger, Bertie Vidgen, Dong Nguyen, Zeerak Waseem, Helen Z. Margetts, Janet B. Pierrehumbert:
HateCheck: Functional Tests for Hate Speech Detection Models. ACL/IJCNLP (1) 2021: 41-58
[c1]
  persistent URL:
  - https://dblp.org/rec/conf/emnlp/RottgerP21
Paul Röttger, Janet B. Pierrehumbert:
Temporal Adaptation of BERT and Performance on Downstream Document Classification: Insights from Social Media. EMNLP (Findings) 2021: 2400-2412
[i4]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2104-08116
Paul Röttger, Janet B. Pierrehumbert:
Temporal Adaptation of BERT and Performance on Downstream Document Classification: Insights from Social Media. CoRR abs/2104.08116 (2021)
[i3]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2108-05921
Hannah Rose Kirk, Bertram Vidgen, Paul Röttger, Tristan Thrush, Scott A. Hale:
Hatemoji: A Test Suite and Adversarially-Generated Dataset for Benchmarking and Detecting Emoji-based Hate. CoRR abs/2108.05921 (2021)
[i2]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2112-07475
Paul Röttger, Bertie Vidgen, Dirk Hovy, Janet B. Pierrehumbert:
Two Contrasting Data Annotation Paradigms for Subjective NLP Tasks. CoRR abs/2112.07475 (2021)
2020
[i1]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2012-15606
Paul Röttger, Bertram Vidgen, Dong Nguyen, Zeerak Waseem, Helen Z. Margetts, Janet B. Pierrehumbert:
HateCheck: Functional Tests for Hate Speech Detection Models. CoRR abs/2012.15606 (2020)
