Paul Röttger
2020 – today
2024
- [i16] Xinpeng Wang, Bolei Ma, Chengzhi Hu, Leon Weber-Genzel, Paul Röttger, Frauke Kreuter, Dirk Hovy, Barbara Plank:
  "My Answer is C": First-Token Probabilities Do Not Match Text Answers in Instruction-Tuned Language Models. CoRR abs/2402.14499 (2024)
- [i15] Paul Röttger, Valentin Hofmann, Valentina Pyatkin, Musashi Hinck, Hannah Rose Kirk, Hinrich Schütze, Dirk Hovy:
  Political Compass or Spinning Arrow? Towards More Meaningful Evaluations for Values and Opinions in Large Language Models. CoRR abs/2402.16786 (2024)

2023
- [c9] Matthias Orlikowski, Paul Röttger, Philipp Cimiano, Dirk Hovy:
  The Ecological Fallacy in Annotation: Modeling Human Label Variation goes beyond Sociodemographics. ACL (2) 2023: 1017-1029
- [c8] Janosch Haber, Bertie Vidgen, Matthew Chapman, Vibhor Agarwal, Roy Ka-Wei Lee, Yong Keong Yap, Paul Röttger:
  Improving the Detection of Multilingual Online Attacks with Rich Social Media Data from Singapore. ACL (1) 2023: 12705-12721
- [c7] Hannah Kirk, Andrew M. Bean, Bertie Vidgen, Paul Röttger, Scott A. Hale:
  The Past, Present and Better Future of Feedback Learning in Large Language Models for Subjective Human Preferences and Values. EMNLP 2023: 2409-2430
- [c6] Hannah Kirk, Wenjie Yin, Bertie Vidgen, Paul Röttger:
  SemEval-2023 Task 10: Explainable Detection of Online Sexism. SemEval@ACL 2023: 2193-2210
- [i14] Hannah Rose Kirk, Wenjie Yin, Bertie Vidgen, Paul Röttger:
  SemEval-2023 Task 10: Explainable Detection of Online Sexism. CoRR abs/2303.04222 (2023)
- [i13] Hannah Rose Kirk, Bertie Vidgen, Paul Röttger, Scott A. Hale:
  Personalisation within bounds: A risk taxonomy and policy framework for the alignment of large language models with personalised feedback. CoRR abs/2303.05453 (2023)
- [i12] Matthias Orlikowski, Paul Röttger, Philipp Cimiano, Dirk Hovy:
  The Ecological Fallacy in Annotation: Modelling Human Label Variation goes beyond Sociodemographics. CoRR abs/2306.11559 (2023)
- [i11] Paul Röttger, Hannah Rose Kirk, Bertie Vidgen, Giuseppe Attanasio, Federico Bianchi, Dirk Hovy:
  XSTest: A Test Suite for Identifying Exaggerated Safety Behaviours in Large Language Models. CoRR abs/2308.01263 (2023)
- [i10] Federico Bianchi, Mirac Suzgun, Giuseppe Attanasio, Paul Röttger, Dan Jurafsky, Tatsunori Hashimoto, James Zou:
  Safety-Tuned LLaMAs: Lessons From Improving the Safety of Large Language Models that Follow Instructions. CoRR abs/2309.07875 (2023)
- [i9] Hannah Rose Kirk, Bertie Vidgen, Paul Röttger, Scott A. Hale:
  The Empty Signifier Problem: Towards Clearer Paradigms for Operationalising "Alignment" in Large Language Models. CoRR abs/2310.02457 (2023)
- [i8] Hannah Rose Kirk, Andrew M. Bean, Bertie Vidgen, Paul Röttger, Scott A. Hale:
  The Past, Present and Better Future of Feedback Learning in Large Language Models for Subjective Human Preferences and Values. CoRR abs/2310.07629 (2023)
- [i7] Bertie Vidgen, Hannah Rose Kirk, Rebecca Qian, Nino Scherrer, Anand Kannappan, Scott A. Hale, Paul Röttger:
  SimpleSafetyTests: a Test Suite for Identifying Critical Safety Risks in Large Language Models. CoRR abs/2311.08370 (2023)

2022
- [c5] Paul Röttger, Debora Nozza, Federico Bianchi, Dirk Hovy:
  Data-Efficient Strategies for Expanding Hate Speech Detection into Under-Resourced Languages. EMNLP 2022: 5674-5691
- [c4] Paul Röttger, Bertie Vidgen, Dirk Hovy, Janet B. Pierrehumbert:
  Two Contrasting Data Annotation Paradigms for Subjective NLP Tasks. NAACL-HLT 2022: 175-190
- [c3] Hannah Kirk, Bertie Vidgen, Paul Röttger, Tristan Thrush, Scott A. Hale:
  Hatemoji: A Test Suite and Adversarially-Generated Dataset for Benchmarking and Detecting Emoji-Based Hate. NAACL-HLT 2022: 1352-1368
- [i6] Paul Röttger, Haitham Seelawi, Debora Nozza, Zeerak Talat, Bertie Vidgen:
  Multilingual HateCheck: Functional Tests for Multilingual Hate Speech Detection Models. CoRR abs/2206.09917 (2022)
- [i5] Paul Röttger, Debora Nozza, Federico Bianchi, Dirk Hovy:
  Data-Efficient Strategies for Expanding Hate Speech Detection into Under-Resourced Languages. CoRR abs/2210.11359 (2022)

2021
- [c2] Paul Röttger, Bertie Vidgen, Dong Nguyen, Zeerak Waseem, Helen Z. Margetts, Janet B. Pierrehumbert:
  HateCheck: Functional Tests for Hate Speech Detection Models. ACL/IJCNLP (1) 2021: 41-58
- [c1] Paul Röttger, Janet B. Pierrehumbert:
  Temporal Adaptation of BERT and Performance on Downstream Document Classification: Insights from Social Media. EMNLP (Findings) 2021: 2400-2412
- [i4] Paul Röttger, Janet B. Pierrehumbert:
  Temporal Adaptation of BERT and Performance on Downstream Document Classification: Insights from Social Media. CoRR abs/2104.08116 (2021)
- [i3] Hannah Rose Kirk, Bertram Vidgen, Paul Röttger, Tristan Thrush, Scott A. Hale:
  Hatemoji: A Test Suite and Adversarially-Generated Dataset for Benchmarking and Detecting Emoji-based Hate. CoRR abs/2108.05921 (2021)
- [i2] Paul Röttger, Bertie Vidgen, Dirk Hovy, Janet B. Pierrehumbert:
  Two Contrasting Data Annotation Paradigms for Subjective NLP Tasks. CoRR abs/2112.07475 (2021)

2020
- [i1] Paul Röttger, Bertram Vidgen, Dong Nguyen, Zeerak Waseem, Helen Z. Margetts, Janet B. Pierrehumbert:
  HateCheck: Functional Tests for Hate Speech Detection Models. CoRR abs/2012.15606 (2020)

Coauthor Index
- Bertie Vidgen (aka: Bertram Vidgen)
last updated on 2024-03-27 00:26 CET by the dblp team
all metadata released as open data under CC0 1.0 license