Bertie Vidgen
2020 – today
- 2024
- [j3] Hannah Rose Kirk, Bertie Vidgen, Paul Röttger, Scott A. Hale: The benefits, risks and bounds of personalizing the alignment of large language models to individuals. Nat. Mac. Intell. 6(4): 383-392 (2024)
- [c17] Francisco Eiras, Aleksandar Petrov, Bertie Vidgen, Christian Schröder de Witt, Fabio Pizzati, Katherine Elkins, Supratik Mukhopadhyay, Adel Bibi, Botos Csaba, Fabro Steibel, Fazl Barez, Genevieve Smith, Gianluca Guadagni, Jon Chun, Jordi Cabot, Joseph Marvin Imperial, Juan A. Nolazco-Flores, Lori Landay, Matthew Thomas Jackson, Paul Röttger, Philip H. S. Torr, Trevor Darrell, Yong Suk Lee, Jakob N. Foerster: Position: Near to Mid-term Risks and Opportunities of Open-Source Generative AI. ICML 2024
- [c16] Yue Huang, Lichao Sun, Haoran Wang, Siyuan Wu, Qihui Zhang, Yuan Li, Chujie Gao, Yixin Huang, Wenhan Lyu, Yixuan Zhang, Xiner Li, Hanchi Sun, Zhengliang Liu, Yixin Liu, Yijue Wang, Zhikun Zhang, Bertie Vidgen, Bhavya Kailkhura, Caiming Xiong, Chaowei Xiao, Chunyuan Li, Eric P. Xing, Furong Huang, Hao Liu, Heng Ji, Hongyi Wang, Huan Zhang, Huaxiu Yao, Manolis Kellis, Marinka Zitnik, Meng Jiang, Mohit Bansal, James Zou, Jian Pei, Jian Liu, Jianfeng Gao, Jiawei Han, Jieyu Zhao, Jiliang Tang, Jindong Wang, Joaquin Vanschoren, John C. Mitchell, Kai Shu, Kaidi Xu, Kai-Wei Chang, Lifang He, Lifu Huang, Michael Backes, Neil Zhenqiang Gong, Philip S. Yu, Pin-Yu Chen, Quanquan Gu, Ran Xu, Rex Ying, Shuiwang Ji, Suman Jana, Tianlong Chen, Tianming Liu, Tianyi Zhou, William Wang, Xiang Li, Xiangliang Zhang, Xiao Wang, Xing Xie, Xun Chen, Xuyu Wang, Yan Liu, Yanfang Ye, Yinzhi Cao, Yong Chen, Yue Zhao: Position: TrustLLM: Trustworthiness in Large Language Models. ICML 2024
- [c15] Paul Röttger, Hannah Kirk, Bertie Vidgen, Giuseppe Attanasio, Federico Bianchi, Dirk Hovy: XSTest: A Test Suite for Identifying Exaggerated Safety Behaviours in Large Language Models. NAACL-HLT 2024: 5377-5400
- [i32] Paul Röttger, Fabio Pernisi, Bertie Vidgen, Dirk Hovy: SafetyPrompts: a Systematic Review of Open Datasets for Evaluating and Improving Large Language Model Safety. CoRR abs/2404.05399 (2024)
- [i31] Bertie Vidgen, Adarsh Agrawal, Ahmed M. Ahmed, Victor Akinwande, Namir Al-Nuaimi, Najla Alfaraj, Elie Alhajjar, Lora Aroyo, Trupti Bavalatti, Borhane Blili-Hamelin, Kurt D. Bollacker, Rishi Bomassani, Marisa Ferrara Boston, Siméon Campos, Kal Chakra, Canyu Chen, Cody Coleman, Zacharie Delpierre Coudert, Leon Derczynski, Debojyoti Dutta, Ian Eisenberg, James Ezick, Heather Frase, Brian Fuller, Ram Gandikota, Agasthya Gangavarapu, Ananya Gangavarapu, James Gealy, Rajat Ghosh, James Goel, Usman Gohar, Subhra S. Goswami, Scott A. Hale, Wiebke Hutiri, Joseph Marvin Imperial, Surgan Jandial, Nick Judd, Felix Juefei-Xu, Foutse Khomh, Bhavya Kailkhura, Hannah Rose Kirk, Kevin Klyman, Chris Knotz, Michael Kuchnik, Shachi H. Kumar, Chris Lengerich, Bo Li, Zeyi Liao, Eileen Peters Long, Victor Lu, Yifan Mai, Priyanka Mary Mammen, Kelvin Manyeki, Sean McGregor, Virendra Mehta, Shafee Mohammed, Emanuel Moss, Lama Nachman, Dinesh Jinenhally Naganna, Amin Nikanjam, Besmira Nushi, Luis Oala, Iftach Orr, Alicia Parrish, Cigdem Patlak, William Pietri, Forough Poursabzi-Sangdeh, Eleonora Presani, Fabrizio Puletti, Paul Röttger, Saurav Sahay, Tim Santos, Nino Scherrer, Alice Schoenauer Sebag, Patrick Schramowski, Abolfazl Shahbazi, Vin Sharma, Xudong Shen, Vamsi Sistla, Leonard Tang, Davide Testuggine, Vithursan Thangarasa, Elizabeth Anne Watkins, Rebecca Weiss, Chris Welty, Tyler Wilbers, Adina Williams, Carole-Jean Wu, Poonam Yadav, Xianjun Yang, Yi Zeng, Wenhui Zhang, Fedor Zhdanov, Jiacheng Zhu, Percy Liang, Peter Mattson, Joaquin Vanschoren: Introducing v0.5 of the AI Safety Benchmark from MLCommons. CoRR abs/2404.12241 (2024)
- [i30] Hannah Rose Kirk, Alexander Whitefield, Paul Röttger, Andrew M. Bean, Katerina Margatina, Juan Ciro, Rafael Mosquera, Max Bartolo, Adina Williams, He He, Bertie Vidgen, Scott A. Hale: The PRISM Alignment Project: What Participatory, Representative and Individualised Human Feedback Reveals About the Subjective and Multicultural Alignment of Large Language Models. CoRR abs/2404.16019 (2024)
- [i29] Francisco Eiras, Aleksandar Petrov, Bertie Vidgen, Christian Schröder de Witt, Fabio Pizzati, Katherine Elkins, Supratik Mukhopadhyay, Adel Bibi, Botos Csaba, Fabro Steibel, Fazl Barez, Genevieve Smith, Gianluca Guadagni, Jon Chun, Jordi Cabot, Joseph Marvin Imperial, Juan A. Nolazco-Flores, Lori Landay, Matthew Thomas Jackson, Paul Röttger, Philip H. S. Torr, Trevor Darrell, Yong Suk Lee, Jakob N. Foerster: Near to Mid-term Risks and Opportunities of Open Source Generative AI. CoRR abs/2404.17047 (2024)
- [i28] Olly Styles, Sam Miller, Patricio Cerda-Mardini, Tanaya Guha, Victor Sanchez, Bertie Vidgen: WorkBench: a Benchmark Dataset for Agents in a Realistic Workplace Setting. CoRR abs/2405.00823 (2024)
- [i27] Francisco Eiras, Aleksandar Petrov, Bertie Vidgen, Christian Schröder de Witt, Fabio Pizzati, Katherine Elkins, Supratik Mukhopadhyay, Adel Bibi, Aaron Purewal, Botos Csaba, Fabro Steibel, Fazel Keshtkar, Fazl Barez, Genevieve Smith, Gianluca Guadagni, Jon Chun, Jordi Cabot, Joseph Marvin Imperial, Juan Arturo Nolazco, Lori Landay, Matthew Thomas Jackson, Philip H. S. Torr, Trevor Darrell, Yong Suk Lee, Jakob N. Foerster: Risks and Opportunities of Open-Source Generative AI. CoRR abs/2405.08597 (2024)
- [i26] Shayne Longpre, Stella Biderman, Alon Albalak, Hailey Schoelkopf, Daniel McDuff, Sayash Kapoor, Kevin Klyman, Kyle Lo, Gabriel Ilharco, Nay San, Maribeth Rauh, Aviya Skowron, Bertie Vidgen, Laura Weidinger, Arvind Narayanan, Victor Sanh, David Ifeoluwa Adelani, Percy Liang, Rishi Bommasani, Peter Henderson, Sasha Luccioni, Yacine Jernite, Luca Soldaini: The Responsible Foundation Model Development Cheatsheet: A Review of Tools & Resources. CoRR abs/2406.16746 (2024)
- 2023
- [c14] Janosch Haber, Bertie Vidgen, Matthew Chapman, Vibhor Agarwal, Roy Ka-Wei Lee, Yong Keong Yap, Paul Röttger: Improving the Detection of Multilingual Online Attacks with Rich Social Media Data from Singapore. ACL (1) 2023: 12705-12721
- [c13] Hannah Kirk, Andrew M. Bean, Bertie Vidgen, Paul Röttger, Scott A. Hale: The Past, Present and Better Future of Feedback Learning in Large Language Models for Subjective Human Preferences and Values. EMNLP 2023: 2409-2430
- [c12] Hannah Kirk, Wenjie Yin, Bertie Vidgen, Paul Röttger: SemEval-2023 Task 10: Explainable Detection of Online Sexism. SemEval@ACL 2023: 2193-2210
- [i25] Hannah Rose Kirk, Wenjie Yin, Bertie Vidgen, Paul Röttger: SemEval-2023 Task 10: Explainable Detection of Online Sexism. CoRR abs/2303.04222 (2023)
- [i24] Hannah Rose Kirk, Bertie Vidgen, Paul Röttger, Scott A. Hale: Personalisation within bounds: A risk taxonomy and policy framework for the alignment of large language models with personalised feedback. CoRR abs/2303.05453 (2023)
- [i23] Paul Röttger, Hannah Rose Kirk, Bertie Vidgen, Giuseppe Attanasio, Federico Bianchi, Dirk Hovy: XSTest: A Test Suite for Identifying Exaggerated Safety Behaviours in Large Language Models. CoRR abs/2308.01263 (2023)
- [i22] Hannah Rose Kirk, Bertie Vidgen, Paul Röttger, Scott A. Hale: The Empty Signifier Problem: Towards Clearer Paradigms for Operationalising "Alignment" in Large Language Models. CoRR abs/2310.02457 (2023)
- [i21] Hannah Rose Kirk, Andrew M. Bean, Bertie Vidgen, Paul Röttger, Scott A. Hale: The Past, Present and Better Future of Feedback Learning in Large Language Models for Subjective Human Preferences and Values. CoRR abs/2310.07629 (2023)
- [i20] Bertie Vidgen, Hannah Rose Kirk, Rebecca Qian, Nino Scherrer, Anand Kannappan, Scott A. Hale, Paul Röttger: SimpleSafetyTests: a Test Suite for Identifying Critical Safety Risks in Large Language Models. CoRR abs/2311.08370 (2023)
- [i19] Pranab Islam, Anand Kannappan, Douwe Kiela, Rebecca Qian, Nino Scherrer, Bertie Vidgen: FinanceBench: A New Benchmark for Financial Question Answering. CoRR abs/2311.11944 (2023)
- 2022
- [j2] Zo Ahmed, Bertie Vidgen, Scott A. Hale: Tackling racial bias in automated online hate detection: Towards fair and accurate detection of hateful users with geometric deep learning. EPJ Data Sci. 11(1): 8 (2022)
- [j1] Arkaitz Zubiaga, Bertie Vidgen, Miriam Fernández, Nishanth Sastry: Editorial for Special Issue on Detecting, Understanding and Countering Online Harms. Online Soc. Networks Media 27: 100186 (2022)
- [c11] Hannah Kirk, Abeba Birhane, Bertie Vidgen, Leon Derczynski: Handling and Presenting Harmful Text in NLP Research. EMNLP (Findings) 2022: 497-510
- [c10] Paul Röttger, Bertie Vidgen, Dirk Hovy, Janet B. Pierrehumbert: Two Contrasting Data Annotation Paradigms for Subjective NLP Tasks. NAACL-HLT 2022: 175-190
- [c9] Hannah Kirk, Bertie Vidgen, Paul Röttger, Tristan Thrush, Scott A. Hale: Hatemoji: A Test Suite and Adversarially-Generated Dataset for Benchmarking and Detecting Emoji-Based Hate. NAACL-HLT 2022: 1352-1368
- [i18] Leon Derczynski, Hannah Rose Kirk, Abeba Birhane, Bertie Vidgen: Handling and Presenting Harmful Text. CoRR abs/2204.14256 (2022)
- [i17] Paul Röttger, Haitham Seelawi, Debora Nozza, Zeerak Talat, Bertie Vidgen: Multilingual HateCheck: Functional Tests for Multilingual Hate Speech Detection Models. CoRR abs/2206.09917 (2022)
- [i16] Hannah Rose Kirk, Bertie Vidgen, Scott A. Hale: Is More Data Better? Re-thinking the Importance of Efficiency in Abusive Language Detection with Transformers-Based Active Learning. CoRR abs/2209.10193 (2022)
- [i15] Pica Johansson, Florence Enock, Scott A. Hale, Bertie Vidgen, Cassidy Bereskin, Helen Z. Margetts, Jonathan Bright: How can we combat online misinformation? A systematic overview of current interventions and their efficacy. CoRR abs/2212.11864 (2022)
- 2021
- [c8] Paul Röttger, Bertie Vidgen, Dong Nguyen, Zeerak Waseem, Helen Z. Margetts, Janet B. Pierrehumbert: HateCheck: Functional Tests for Hate Speech Detection Models. ACL/IJCNLP (1) 2021: 41-58
- [c7] Bertie Vidgen, Tristan Thrush, Zeerak Waseem, Douwe Kiela: Learning from the Worst: Dynamically Generated Datasets to Improve Online Hate Detection. ACL/IJCNLP (1) 2021: 1667-1682
- [c6] Austin Botelho, Scott A. Hale, Bertie Vidgen: Deciphering Implicit Hate: Evaluating Automated Detection Algorithms for Multimodal Hate. ACL/IJCNLP (Findings) 2021: 1896-1907
- [c5] Ella Guest, Bertie Vidgen, Alexandros Mittos, Nishanth Sastry, Gareth Tyson, Helen Z. Margetts: An Expert Annotated Dataset for the Detection of Online Misogyny. EACL 2021: 1336-1350
- [c4] Bertie Vidgen, Dong Nguyen, Helen Z. Margetts, Patrícia G. C. Rossini, Rebekah Tromble: Introducing CAD: the Contextual Abuse Dataset. NAACL-HLT 2021: 2289-2303
- [c3] Douwe Kiela, Max Bartolo, Yixin Nie, Divyansh Kaushik, Atticus Geiger, Zhengxuan Wu, Bertie Vidgen, Grusha Prasad, Amanpreet Singh, Pratik Ringshia, Zhiyi Ma, Tristan Thrush, Sebastian Riedel, Zeerak Waseem, Pontus Stenetorp, Robin Jia, Mohit Bansal, Christopher Potts, Adina Williams: Dynabench: Rethinking Benchmarking in NLP. NAACL-HLT 2021: 4110-4124
- [i14] Zo Ahmed, Bertie Vidgen, Scott A. Hale: Tackling Racial Bias in Automated Online Hate Detection: Towards Fair and Accurate Classification of Hateful Online Users Using Geometric Deep Learning. CoRR abs/2103.11806 (2021)
- [i13] Douwe Kiela, Max Bartolo, Yixin Nie, Divyansh Kaushik, Atticus Geiger, Zhengxuan Wu, Bertie Vidgen, Grusha Prasad, Amanpreet Singh, Pratik Ringshia, Zhiyi Ma, Tristan Thrush, Sebastian Riedel, Zeerak Waseem, Pontus Stenetorp, Robin Jia, Mohit Bansal, Christopher Potts, Adina Williams: Dynabench: Rethinking Benchmarking in NLP. CoRR abs/2104.14337 (2021)
- [i12] Austin Botelho, Bertie Vidgen, Scott A. Hale: Deciphering Implicit Hate: Evaluating Automated Detection Algorithms for Multimodal Hate. CoRR abs/2106.05903 (2021)
- [i11] Hannah Rose Kirk, Bertram Vidgen, Paul Röttger, Tristan Thrush, Scott A. Hale: Hatemoji: A Test Suite and Adversarially-Generated Dataset for Benchmarking and Detecting Emoji-based Hate. CoRR abs/2108.05921 (2021)
- [i10] Laila Sprejer, Helen Z. Margetts, Kleber Oliveira, David O'Sullivan, Bertie Vidgen: An influencer-based approach to understanding radical right viral tweets. CoRR abs/2109.07588 (2021)
- [i9] Paul Röttger, Bertie Vidgen, Dirk Hovy, Janet B. Pierrehumbert: Two Contrasting Data Annotation Paradigms for Subjective NLP Tasks. CoRR abs/2112.07475 (2021)
- 2020
- [c2] Vinodkumar Prabhakaran, Zeerak Waseem, Seyi Akiwowo, Bertie Vidgen: Online Abuse and Human Rights: WOAH Satellite Session at RightsCon 2020. WOAH 2020: 1-6
- [c1] Bertie Vidgen, Scott A. Hale, Ella Guest, Helen Z. Margetts, David A. Broniatowski, Zeerak Waseem, Austin Botelho, Matthew Hall, Rebekah Tromble: Detecting East Asian Prejudice on Social Media. WOAH 2020: 162-172
- [e1] Seyi Akiwowo, Bertie Vidgen, Vinodkumar Prabhakaran, Zeerak Waseem: Proceedings of the Fourth Workshop on Online Abuse and Harms, WOAH 2020, Online, November 20, 2020. Association for Computational Linguistics 2020, ISBN 978-1-952148-79-8
- [i8] Bertie Vidgen, Leon Derczynski: Directions in Abusive Language Training Data: Garbage In, Garbage Out. CoRR abs/2004.01670 (2020)
- [i7] Bertie Vidgen, Austin Botelho, David A. Broniatowski, Ella Guest, Matthew Hall, Helen Z. Margetts, Rebekah Tromble, Zeerak Waseem, Scott A. Hale: Detecting East Asian Prejudice on Social Media. CoRR abs/2005.03909 (2020)
- [i6] Paul Röttger, Bertram Vidgen, Dong Nguyen, Zeerak Waseem, Helen Z. Margetts, Janet B. Pierrehumbert: HateCheck: Functional Tests for Hate Speech Detection Models. CoRR abs/2012.15606 (2020)
- [i5] Bertie Vidgen, Tristan Thrush, Zeerak Waseem, Douwe Kiela: Learning from the Worst: Dynamically Generated Datasets to Improve Online Hate Detection. CoRR abs/2012.15761 (2020)
2010 – 2019
- 2019
- [i4] Bertie Vidgen, Taha Yasseri: What, When and Where of petitions submitted to the UK Government during a time of chaos. CoRR abs/1907.01536 (2019)
- [i3] Bertie Vidgen, Taha Yasseri, Helen Z. Margetts: Trajectories of Islamophobic hate amongst far right actors on Twitter. CoRR abs/1910.05794 (2019)
- 2018
- [i2] Bertie Vidgen, Taha Yasseri: Detecting weak and strong Islamophobic hate speech on social media. CoRR abs/1812.10400 (2018)
- 2016
- [i1] Bertie Vidgen, Taha Yasseri: P-values: misunderstood and misused. CoRR abs/1601.06805 (2016)
last updated on 2024-09-04 01:24 CEST by the dblp team
all metadata released as open data under CC0 1.0 license