Alham Fikri Aji
2020 – today
- 2024
- [c45]Nedjma Ousidhoum, Shamsuddeen Hassan Muhammad, Mohamed Abdalla, Idris Abdulmumin, Ibrahim Said Ahmad, Sanchit Ahuja, Alham Fikri Aji, Vladimir Araujo, Abinew Ali Ayele, Pavan Baswani, Meriem Beloucif, Chris Biemann, Sofia Bourhim, Christine de Kock, Genet Shanko Dekebo, Oumaima Hourrane, Gopichand Kanumolu, Lokesh Madasu, Samuel Rutunda, Manish Shrivastava, Thamar Solorio, Nirmal Surange, Hailegnaw Getaneh Tilaye, Krishnapriya Vishnubhotla, Genta Indra Winata, Seid Muhie Yimam, Saif M. Mohammad:
SemRel2024: A Collection of Semantic Textual Relatedness Datasets for 13 Languages. ACL (Findings) 2024: 2512-2530 - [c44]Yuxia Wang, Jonibek Mansurov, Petar Ivanov, Jinyan Su, Artem Shelmanov, Akim Tsvigun, Osama Mohammed Afzal, Tarek Mahmoud, Giovanni Puccetti, Thomas Arnold, Alham Fikri Aji, Nizar Habash, Iryna Gurevych, Preslav Nakov:
M4GT-Bench: Evaluation Benchmark for Black-Box Machine-Generated Text Detection. ACL (1) 2024: 3964-3992 - [c43]Samuel Cahyawijaya, Holy Lovenia, Fajri Koto, Rifki Afina Putri, Tjeng Wawan Cenggoro, Jhonson Lee, Salsabil Maulana Akbar, Emmanuel Dave, Nuur Shadieq, Muhammad Ihza Mahendra, Dea Annisayanti Putri, Bryan Wilie, Genta Indra Winata, Alham Fikri Aji, Ayu Purwarianti, Pascale Fung:
Cendol: Open Instruction-tuned Generative Large Language Models for Indonesian Languages. ACL (1) 2024: 14899-14914 - [c42]Chenyang Lyu, Zefeng Du, Jitao Xu, Yitao Duan, Minghao Wu, Teresa Lynn, Alham Fikri Aji, Derek F. Wong, Longyue Wang:
A Paradigm Shift: The Future of Machine Translation Lies with Large Language Models. LREC/COLING 2024: 1339-1352 - [c41]Minghao Wu, Abdul Waheed, Chiyu Zhang, Muhammad Abdul-Mageed, Alham Fikri Aji:
LaMini-LM: A Diverse Herd of Distilled Models from Large-Scale Instructions. EACL (1) 2024: 944-964 - [c40]Yuxia Wang, Jonibek Mansurov, Petar Ivanov, Jinyan Su, Artem Shelmanov, Akim Tsvigun, Chenxi Whitehouse, Osama Mohammed Afzal, Tarek Mahmoud, Toru Sasaki, Thomas Arnold, Alham Fikri Aji, Nizar Habash, Iryna Gurevych, Preslav Nakov:
M4: Multi-generator, Multi-domain, and Multi-lingual Black-Box Machine-Generated Text Detection. EACL (1) 2024: 1369-1407 - [c39]Haryo Akbarianto Wibowo, Erland Hilman Fuadi, Made Nindyatama Nityasya, Radityo Eko Prasojo, Alham Fikri Aji:
COPAL-ID: Indonesian Language Reasoning with Local Culture and Nuances. NAACL-HLT 2024: 1404-1422 - [c38]Nedjma Ousidhoum, Shamsuddeen Hassan Muhammad, Mohamed Abdalla, Idris Abdulmumin, Ibrahim Said Ahmad, Sanchit Ahuja, Alham Fikri Aji, Vladimir Araujo, Meriem Beloucif, Christine de Kock, Oumaima Hourrane, Manish Shrivastava, Thamar Solorio, Nirmal Surange, Krishnapriya Vishnubhotla, Seid Muhie Yimam, Saif M. Mohammad:
SemEval Task 1: Semantic Textual Relatedness for African and Asian Languages. SemEval@NAACL 2024: 1963-1978 - [d1]Yuxia Wang, Jonibek Mansurov, Petar Ivanov, Jinyan Su, Artem Shelmanov, Akim Tsvigun, Osama Mohammed Afzal, Tarek Mahmoud, Giovanni Puccetti, Thomas Arnold, Chenxi Whitehouse, Alham Fikri Aji, Nizar Habash, Iryna Gurevych, Preslav Nakov:
SemEval-2024 Task 8: Multidomain, Multimodel and Multilingual Black-Box Machine-Generated Text Detection. Zenodo, 2024 - [i61]Muhammad Farid Adilazuarda, Samuel Cahyawijaya, Alham Fikri Aji, Genta Indra Winata, Ayu Purwarianti:
LinguAlchemy: Fusing Typological and Geographical Elements for Unseen Language Generalization. CoRR abs/2401.06034 (2024) - [i60]Nedjma Ousidhoum, Shamsuddeen Hassan Muhammad, Mohamed Abdalla, Idris Abdulmumin, Ibrahim Said Ahmad, Sanchit Ahuja, Alham Fikri Aji, Vladimir Araujo, Abinew Ali Ayele, Pavan Baswani, Meriem Beloucif, Chris Biemann, Sofia Bourhim, Christine de Kock, Genet Shanko Dekebo, Oumaima Hourrane, Gopichand Kanumolu, Lokesh Madasu, Samuel Rutunda, Manish Shrivastava, Thamar Solorio, Nirmal Surange, Hailegnaw Getaneh Tilaye, Krishnapriya Vishnubhotla, Genta Indra Winata, Seid Muhie Yimam, Saif M. Mohammad:
SemRel2024: A Collection of Semantic Textual Relatedness Datasets for 14 Languages. CoRR abs/2402.08638 (2024) - [i59]Yuxia Wang, Jonibek Mansurov, Petar Ivanov, Jinyan Su, Artem Shelmanov, Akim Tsvigun, Osama Mohammed Afzal, Tarek Mahmoud, Giovanni Puccetti, Thomas Arnold, Alham Fikri Aji, Nizar Habash, Iryna Gurevych, Preslav Nakov:
M4GT-Bench: Evaluation Benchmark for Black-Box Machine-Generated Text Detection. CoRR abs/2402.11175 (2024) - [i58]Chenyang Lyu, Minghao Wu, Alham Fikri Aji:
Beyond Probabilities: Unveiling the Misalignment in Evaluating Large Language Models. CoRR abs/2402.13887 (2024) - [i57]Rendi Chevi, Alham Fikri Aji:
Daisy-TTS: Simulating Wider Spectrum of Emotions via Prosody Embedding Decomposition. CoRR abs/2402.14523 (2024) - [i56]Muhammad Farid Adilazuarda, Sagnik Mukherjee, Pradhyumna Lavania, Siddhant Singh, Ashutosh Dwivedi, Alham Fikri Aji, Jacki O'Neill, Ashutosh Modi, Monojit Choudhury:
Towards Measuring and Modeling "Culture" in LLMs: A Survey. CoRR abs/2403.15412 (2024) - [i55]Nedjma Ousidhoum, Shamsuddeen Hassan Muhammad, Mohamed Abdalla, Idris Abdulmumin, Ibrahim Said Ahmad, Sanchit Ahuja, Alham Fikri Aji, Vladimir Araujo, Meriem Beloucif, Christine de Kock, Oumaima Hourrane, Manish Shrivastava, Thamar Solorio, Nirmal Surange, Krishnapriya Vishnubhotla, Seid Muhie Yimam, Saif M. Mohammad:
SemEval Task 1: Semantic Textual Relatedness for African and Asian Languages. CoRR abs/2403.18933 (2024) - [i54]Samuel Cahyawijaya, Holy Lovenia, Fajri Koto, Rifki Afina Putri, Emmanuel Dave, Jhonson Lee, Nuur Shadieq, Tjeng Wawan Cenggoro, Salsabil Maulana Akbar, Muhammad Ihza Mahendra, Dea Annisayanti Putri, Bryan Wilie, Genta Indra Winata, Alham Fikri Aji, Ayu Purwarianti, Pascale Fung:
Cendol: Open Instruction-tuned Generative Large Language Models for Indonesian Languages. CoRR abs/2404.06138 (2024) - [i53]Ahmed Elshabrawy, Yongxin Huang, Iryna Gurevych, Alham Fikri Aji:
Enabling Natural Zero-Shot Prompting on Encoder Models via Statement-Tuning. CoRR abs/2404.12897 (2024) - [i52]Yuxia Wang, Jonibek Mansurov, Petar Ivanov, Jinyan Su, Artem Shelmanov, Akim Tsvigun, Osama Mohammed Afzal, Tarek Mahmoud, Giovanni Puccetti, Thomas Arnold, Chenxi Whitehouse, Alham Fikri Aji, Nizar Habash, Iryna Gurevych, Preslav Nakov:
SemEval-2024 Task 8: Multidomain, Multimodel and Multilingual Machine-Generated Text Detection. CoRR abs/2404.14183 (2024) - [i51]Teresa Lynn, Malik H. Altakrori, Samar Mohamed Magdy, Rocktim Jyoti Das, Chenyang Lyu, Mohamed Nasr, Younes Samih, Alham Fikri Aji, Preslav Nakov, Shantanu Godbole, Salim Roukos, Radu Florian, Nizar Habash:
Can a Multichoice Dataset be Repurposed for Extractive Question Answering? CoRR abs/2404.17342 (2024) - [i50]Stella Biderman, Hailey Schoelkopf, Lintang Sutawika, Leo Gao, Jonathan Tow, Baber Abbasi, Alham Fikri Aji, Pawan Sasanka Ammanamanchi, Sidney Black, Jordan Clive, Anthony DiPofi, Julen Etxaniz, Benjamin Fattori, Jessica Zosa Forde, Charles Foster, Jeffrey Hsu, Mimansa Jaiswal, Wilson Y. Lee, Haonan Li, Charles Lovering, Niklas Muennighoff, Ellie Pavlick, Jason Phang, Aviya Skowron, Samson Tan, Xiangru Tang, Kevin A. Wang, Genta Indra Winata, François Yvon, Andy Zou:
Lessons from the Trenches on Reproducible Evaluation of Language Models. CoRR abs/2405.14782 (2024) - [i49]David Romero, Chenyang Lyu, Haryo Akbarianto Wibowo, Teresa Lynn, Injy Hamed, Aditya Nanda Kishore, Aishik Mandal, Alina Dragonetti, Artem Abzaliev, Atnafu Lambebo Tonja, Bontu Fufa Balcha, Chenxi Whitehouse, Christian Salamea, Dan John Velasco, David Ifeoluwa Adelani, David Le Meur, Emilio Villa-Cueva, Fajri Koto, Fauzan Farooqui, Frederico Belcavello, Ganzorig Batnasan, Gisela Vallejo, Grainne Caulfield, Guido Ivetta, Haiyue Song, Henok Biadglign Ademtew, Hernán Maina, Holy Lovenia, Israel Abebe Azime, Jan Christian Blaise Cruz, Jay P. Gala, Jiahui Geng, Jesús-Germán Ortiz-Barajas, Jinheon Baek, Jocelyn Dunstan, Laura Alonso Alemany, Kumaranage Ravindu Yasas Nagasinghe, Luciana Benotti, Luis Fernando D'Haro, Marcelo Viridiano, Marcos Estecha-Garitagoitia, Maria Camila Buitrago Cabrera, Mario Rodríguez-Cantelar, Mélanie Jouitteau, Mihail Mihaylov, Mohamed Fazli Mohamed Imam, Muhammad Farid Adilazuarda, Munkhjargal Gochoo, Munkh-Erdene Otgonbold, Naome A. Etori, Olivier Niyomugisha, Paula Mónica Silva, Pranjal A. Chitale, Raj Dabre, Rendi Chevi, Ruochen Zhang, Ryandito Diandaru, Samuel Cahyawijaya, Santiago Góngora, Soyeong Jeong, Sukannya Purkayastha, Tatsuki Kuribayashi, Thanmay Jayakumar, Tiago Timponi Torrent, Toqeer Ehsan, Vladimir Araujo, Yova Kementchedjhieva, Zara Burzo, Zheng Wei Lim, Zheng Xin Yong, Oana Ignat, Joan Nwatu, Rada Mihalcea, Thamar Solorio, Alham Fikri Aji:
CVQA: Culturally-diverse Multilingual Visual Question Answering Benchmark. CoRR abs/2406.05967 (2024) - [i48]Zayd Muhammad Kawakibi Zuhri, Muhammad Farid Adilazuarda, Ayu Purwarianti, Alham Fikri Aji:
MLKV: Multi-Layer Key-Value Heads for Memory Efficient Transformer Decoding. CoRR abs/2406.09297 (2024) - [i47]Holy Lovenia, Rahmad Mahendra, Salsabil Maulana Akbar, Lester James V. Miranda, Jennifer Santoso, Elyanah Aco, Akhdan Fadhilah, Jonibek Mansurov, Joseph Marvin Imperial, Onno Pepijn Kampman, Joel Ruben Antony Moniz, Muhammad Ravi Shulthan Habibi, Frederikus Hudi, Railey Montalan, Ryan Ignatius, Joanito Agili Lopo, William Nixon, Börje F. Karlsson, James Jaya, Ryandito Diandaru, Yuze Gao, Patrick Amadeus Irawan, Bin Wang, Jan Christian Blaise Cruz, Chenxi Whitehouse, Ivan Halim Parmonangan, Maria Khelli, Wenyu Zhang, Lucky Susanto, Reynard Adha Ryanda, Sonny Lazuardi Hermawan, Dan John Velasco, Muhammad Dehan Al Kautsar, Willy Fitra Hendria, Yasmin Moslem, Noah Flynn, Muhammad Farid Adilazuarda, Haochen Li, Johanes Lee, R. Damanhuri, Shuo Sun, Muhammad Reza Qorib, Amirbek Djanibekov, Wei Qi Leong, Quyet V. Do, Niklas Muennighoff, Tanrada Pansuwan, Ilham Firdausi Putra, Yan Xu, Ngee Chia Tai, Ayu Purwarianti, Sebastian Ruder, William-Chandra Tjhi, Peerat Limkonchotiwat, Alham Fikri Aji, Sedrick Keh, Genta Indra Winata, Ruochen Zhang, Fajri Koto, Zheng Xin Yong, Samuel Cahyawijaya:
SEACrowd: A Multilingual Multimodal Data Hub and Benchmark Suite for Southeast Asian Languages. CoRR abs/2406.10118 (2024) - [i46]Sagnik Mukherjee, Muhammad Farid Adilazuarda, Sunayana Sitaram, Kalika Bali, Alham Fikri Aji, Monojit Choudhury:
Cultural Conditioning or Placebo? On the Effectiveness of Socio-Demographic Prompting. CoRR abs/2406.11661 (2024) - [i45]Haryo Akbarianto Wibowo, Thamar Solorio, Alham Fikri Aji:
The Privileged Students: On the Value of Initialization in Multilingual Knowledge Distillation. CoRR abs/2406.16524 (2024) - [i44]Lucky Susanto, Musa Izzanardi Wijanarko, Prasetia Anugrah Pratama, Traci Hong, Ika Idris, Alham Fikri Aji, Derry Wijaya:
IndoToxic2024: A Demographically-Enriched Dataset of Hate Speech and Toxicity Types for Indonesian Language. CoRR abs/2406.19349 (2024) - [i43]Mervat Abassy, Kareem Elozeiri, Alexander Aziz, Minh Ngoc Ta, Raj Vardhan Tomar, Bimarsha Adhikari, Saad El Dine Ahmed, Yuxia Wang, Osama Mohammed Afzal, Zhuohan Xie, Jonibek Mansurov, Ekaterina Artemova, Vladislav Mikhailov, Rui Xing, Jiahui Geng, Hasan Iqbal, Zain Muhammad Mujahid, Tarek Mahmoud, Akim Tsvigun, Alham Fikri Aji, Artem Shelmanov, Nizar Habash, Iryna Gurevych, Preslav Nakov:
LLM-DetectAIve: a Tool for Fine-Grained Machine-Generated Text Detection. CoRR abs/2408.04284 (2024)
- 2023
- [c37]Genta Indra Winata, Alham Fikri Aji, Zheng Xin Yong, Thamar Solorio:
The Decades Progress on Code-Switching Research in NLP: A Systematic Survey on Trends and Challenges. ACL (Findings) 2023: 2936-2978 - [c36]Chenxi Whitehouse, Clara Vania, Alham Fikri Aji, Christos Christodoulopoulos, Andrea Pierleoni:
WebIE: Faithful and Robust Information Extraction on the Web. ACL (1) 2023: 7734-7755 - [c35]Anubha Kabra, Emmy Liu, Simran Khanuja, Alham Fikri Aji, Genta Indra Winata, Samuel Cahyawijaya, Aremu Anuoluwapo, Perez Ogayo, Graham Neubig:
Multi-lingual and Multi-cultural Figurative Language Understanding. ACL (Findings) 2023: 8269-8284 - [c34]Made Nindyatama Nityasya, Haryo Akbarianto Wibowo, Alham Fikri Aji, Genta Indra Winata, Radityo Eko Prasojo, Phil Blunsom, Adhiguna Kuncoro:
On "Scientific Debt" in NLP: A Case for More Rigour in Language Model Pre-Training Research. ACL (1) 2023: 8554-8572 - [c33]Jinheon Baek, Alham Fikri Aji, Jens Lehmann, Sung Ju Hwang:
Direct Fact Retrieval from Knowledge Graphs without Entity Linking. ACL (1) 2023: 10038-10055 - [c32]Zheng Xin Yong, Hailey Schoelkopf, Niklas Muennighoff, Alham Fikri Aji, David Ifeoluwa Adelani, Khalid Almubarak, M. Saiful Bari, Lintang Sutawika, Jungo Kasai, Ahmed Baruwa, Genta Indra Winata, Stella Biderman, Edward Raff, Dragomir Radev, Vassilina Nikoulina:
BLOOM+1: Adding Language Support to BLOOM for Zero-Shot Prompting. ACL (1) 2023: 11682-11703 - [c31]Samuel Cahyawijaya, Holy Lovenia, Alham Fikri Aji, Genta Indra Winata, Bryan Wilie, Fajri Koto, Rahmad Mahendra, Christian Wibisono, Ade Romadhony, Karissa Vincentio, Jennifer Santoso, David Moeljadi, Cahya Wirawan, Frederikus Hudi, Muhammad Satrio Wicaksono, Ivan Halim Parmonangan, Ika Alfina, Ilham Firdausi Putra, Samsul Rahmadani, Yulianti Oenang, Ali Akbar Septiandri, James Jaya, Kaustubh D. Dhole, Arie Ardiyanti Suryani, Rifki Afina Putri, Dan Su, Keith Stevens, Made Nindyatama Nityasya, Muhammad Farid Adilazuarda, Ryan Hadiwijaya, Ryandito Diandaru, Tiezheng Yu, Vito Ghifari, Wenliang Dai, Yan Xu, Dyah Damapuspita, Haryo Akbarianto Wibowo, Cuk Tho, Ichwanul Muslim Karo Karo, Tirana Fatyanosa, Ziwei Ji, Graham Neubig, Timothy Baldwin, Sebastian Ruder, Pascale Fung, Herry Sujaini, Sakriani Sakti, Ayu Purwarianti:
NusaCrowd: Open Source Initiative for Indonesian NLP Resources. ACL (Findings) 2023: 13745-13818 - [c30]Niklas Muennighoff, Thomas Wang, Lintang Sutawika, Adam Roberts, Stella Biderman, Teven Le Scao, M. Saiful Bari, Sheng Shen, Zheng Xin Yong, Hailey Schoelkopf, Xiangru Tang, Dragomir Radev, Alham Fikri Aji, Khalid Almubarak, Samuel Albanie, Zaid Alyafeai, Albert Webson, Edward Raff, Colin Raffel:
Crosslingual Generalization through Multitask Finetuning. ACL (1) 2023: 15991-16111 - [c29]Genta Indra Winata, Alham Fikri Aji, Samuel Cahyawijaya, Rahmad Mahendra, Fajri Koto, Ade Romadhony, Kemal Kurniawan, David Moeljadi, Radityo Eko Prasojo, Pascale Fung:
NusaX: Multilingual Parallel Sentiment Dataset for 10 Indonesian Local Languages. EACL 2023: 815-834 - [c28]Chenxi Whitehouse, Monojit Choudhury, Alham Fikri Aji:
LLM-powered Data Augmentation for Enhanced Cross-lingual Performance. EMNLP 2023: 671-686 - [c27]Ruochen Zhang, Samuel Cahyawijaya, Jan Christian Blaise Cruz, Genta Indra Winata, Alham Fikri Aji:
Multilingual Large Language Models Are Not (Yet) Code-Switchers. EMNLP 2023: 12567-12582 - [c26]Yueqi Song, Simran Khanuja, Pengfei Liu, Fahim Faisal, Alissa Ostapenko, Genta Indra Winata, Alham Fikri Aji, Samuel Cahyawijaya, Yulia Tsvetkov, Antonios Anastasopoulos, Graham Neubig:
GlobalBench: A Benchmark for Global Progress in Natural Language Processing. EMNLP 2023: 14157-14171 - [c25]Samuel Cahyawijaya, Holy Lovenia, Fajri Koto, Dea Adhista, Emmanuel Dave, Sarah Oktavianti, Salsabil Maulana Akbar, Jhonson Lee, Nuur Shadieq, Tjeng Wawan Cenggoro, Hanung Wahyuning Linuwih, Bryan Wilie, Galih Pradipta Muridan, Genta Indra Winata, David Moeljadi, Alham Fikri Aji, Ayu Purwarianti, Pascale Fung:
NusaWrites: Constructing High-Quality Corpora for Underrepresented and Extremely Low-Resource Languages. IJCNLP (1) 2023: 921-945 - [i42]Zheng Xin Yong, Ruochen Zhang, Jessica Zosa Forde, Skyler Wang, Samuel Cahyawijaya, Holy Lovenia, Genta Indra Winata, Lintang Sutawika, Jan Christian Blaise Cruz, Long Phan, Yin Lin Tan, Alham Fikri Aji:
Prompting Multilingual Large Language Models to Generate Code-Mixed Texts: The Case of South East Asian Languages. CoRR abs/2303.13592 (2023) - [i41]Minghao Wu, Abdul Waheed, Chiyu Zhang, Muhammad Abdul-Mageed, Alham Fikri Aji:
LaMini-LM: A Diverse Herd of Distilled Models from Large-Scale Instructions. CoRR abs/2304.14402 (2023) - [i40]Jinheon Baek, Alham Fikri Aji, Jens Lehmann, Sung Ju Hwang:
Direct Fact Retrieval from Knowledge Graphs without Entity Linking. CoRR abs/2305.12416 (2023) - [i39]Ruochen Zhang, Samuel Cahyawijaya, Jan Christian Blaise Cruz, Alham Fikri Aji:
Multilingual Large Language Models Are Not (Yet) Code-Switchers. CoRR abs/2305.14235 (2023) - [i38]Chenxi Whitehouse, Monojit Choudhury, Alham Fikri Aji:
LLM-powered Data Augmentation for Enhanced Crosslingual Performance. CoRR abs/2305.14288 (2023) - [i37]Chenxi Whitehouse, Clara Vania, Alham Fikri Aji, Christos E. Christodoulopoulos, Andrea Pierleoni:
WebIE: Faithful and Robust Information Extraction on the Web. CoRR abs/2305.14293 (2023) - [i36]Yueqi Song, Catherine Cui, Simran Khanuja, Pengfei Liu, Fahim Faisal, Alissa Ostapenko, Genta Indra Winata, Alham Fikri Aji, Samuel Cahyawijaya, Yulia Tsvetkov, Antonios Anastasopoulos, Graham Neubig:
GlobalBench: A Benchmark for Global Progress in Natural Language Processing. CoRR abs/2305.14716 (2023) - [i35]Yuxia Wang, Jonibek Mansurov, Petar Ivanov, Jinyan Su, Artem Shelmanov, Akim Tsvigun, Chenxi Whitehouse, Osama Mohammed Afzal, Tarek Mahmoud, Alham Fikri Aji, Preslav Nakov:
M4: Multi-generator, Multi-domain, and Multi-lingual Black-Box Machine-Generated Text Detection. CoRR abs/2305.14902 (2023) - [i34]Haonan Li, Fajri Koto, Minghao Wu, Alham Fikri Aji, Timothy Baldwin:
Bactrian-X: A Multilingual Replicable Instruction-Following Model with Low-Rank Adaptation. CoRR abs/2305.15011 (2023) - [i33]Anubha Kabra, Emmy Liu, Simran Khanuja, Alham Fikri Aji, Genta Indra Winata, Samuel Cahyawijaya, Aremu Anuoluwapo, Perez Ogayo, Graham Neubig:
Multi-lingual and Multi-cultural Figurative Language Understanding. CoRR abs/2305.16171 (2023) - [i32]Made Nindyatama Nityasya, Haryo Akbarianto Wibowo, Alham Fikri Aji, Genta Indra Winata, Radityo Eko Prasojo, Phil Blunsom, Adhiguna Kuncoro:
On "Scientific Debt" in NLP: A Case for More Rigour in Language Model Pre-Training Research. CoRR abs/2306.02870 (2023) - [i31]Jinheon Baek, Alham Fikri Aji, Amir Saffari:
Knowledge-Augmented Language Model Prompting for Zero-Shot Knowledge Graph Question Answering. CoRR abs/2306.04136 (2023) - [i30]Minghao Wu, Alham Fikri Aji:
Style Over Substance: Evaluation Biases for Large Language Models. CoRR abs/2307.03025 (2023) - [i29]Neha Sengupta, Sunil Kumar Sahu, Bokang Jia, Satheesh Katipomu, Haonan Li, Fajri Koto, Osama Mohammed Afzal, Samta Kamboj, Onkar Pandit, Rahul Pal, Lalit Pradhan, Zain Muhammad Mujahid, Massa Baali, Alham Fikri Aji, Zhengzhong Liu, Andy Hock, Andrew Feldman, Jonathan Lee, Andrew Jackson, Preslav Nakov, Timothy Baldwin, Eric P. Xing:
Jais and Jais-chat: Arabic-Centric Foundation and Instruction-Tuned Open Generative Large Language Models. CoRR abs/2308.16149 (2023) - [i28]Samuel Cahyawijaya, Holy Lovenia, Fajri Koto, Dea Adhista, Emmanuel Dave, Sarah Oktavianti, Salsabil Maulana Akbar, Jhonson Lee, Nuur Shadieq, Tjeng Wawan Cenggoro, Hanung Wahyuning Linuwih, Bryan Wilie, Galih Pradipta Muridan, Genta Indra Winata, David Moeljadi, Alham Fikri Aji, Ayu Purwarianti, Pascale Fung:
NusaWrites: Constructing High-Quality Corpora for Underrepresented and Extremely Low-Resource Languages. CoRR abs/2309.10661 (2023) - [i27]Ni Putu Intan Maharani, Ayu Purwarianti, Alham Fikri Aji:
Low-Resource Clickbait Spoiling for Indonesian via Question Answering. CoRR abs/2310.08085 (2023) - [i26]Muhammad Razif Rizqullah, Ayu Purwarianti, Alham Fikri Aji:
QASiNa: Religious Domain Question Answering using Sirah Nabawiyah. CoRR abs/2310.08102 (2023) - [i25]Haryo Akbarianto Wibowo, Erland Hilman Fuadi, Made Nindyatama Nityasya, Radityo Eko Prasojo, Alham Fikri Aji:
COPAL-ID: Indonesian Language Reasoning with Local Culture and Nuances. CoRR abs/2311.01012 (2023)
- 2022
- [c24]Alham Fikri Aji, Genta Indra Winata, Fajri Koto, Samuel Cahyawijaya, Ade Romadhony, Rahmad Mahendra, Kemal Kurniawan, David Moeljadi, Radityo Eko Prasojo, Timothy Baldwin, Jey Han Lau, Sebastian Ruder:
One Country, 700+ Languages: NLP Challenges for Underrepresented Languages and Dialects in Indonesia. ACL (1) 2022: 7226-7249 - [c23]Veronika Laippala, Anna Salmela, Samuel Rönnqvist, Alham Fikri Aji, Li-Hsin Chang, Asma Dhifallah, Larissa Goulart, Henna Kortelainen, Marc Pàmies, Deise Prina Dutra, Valtteri Skantsi, Lintang Sutawika, Sampo Pyysalo:
Towards better structured and less noisy Web data: Oscar with Register annotations. W-NUT@COLING 2022: 215-221 - [c22]Priyanka Sen, Alham Fikri Aji, Amir Saffari:
Mintaka: A Complex, Natural, and Multilingual Dataset for End-to-End Question Answering. COLING 2022: 1604-1619 - [c21]Siffi Singh, Alham Fikri Aji, Gaurav Singh, Christos Christodoulopoulos:
A Relation Extraction Dataset for Knowledge Extraction from Web Tables. COLING 2022: 2319-2327 - [c20]Rendi Chevi, Radityo Eko Prasojo, Alham Fikri Aji, Andros Tjandra, Sakriani Sakti:
NIX-TTS: Lightweight and End-to-End Text-to-Speech Via Module-Wise Distillation. SLT 2022: 970-976 - [i24]Made Nindyatama Nityasya, Haryo Akbarianto Wibowo, Rendi Chevi, Radityo Eko Prasojo, Alham Fikri Aji:
Which Student is Best? A Comprehensive Knowledge Distillation Exam for Task-Specific BERT Models. CoRR abs/2201.00558 (2022) - [i23]Angelina McMillan-Major, Zaid Alyafeai, Stella Biderman, Kimbo Chen, Francesco De Toni, Gérard Dupont, Hady Elsahar, Chris Emezue, Alham Fikri Aji, Suzana Ilic, Nurulaqilla Khamis, Colin Leong, Maraim Masoud, Aitor Soroa, Pedro Javier Ortiz Suárez, Zeerak Talat, Daniel van Strien, Yacine Jernite:
Documenting Geographically and Contextually Diverse Data Sources: The BigScience Catalogue of Language Data and Resources. CoRR abs/2201.10066 (2022) - [i22]Alham Fikri Aji, Genta Indra Winata, Fajri Koto, Samuel Cahyawijaya, Ade Romadhony, Rahmad Mahendra, Kemal Kurniawan, David Moeljadi, Radityo Eko Prasojo, Timothy Baldwin, Jey Han Lau, Sebastian Ruder:
One Country, 700+ Languages: NLP Challenges for Underrepresented Languages and Dialects in Indonesia. CoRR abs/2203.13357 (2022) - [i21]Rendi Chevi, Radityo Eko Prasojo, Alham Fikri Aji:
Nix-TTS: An Incredibly Lightweight End-to-End Text-to-Speech Model via Non End-to-End Distillation. CoRR abs/2203.15643 (2022) - [i20]Alham Fikri Aji, Tirana Noor Fatyanosa, Radityo Eko Prasojo, Philip Arthur, Suci Fitriany, Salma Qonitah, Nadhifa Zulfa, Tomi Santoso, Mahendra Data:
ParaCotta: Synthetic Multilingual Paraphrase Corpora from the Most Diverse Translation Sample Pair. CoRR abs/2205.04651 (2022) - [i19]Genta Indra Winata, Alham Fikri Aji, Samuel Cahyawijaya, Rahmad Mahendra, Fajri Koto, Ade Romadhony, Kemal Kurniawan, David Moeljadi, Radityo Eko Prasojo, Pascale Fung, Timothy Baldwin, Jey Han Lau, Rico Sennrich, Sebastian Ruder:
NusaX: Multilingual Parallel Sentiment Dataset for 10 Indonesian Local Languages. CoRR abs/2205.15960 (2022) - [i18]Samuel Cahyawijaya, Alham Fikri Aji, Holy Lovenia, Genta Indra Winata, Bryan Wilie, Rahmad Mahendra, Fajri Koto, David Moeljadi, Karissa Vincentio, Ade Romadhony, Ayu Purwarianti:
NusaCrowd: A Call for Open and Reproducible NLP Research in Indonesian Languages. CoRR abs/2207.10524 (2022) - [i17]Priyanka Sen, Alham Fikri Aji, Amir Saffari:
Mintaka: A Complex, Natural, and Multilingual Dataset for End-to-End Question Answering. CoRR abs/2210.01613 (2022) - [i16]Niklas Muennighoff, Thomas Wang, Lintang Sutawika, Adam Roberts, Stella Biderman, Teven Le Scao, M. Saiful Bari, Sheng Shen, Zheng Xin Yong, Hailey Schoelkopf, Xiangru Tang, Dragomir Radev, Alham Fikri Aji, Khalid Almubarak, Samuel Albanie, Zaid Alyafeai, Albert Webson, Edward Raff, Colin Raffel:
Crosslingual Generalization through Multitask Finetuning. CoRR abs/2211.01786 (2022) - [i15]Teven Le Scao, Angela Fan, Christopher Akiki, Ellie Pavlick, Suzana Ilic, Daniel Hesslow, Roman Castagné, Alexandra Sasha Luccioni, François Yvon, Matthias Gallé, Jonathan Tow, Alexander M. Rush, Stella Biderman, Albert Webson, Pawan Sasanka Ammanamanchi, Thomas Wang, Benoît Sagot, Niklas Muennighoff, Albert Villanova del Moral, Olatunji Ruwase, Rachel Bawden, Stas Bekman, Angelina McMillan-Major, Iz Beltagy, Huu Nguyen, Lucile Saulnier, Samson Tan, Pedro Ortiz Suarez, Victor Sanh, Hugo Laurençon, Yacine Jernite, Julien Launay, Margaret Mitchell, Colin Raffel, Aaron Gokaslan, Adi Simhi, Aitor Soroa, Alham Fikri Aji, Amit Alfassy, Anna Rogers, Ariel Kreisberg Nitzav, Canwen Xu, Chenghao Mou, Chris Emezue, Christopher Klamm, Colin Leong, Daniel van Strien, David Ifeoluwa Adelani, et al.:
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model. CoRR abs/2211.05100 (2022) - [i14]Zheng Xin Yong, Hailey Schoelkopf, Niklas Muennighoff, Alham Fikri Aji, David Ifeoluwa Adelani, Khalid Almubarak, M. Saiful Bari, Lintang Sutawika, Jungo Kasai, Ahmed Baruwa, Genta Indra Winata, Stella Biderman, Dragomir Radev, Vassilina Nikoulina:
BLOOM+1: Adding Language Support to BLOOM for Zero-Shot Prompting. CoRR abs/2212.09535 (2022) - [i13]Samuel Cahyawijaya, Holy Lovenia, Alham Fikri Aji, Genta Indra Winata, Bryan Wilie, Rahmad Mahendra, Christian Wibisono, Ade Romadhony, Karissa Vincentio, Fajri Koto, Jennifer Santoso, David Moeljadi, Cahya Wirawan, Frederikus Hudi, Ivan Halim Parmonangan, Ika Alfina, Muhammad Satrio Wicaksono, Ilham Firdausi Putra, Samsul Rahmadani, Yulianti Oenang, Ali Akbar Septiandri, James Jaya, Kaustubh D. Dhole, Arie Ardiyanti Suryani, Rifki Afina Putri, Dan Su, Keith Stevens, Made Nindyatama Nityasya, Muhammad Farid Adilazuarda, Ryan Ignatius, Ryandito Diandaru, Tiezheng Yu, Vito Ghifari, Wenliang Dai, Yan Xu, Dyah Damapuspita, Cuk Tho, Ichwanul Muslim Karo Karo, Tirana Noor Fatyanosa, Ziwei Ji, Pascale Fung, Graham Neubig, Timothy Baldwin, Sebastian Ruder, Herry Sujaini, Sakriani Sakti, Ayu Purwarianti:
NusaCrowd: Open Source Initiative for Indonesian NLP Resources. CoRR abs/2212.09648 (2022) - [i12]Genta Indra Winata, Alham Fikri Aji, Zheng Xin Yong, Thamar Solorio:
The Decades Progress on Code-Switching Research in NLP: A Systematic Survey on Trends and Challenges. CoRR abs/2212.09660 (2022)
- 2021
- [c19]Haryo Akbarianto Wibowo, Made Nindyatama Nityasya, Afra Feyza Akyürek, Suci Fitriany, Alham Fikri Aji, Radityo Eko Prasojo, Derry Tanti Wijaya:
IndoCollex: A Testbed for Morphological Transformation of Indonesian Word Colloquialism. ACL/IJCNLP (Findings) 2021: 3170-3183 - [c18]Rahmad Mahendra, Alham Fikri Aji, Samuel Louvan, Fahrurrozi Rahman, Clara Vania:
IndoNLI: A Natural Language Inference Dataset for Indonesian. EMNLP (1) 2021: 10511-10527 - [c17]Alham Fikri Aji, Radityo Eko Prasojo, Tirana Noor Fatyanosa, Philip Arthur, Suci Fitriany, Salma Qonitah, Nadhifa Zulfa, Tomi Santoso, Mahendra Data:
ParaCotta: Synthetic Multilingual Paraphrase Corpora from the Most Diverse Translation Sample Pair. PACLIC 2021: 533-542 - [c16]Alham Fikri Aji, Made Nindyatama Nityasya, Haryo Akbarianto Wibowo, Radityo Eko Prasojo, Tirana Fatyanosa:
BERT Goes Brrr: A Venture Towards the Lesser Error in Classifying Medical Self-Reporters on Twitter. SMM4H@NAACL-HLT 2021: 58-64 - [c15]Proyag Pal, Alham Fikri Aji, Pinzhen Chen, Sukanta Sen:
The University of Edinburgh's Bengali-Hindi Submissions to the WMT21 News Translation Task. WMT@EMNLP 2021: 180-186 - [c14]Maximiliana Behnke, Nikolay Bogoychev, Alham Fikri Aji, Kenneth Heafield, Graeme Nail, Qianqian Zhu, Svetlana Tchistiakova, Jelmer van der Linde, Pinzhen Chen, Sidharth Kashyap, Roman Grundkiewicz:
Efficient Machine Translation with Model Pruning and Quantization. WMT@EMNLP 2021: 775-780 - [i11]Rahmad Mahendra, Alham Fikri Aji, Samuel Louvan, Fahrurrozi Rahman, Clara Vania:
IndoNLI: A Natural Language Inference Dataset for Indonesian. CoRR abs/2110.14566 (2021)
- 2020
- [b1]Alham Fikri Aji:
Approximating neural machine translation for efficiency. University of Edinburgh, UK, 2020 - [c13]Alham Fikri Aji, Nikolay Bogoychev, Kenneth Heafield, Rico Sennrich:
In Neural Machine Translation, What Does Transfer Learning Transfer? ACL 2020: 7701-7710 - [c12]Tri Wahyu Guntara, Alham Fikri Aji, Radityo Eko Prasojo:
Benchmarking Multidomain English-Indonesian Machine Translation. BUCC@LREC 2020: 35-43 - [c11]Alham Fikri Aji, Kenneth Heafield:
Compressing Neural Machine Translation Models with 4-bit Precision. NGT@ACL 2020: 35-42 - [c10]Nikolay Bogoychev, Roman Grundkiewicz, Alham Fikri Aji, Maximiliana Behnke, Kenneth Heafield, Sidharth Kashyap, Emmanouil-Ioannis Farsarakis, Mateusz Chudyk:
Edinburgh's Submissions to the 2020 Machine Translation Efficiency Task. NGT@ACL 2020: 218-224 - [c9]Haryo Akbarianto Wibowo, Tatag Aziz Prawiro, Muhammad Ihsan, Alham Fikri Aji, Radityo Eko Prasojo, Rahmad Mahendra, Suci Fitriany:
Semi-Supervised Low-Resource Style Transfer of Indonesian Informal to Formal Language with Iterative Forward-Translation. IALP 2020: 310-315 - [i10]Haryo Akbarianto Wibowo, Tatag Aziz Prawiro, Muhammad Ihsan, Alham Fikri Aji, Radityo Eko Prasojo, Rahmad Mahendra:
Semi-Supervised Low-Resource Style Transfer of Indonesian Informal to Formal Language with Iterative Forward-Translation. CoRR abs/2011.03286 (2020) - [i9]Made Nindyatama Nityasya, Haryo Akbarianto Wibowo, Radityo Eko Prasojo, Alham Fikri Aji:
No Budget? Don't Flex! Cost Consideration when Planning to Adopt NLP for Your Business. CoRR abs/2012.08958 (2020) - [i8]Asrul Sani Ariesandy, Mukhlis Amien, Alham Fikri Aji, Radityo Eko Prasojo:
Synthetic Source Language Augmentation for Colloquial Neural Machine Translation. CoRR abs/2012.15178 (2020) - [i7]Alham Fikri Aji, Kenneth Heafield:
Exploring Monolingual Data for Neural Machine Translation with Knowledge Distillation. CoRR abs/2012.15455 (2020)
2010 – 2019
- 2019
- [c8]Alham Fikri Aji, Kenneth Heafield:
Making Asynchronous Stochastic Gradient Descent Work for Transformers. NGT@EMNLP-IJCNLP 2019: 80-89 - [c7]Young Jin Kim, Marcin Junczys-Dowmunt, Hany Hassan, Alham Fikri Aji, Kenneth Heafield, Roman Grundkiewicz, Nikolay Bogoychev:
From Research to Production and Back: Ludicrously Fast Neural Machine Translation. NGT@EMNLP-IJCNLP 2019: 280-288 - [c6]Alham Fikri Aji, Kenneth Heafield, Nikolay Bogoychev:
Combining Global Sparse Gradients with Local Gradients in Distributed Neural Network Training. EMNLP/IJCNLP (1) 2019: 3624-3629 - [i6]