Olivier Pietquin
Person information
- affiliation: Google DeepMind
- affiliation: University Lille 1, France
2020 – today
- 2024
- [c151] Kai Cui, Gökçe Dayanikli, Mathieu Laurière, Matthieu Geist, Olivier Pietquin, Heinz Koeppl: Learning Discrete-Time Major-Minor Mean Field Games. AAAI 2024: 9616-9625
- [c150] Arash Ahmadian, Chris Cremer, Matthias Gallé, Marzieh Fadaee, Julia Kreutzer, Olivier Pietquin, Ahmet Üstün, Sara Hooker: Back to Basics: Revisiting REINFORCE-Style Optimization for Learning from Human Feedback in LLMs. ACL (1) 2024: 12248-12267
- [c149] Mathieu Rita, Florian Strub, Rahma Chaabouni, Paul Michel, Emmanuel Dupoux, Olivier Pietquin: Countering Reward Over-Optimization in LLM with Demonstration-Guided Reinforcement Learning. ACL (Findings) 2024: 12447-12472
- [c148] Zida Wu, Mathieu Laurière, Samuel Jia Cong Chua, Matthieu Geist, Olivier Pietquin, Ankur Mehta: Population-aware Online Mirror Descent for Mean-Field Games by Deep Reinforcement Learning. AAMAS 2024: 2561-2563
- [c147] Geoffrey Cideron, Sertan Girgin, Mauro Verzetti, Damien Vincent, Matej Kastelic, Zalán Borsos, Brian McWilliams, Victor Ungureanu, Olivier Bachem, Olivier Pietquin, Matthieu Geist, Léonard Hussenot, Neil Zeghidour, Andrea Agostinelli: MusicRL: Aligning Music Generation to Human Preferences. ICML 2024
- [i85] Geoffrey Cideron, Sertan Girgin, Mauro Verzetti, Damien Vincent, Matej Kastelic, Zalán Borsos, Brian McWilliams, Victor Ungureanu, Olivier Bachem, Olivier Pietquin, Matthieu Geist, Léonard Hussenot, Neil Zeghidour, Andrea Agostinelli: MusicRL: Aligning Music Generation to Human Preferences. CoRR abs/2402.04229 (2024)
- [i84] Arash Ahmadian, Chris Cremer, Matthias Gallé, Marzieh Fadaee, Julia Kreutzer, Olivier Pietquin, Ahmet Üstün, Sara Hooker: Back to Basics: Revisiting REINFORCE Style Optimization for Learning from Human Feedback in LLMs. CoRR abs/2402.14740 (2024)
- [i83] Zida Wu, Mathieu Laurière, Samuel Jia Cong Chua, Matthieu Geist, Olivier Pietquin, Ankur Mehta: Population-aware Online Mirror Descent for Mean-Field Games by Deep Reinforcement Learning. CoRR abs/2403.03552 (2024)
- [i82] Mathieu Rita, Paul Michel, Rahma Chaabouni, Olivier Pietquin, Emmanuel Dupoux, Florian Strub: Language Evolution with Deep Learning. CoRR abs/2403.11958 (2024)
- [i81] Mathieu Rita, Florian Strub, Rahma Chaabouni, Paul Michel, Emmanuel Dupoux, Olivier Pietquin: Countering Reward Over-optimization in LLM with Demonstration-Guided Reinforcement Learning. CoRR abs/2404.19409 (2024)
- [i80] Eugene Choi, Arash Ahmadian, Matthieu Geist, Olivier Pietquin, Mohammad Gheshlaghi Azar: Self-Improving Robust Preference Optimization. CoRR abs/2406.01660 (2024)
- [i79] Yannis Flet-Berliac, Nathan Grinsztajn, Florian Strub, Eugene Choi, Chris Cremer, Arash Ahmadian, Yash Chandak, Mohammad Gheshlaghi Azar, Olivier Pietquin, Matthieu Geist: Contrastive Policy Gradient: Aligning LLMs on sequence-level scores in a supervised-friendly fashion. CoRR abs/2406.19185 (2024)
- [i78] Nathan Grinsztajn, Yannis Flet-Berliac, Mohammad Gheshlaghi Azar, Florian Strub, Bill Wu, Eugene Choi, Chris Cremer, Arash Ahmadian, Yash Chandak, Olivier Pietquin, Matthieu Geist: Averaging log-likelihoods in direct alignment. CoRR abs/2406.19188 (2024)
- 2023
- [j16] Eugene Kharitonov, Damien Vincent, Zalán Borsos, Raphaël Marinier, Sertan Girgin, Olivier Pietquin, Matt Sharifi, Marco Tagliasacchi, Neil Zeghidour: Speak, Read and Prompt: High-Fidelity Text-to-Speech with Minimal Supervision. Trans. Assoc. Comput. Linguistics 11: 1703-1718 (2023)
- [j15] Zalán Borsos, Raphaël Marinier, Damien Vincent, Eugene Kharitonov, Olivier Pietquin, Matthew Sharifi, Dominik Roblek, Olivier Teboul, David Grangier, Marco Tagliasacchi, Neil Zeghidour: AudioLM: A Language Modeling Approach to Audio Generation. IEEE ACM Trans. Audio Speech Lang. Process. 31: 2523-2533 (2023)
- [c146] Paul Roit, Johan Ferret, Lior Shani, Roee Aharoni, Geoffrey Cideron, Robert Dadashi, Matthieu Geist, Sertan Girgin, Léonard Hussenot, Orgad Keller, Nikola Momchev, Sabela Ramos Garea, Piotr Stanczyk, Nino Vieillard, Olivier Bachem, Gal Elidan, Avinatan Hassidim, Olivier Pietquin, Idan Szpektor: Factually Consistent Summarization via Reinforcement Learning with Textual Entailment Feedback. ACL (1) 2023: 6252-6272
- [c145] Toshinori Kitamura, Tadashi Kozuno, Yunhao Tang, Nino Vieillard, Michal Valko, Wenhao Yang, Jincheng Mei, Pierre Ménard, Mohammad Gheshlaghi Azar, Rémi Munos, Olivier Pietquin, Matthieu Geist, Csaba Szepesvári, Wataru Kumagai, Yutaka Matsuo: Regularization and Variance-Weighted Regression Achieves Minimax Optimality in Linear MDPs: Theory and Practice. ICML 2023: 17135-17175
- [c144] Giorgia Ramponi, Pavel Kolev, Olivier Pietquin, Niao He, Mathieu Laurière, Matthieu Geist: On Imitation in Mean-field Games. NeurIPS 2023
- [i77] Chris Donahue, Antoine Caillon, Adam Roberts, Ethan Manilow, Philippe Esling, Andrea Agostinelli, Mauro Verzetti, Ian Simon, Olivier Pietquin, Neil Zeghidour, Jesse H. Engel: SingSong: Generating musical accompaniments from singing. CoRR abs/2301.12662 (2023)
- [i76] Eugene Kharitonov, Damien Vincent, Zalán Borsos, Raphaël Marinier, Sertan Girgin, Olivier Pietquin, Matthew Sharifi, Marco Tagliasacchi, Neil Zeghidour: Speak, Read and Prompt: High-Fidelity Text-to-Speech with Minimal Supervision. CoRR abs/2302.03540 (2023)
- [i75] Geoffrey Cideron, Baruch Tabanpour, Sebastian Curi, Sertan Girgin, Léonard Hussenot, Gabriel Dulac-Arnold, Matthieu Geist, Olivier Pietquin, Robert Dadashi: Get Back Here: Robust Imitation by Return-to-Distribution Planning. CoRR abs/2305.01400 (2023)
- [i74] Toshinori Kitamura, Tadashi Kozuno, Yunhao Tang, Nino Vieillard, Michal Valko, Wenhao Yang, Jincheng Mei, Pierre Ménard, Mohammad Gheshlaghi Azar, Rémi Munos, Olivier Pietquin, Matthieu Geist, Csaba Szepesvári, Wataru Kumagai, Yutaka Matsuo: Regularization and Variance-Weighted Regression Achieves Minimax Optimality in Linear MDPs: Theory and Practice. CoRR abs/2305.13185 (2023)
- [i73] Paul Roit, Johan Ferret, Lior Shani, Roee Aharoni, Geoffrey Cideron, Robert Dadashi, Matthieu Geist, Sertan Girgin, Léonard Hussenot, Orgad Keller, Nikola Momchev, Sabela Ramos, Piotr Stanczyk, Nino Vieillard, Olivier Bachem, Gal Elidan, Avinatan Hassidim, Olivier Pietquin, Idan Szpektor: Factually Consistent Summarization via Reinforcement Learning with Textual Entailment Feedback. CoRR abs/2306.00186 (2023)
- [i72] Giorgia Ramponi, Pavel Kolev, Olivier Pietquin, Niao He, Mathieu Laurière, Matthieu Geist: On Imitation in Mean-field Games. CoRR abs/2306.14799 (2023)
- [i71] Kai Cui, Gökçe Dayanikli, Mathieu Laurière, Matthieu Geist, Olivier Pietquin, Heinz Koeppl: Learning Discrete-Time Major-Minor Mean Field Games. CoRR abs/2312.10787 (2023)
- 2022
- [c143] Shideh Rezaeifar, Robert Dadashi, Nino Vieillard, Léonard Hussenot, Olivier Bachem, Olivier Pietquin, Matthieu Geist: Offline Reinforcement Learning as Anti-exploration. AAAI 2022: 8106-8114
- [c142] Sarah Perrin, Mathieu Laurière, Julien Pérolat, Romuald Élie, Matthieu Geist, Olivier Pietquin: Generalization in Mean Field Games by Learning Master Policies. AAAI 2022: 9413-9421
- [c141] Nino Vieillard, Marcin Andrychowicz, Anton Raichuk, Olivier Pietquin, Matthieu Geist: Implicitly Regularized RL with Implicit Q-values. AISTATS 2022: 1380-1402
- [c140] Matthieu Geist, Julien Pérolat, Mathieu Laurière, Romuald Elie, Sarah Perrin, Olivier Bachem, Rémi Munos, Olivier Pietquin: Concave Utility Reinforcement Learning: The Mean-field Game Viewpoint. AAMAS 2022: 489-497
- [c139] Alexis Jacq, Johan Ferret, Olivier Pietquin, Matthieu Geist: Lazy-MDPs: Towards Interpretable RL by Learning When to Act. AAMAS 2022: 669-677
- [c138] Paul Muller, Mark Rowland, Romuald Elie, Georgios Piliouras, Julien Pérolat, Mathieu Laurière, Raphaël Marinier, Olivier Pietquin, Karl Tuyls: Learning Equilibria in Mean-Field Games: Introducing Mean-Field PSRO. AAMAS 2022: 926-934
- [c137] Julien Pérolat, Sarah Perrin, Romuald Elie, Mathieu Laurière, Georgios Piliouras, Matthieu Geist, Karl Tuyls, Olivier Pietquin: Scaling Mean Field Games by Online Mirror Descent. AAMAS 2022: 1028-1037
- [c136] Theophile Cabannes, Mathieu Laurière, Julien Pérolat, Raphaël Marinier, Sertan Girgin, Sarah Perrin, Olivier Pietquin, Alexandre M. Bayen, Eric Goubault, Romuald Elie: Solving N-Player Dynamic Routing Games with Congestion: A Mean-Field Approach. AAMAS 2022: 1557-1559
- [c135] Mathieu Rita, Florian Strub, Jean-Bastien Grill, Olivier Pietquin, Emmanuel Dupoux: On the role of population heterogeneity in emergent communication. ICLR 2022
- [c134] Robert Dadashi, Léonard Hussenot, Damien Vincent, Sertan Girgin, Anton Raichuk, Matthieu Geist, Olivier Pietquin: Continuous Control with Action Quantization from Demonstrations. ICML 2022: 4537-4557
- [c133] Mathieu Laurière, Sarah Perrin, Sertan Girgin, Paul Muller, Ayush Jain, Theophile Cabannes, Georgios Piliouras, Julien Pérolat, Romuald Elie, Olivier Pietquin, Matthieu Geist: Scalable Deep Reinforcement Learning Algorithms for Mean Field Games. ICML 2022: 12078-12095
- [c132] Alice Martin, Guillaume Quispe, Charles Ollion, Sylvain Le Corff, Florian Strub, Olivier Pietquin: Learning Natural Language Generation with Truncated Reinforcement Learning. NAACL-HLT 2022: 12-37
- [c131] Mathieu Rita, Corentin Tallec, Paul Michel, Jean-Bastien Grill, Olivier Pietquin, Emmanuel Dupoux, Florian Strub: Emergent Communication: Generalization and Overfitting in Lewis Games. NeurIPS 2022
- [i70] Alexis Jacq, Johan Ferret, Olivier Pietquin, Matthieu Geist: Lazy-MDPs: Towards Interpretable Reinforcement Learning by Learning When to Act. CoRR abs/2203.08542 (2022)
- [i69] Mathieu Laurière, Sarah Perrin, Sertan Girgin, Paul Muller, Ayush Jain, Theophile Cabannes, Georgios Piliouras, Julien Pérolat, Romuald Élie, Olivier Pietquin, Matthieu Geist: Scalable Deep Reinforcement Learning Algorithms for Mean Field Games. CoRR abs/2203.11973 (2022)
- [i68] Mathieu Rita, Florian Strub, Jean-Bastien Grill, Olivier Pietquin, Emmanuel Dupoux: On the role of population heterogeneity in emergent communication. CoRR abs/2204.12982 (2022)
- [i67] Mathieu Laurière, Sarah Perrin, Matthieu Geist, Olivier Pietquin: Learning Mean Field Games: A Survey. CoRR abs/2205.12944 (2022)
- [i66] Tadashi Kozuno, Wenhao Yang, Nino Vieillard, Toshinori Kitamura, Yunhao Tang, Jincheng Mei, Pierre Ménard, Mohammad Gheshlaghi Azar, Michal Valko, Rémi Munos, Olivier Pietquin, Matthieu Geist, Csaba Szepesvári: KL-Entropy-Regularized RL with a Generative Model is Minimax Optimal. CoRR abs/2205.14211 (2022)
- [i65] Paul Muller, Romuald Elie, Mark Rowland, Mathieu Laurière, Julien Pérolat, Sarah Perrin, Matthieu Geist, Georgios Piliouras, Olivier Pietquin, Karl Tuyls: Learning Correlated Equilibria in Mean-Field Games. CoRR abs/2208.10138 (2022)
- [i64] Zalán Borsos, Raphaël Marinier, Damien Vincent, Eugene Kharitonov, Olivier Pietquin, Matthew Sharifi, Olivier Teboul, David Grangier, Marco Tagliasacchi, Neil Zeghidour: AudioLM: a Language Modeling Approach to Audio Generation. CoRR abs/2209.03143 (2022)
- [i63] Geoffrey Cideron, Sertan Girgin, Anton Raichuk, Olivier Pietquin, Olivier Bachem, Léonard Hussenot: vec2text with Round-Trip Translations. CoRR abs/2209.06792 (2022)
- [i62] Mathieu Rita, Corentin Tallec, Paul Michel, Jean-Bastien Grill, Olivier Pietquin, Emmanuel Dupoux, Florian Strub: Emergent Communication: Generalization and Overfitting in Lewis Games. CoRR abs/2209.15342 (2022)
- [i61] Alexis Jacq, Manu Orsini, Gabriel Dulac-Arnold, Olivier Pietquin, Matthieu Geist, Olivier Bachem: C3PO: Learning to Achieve Arbitrary Goals via Massively Entropic Pretraining. CoRR abs/2211.03521 (2022)
- 2021
- [c130] Johan Ferret, Olivier Pietquin, Matthieu Geist: Self-Imitation Advantage Learning. AAMAS 2021: 501-509
- [c129] Léonard Hussenot, Robert Dadashi, Matthieu Geist, Olivier Pietquin: Show Me the Way: Intrinsic Motivation from Demonstrations. AAMAS 2021: 620-628
- [c128] Aaqib Saeed, David Grangier, Olivier Pietquin, Neil Zeghidour: Learning From Heterogeneous Eeg Signals with Differentiable Channel Reordering. ICASSP 2021: 1255-1259
- [c127] Marcin Andrychowicz, Anton Raichuk, Piotr Stanczyk, Manu Orsini, Sertan Girgin, Raphaël Marinier, Léonard Hussenot, Matthieu Geist, Olivier Pietquin, Marcin Michalski, Sylvain Gelly, Olivier Bachem: What Matters for On-Policy Deep Actor-Critic Methods? A Large-Scale Study. ICLR 2021
- [c126] Robert Dadashi, Léonard Hussenot, Matthieu Geist, Olivier Pietquin: Primal Wasserstein Imitation Learning. ICLR 2021
- [c125] Yannis Flet-Berliac, Johan Ferret, Olivier Pietquin, Philippe Preux, Matthieu Geist: Adversarially Guided Actor-Critic. ICLR 2021
- [c124] Robert Dadashi, Shideh Rezaeifar, Nino Vieillard, Léonard Hussenot, Olivier Pietquin, Matthieu Geist: Offline Reinforcement Learning with Pseudometric Learning. ICML 2021: 2307-2318
- [c123] Léonard Hussenot, Marcin Andrychowicz, Damien Vincent, Robert Dadashi, Anton Raichuk, Sabela Ramos, Nikola Momchev, Sertan Girgin, Raphaël Marinier, Lukasz Stafiniak, Manu Orsini, Olivier Bachem, Matthieu Geist, Olivier Pietquin: Hyperparameter Selection for Imitation Learning. ICML 2021: 4511-4522
- [c122] Sarah Perrin, Mathieu Laurière, Julien Pérolat, Matthieu Geist, Romuald Élie, Olivier Pietquin: Mean Field Games Flock! The Reinforcement Learning Way. IJCAI 2021: 356-362
- [c121] Mathieu Seurin, Florian Strub, Philippe Preux, Olivier Pietquin: Don't Do What Doesn't Matter: Intrinsic Motivation with Action Usefulness. IJCAI 2021: 2950-2956
- [c120] Nathan Grinsztajn, Johan Ferret, Olivier Pietquin, Philippe Preux, Matthieu Geist: There Is No Turning Back: A Self-Supervised Approach for Reversibility-Aware Reinforcement Learning. NeurIPS 2021: 1898-1911
- [c119] Manu Orsini, Anton Raichuk, Léonard Hussenot, Damien Vincent, Robert Dadashi, Sertan Girgin, Matthieu Geist, Olivier Bachem, Olivier Pietquin, Marcin Andrychowicz: What Matters for Adversarial Imitation Learning? NeurIPS 2021: 14656-14668
- [i60] Yannis Flet-Berliac, Johan Ferret, Olivier Pietquin, Philippe Preux, Matthieu Geist: Adversarially Guided Actor-Critic. CoRR abs/2102.04376 (2021)
- [i59] Julien Pérolat, Sarah Perrin, Romuald Elie, Mathieu Laurière, Georgios Piliouras, Matthieu Geist, Karl Tuyls, Olivier Pietquin: Scaling up Mean Field Games with Online Mirror Descent. CoRR abs/2103.00623 (2021)
- [i58] Robert Dadashi, Shideh Rezaeifar, Nino Vieillard, Léonard Hussenot, Olivier Pietquin, Matthieu Geist: Offline Reinforcement Learning with Pseudometric Learning. CoRR abs/2103.01948 (2021)
- [i57] Sarah Perrin, Mathieu Laurière, Julien Pérolat, Matthieu Geist, Romuald Élie, Olivier Pietquin: Mean Field Games Flock! The Reinforcement Learning Way. CoRR abs/2105.07933 (2021)
- [i56] Mathieu Seurin, Florian Strub, Philippe Preux, Olivier Pietquin: Don't Do What Doesn't Matter: Intrinsic Motivation with Action Usefulness. CoRR abs/2105.09992 (2021)
- [i55] Léonard Hussenot, Marcin Andrychowicz, Damien Vincent, Robert Dadashi, Anton Raichuk, Lukasz Stafiniak, Sertan Girgin, Raphaël Marinier, Nikola Momchev, Sabela Ramos, Manu Orsini, Olivier Bachem, Matthieu Geist, Olivier Pietquin: Hyperparameter Selection for Imitation Learning. CoRR abs/2105.12034 (2021)
- [i54] Manu Orsini, Anton Raichuk, Léonard Hussenot, Damien Vincent, Robert Dadashi, Sertan Girgin, Matthieu Geist, Olivier Bachem, Olivier Pietquin, Marcin Andrychowicz: What Matters for Adversarial Imitation Learning? CoRR abs/2106.00672 (2021)
- [i53] Matthieu Geist, Julien Pérolat, Mathieu Laurière, Romuald Elie, Sarah Perrin, Olivier Bachem, Rémi Munos, Olivier Pietquin: Concave Utility Reinforcement Learning: the Mean-field Game viewpoint. CoRR abs/2106.03787 (2021)
- [i52] Nathan Grinsztajn, Johan Ferret, Olivier Pietquin, Philippe Preux, Matthieu Geist: There Is No Turning Back: A Self-Supervised Approach for Reversibility-Aware Reinforcement Learning. CoRR abs/2106.04480 (2021)
- [i51] Shideh Rezaeifar, Robert Dadashi, Nino Vieillard, Léonard Hussenot, Olivier Bachem, Olivier Pietquin, Matthieu Geist: Offline Reinforcement Learning as Anti-Exploration. CoRR abs/2106.06431 (2021)
- [i50] Nino Vieillard, Marcin Andrychowicz, Anton Raichuk, Olivier Pietquin, Matthieu Geist: Implicitly Regularized RL with Implicit Q-Values. CoRR abs/2108.07041 (2021)
- [i49] Alice Martin, Guillaume Quispe, Charles Ollion, Sylvain Le Corff, Florian Strub, Olivier Pietquin: Learning Natural Language Generation from Scratch. CoRR abs/2109.09371 (2021)
- [i48] Sarah Perrin, Mathieu Laurière, Julien Pérolat, Romuald Élie, Matthieu Geist, Olivier Pietquin: Generalization in Mean Field Games by Learning Master Policies. CoRR abs/2109.09717 (2021)
- [i47] Robert Dadashi, Léonard Hussenot, Damien Vincent, Sertan Girgin, Anton Raichuk, Matthieu Geist, Olivier Pietquin: Continuous Control with Action Quantization from Demonstrations. CoRR abs/2110.10149 (2021)
- [i46] Theophile Cabannes, Mathieu Laurière, Julien Pérolat, Raphaël Marinier, Sertan Girgin, Sarah Perrin, Olivier Pietquin, Alexandre M. Bayen, Éric Goubault, Romuald Elie: Solving N-player dynamic routing games with congestion: a mean field approach. CoRR abs/2110.11943 (2021)
- [i45] Sabela Ramos, Sertan Girgin, Léonard Hussenot, Damien Vincent, Hanna Yakubovich, Daniel Toyama, Anita Gergely, Piotr Stanczyk, Raphaël Marinier, Jeremiah Harmsen, Olivier Pietquin, Nikola Momchev: RLDS: an Ecosystem to Generate, Share and Use Datasets in Reinforcement Learning. CoRR abs/2111.02767 (2021)
- [i44] Paul Muller, Mark Rowland, Romuald Elie, Georgios Piliouras, Julien Pérolat, Mathieu Laurière, Raphaël Marinier, Olivier Pietquin, Karl Tuyls: Learning Equilibria in Mean-Field Games: Introducing Mean-Field PSRO. CoRR abs/2111.08350 (2021)
- 2020
- [c118] Nino Vieillard, Olivier Pietquin, Matthieu Geist: Deep Conservative Policy Iteration. AAAI 2020: 6070-6077
- [c117] Romuald Elie, Julien Pérolat, Mathieu Laurière, Matthieu Geist, Olivier Pietquin: On the Convergence of Model Free Learning in Mean Field Games. AAAI 2020: 7143-7150
- [c116] Alexis Jacq, Julien Pérolat, Matthieu Geist, Olivier Pietquin: Foolproof Cooperative Learning. ACML 2020: 401-416
- [c115] Nino Vieillard, Bruno Scherrer, Olivier Pietquin, Matthieu Geist: Momentum in Reinforcement Learning. AISTATS 2020: 2529-2538
- [c114] Léonard Hussenot, Matthieu Geist, Olivier Pietquin: CopyCAT: Taking Control of Neural Policies with Constant Attacks. AAMAS 2020: 548-556
- [c113] Yuchen Lu, Soumye Singhal, Florian Strub, Olivier Pietquin, Aaron C. Courville: Supervised Seeded Iterated Learning for Interactive Language Learning. EMNLP (1) 2020: 3962-3970
- [c112] Yuchen Lu, Soumye Singhal, Florian Strub, Aaron C. Courville, Olivier Pietquin: Countering Language Drift with Seeded Iterated Learning. ICML 2020: 6437-6447
- [c111] Johan Ferret, Raphaël Marinier, Matthieu Geist, Olivier Pietquin: Self-Attentional Credit Assignment for Transfer in Reinforcement Learning. IJCAI 2020: 2655-2661
- [c110] Mathieu Seurin, Philippe Preux, Olivier Pietquin: "I'm Sorry Dave, I'm Afraid I Can't Do That" Deep Q-Learning from Forbidden Actions. IJCNN 2020: 1-8
- [c109] Mathieu Seurin, Florian Strub, Philippe Preux, Olivier Pietquin: A Machine of Few Words: Interactive Speaker Recognition with Reinforcement Learning. INTERSPEECH 2020: 4323-4327
- [c108] Sarah Perrin, Julien Pérolat, Mathieu Laurière, Matthieu Geist, Romuald Elie, Olivier Pietquin: Fictitious Play for Mean Field Games: Continuous Time Analysis and Applications. NeurIPS 2020
- [c107] Nino Vieillard, Tadashi Kozuno, Bruno Scherrer, Olivier Pietquin, Rémi Munos, Matthieu Geist: Leverage the Average: an Analysis of KL Regularization in Reinforcement Learning. NeurIPS 2020
- [c106] Nino Vieillard, Olivier Pietquin, Matthieu Geist: Munchausen Reinforcement Learning. NeurIPS 2020
- [c105] Geoffrey Cideron, Mathieu Seurin, Florian Strub, Olivier Pietquin: HIGhER: Improving instruction following with Hindsight Generation for Experience Replay. SSCI 2020: 225-232
- [p3] Olivier Buffet, Olivier Pietquin, Paul Weng: Reinforcement Learning. A Guided Tour of Artificial Intelligence Research (1) 2020: 389-414
- [e1] Olivier Pietquin, Smaranda Muresan, Vivian Chen, Casey Kennington, David Vandyke, Nina Dethlefs, Koji Inoue, Erik Ekstedt, Stefan Ultes: Proceedings of the 21th Annual Meeting of the Special Interest Group on Discourse and Dialogue, SIGdial 2020, 1st virtual meeting, July 1-3, 2020. Association for Computational Linguistics 2020, ISBN 978-1-952148-02-6 [contents]
- [i43] Yuchen Lu, Soumye Singhal, Florian Strub, Olivier Pietquin, Aaron C. Courville: Countering Language Drift with Seeded Iterated Learning. CoRR abs/2003.12694 (2020)
- [i42] Nino Vieillard, Tadashi Kozuno, Bruno Scherrer, Olivier Pietquin, Rémi Munos, Matthieu Geist: Leverage the Average: an Analysis of Regularization in RL. CoRR abs/2003.14089 (2020)
- [i41] Olivier Buffet, Olivier Pietquin, Paul Weng: Reinforcement Learning. CoRR abs/2005.14419 (2020)
- [i40] Robert Dadashi, Léonard Hussenot, Matthieu Geist, Olivier Pietquin: Primal Wasserstein Imitation Learning. CoRR abs/2006.04678 (2020)
- [i39] Marcin Andrychowicz, Anton Raichuk, Piotr Stanczyk, Manu Orsini, Sertan Girgin, Raphaël Marinier, Léonard Hussenot, Matthieu Geist, Olivier Pietquin, Marcin Michalski, Sylvain Gelly, Olivier Bachem: What Matters In On-Policy Reinforcement Learning? A Large-Scale Empirical Study. CoRR abs/2006.05990 (2020)
- [i38] Léonard Hussenot, Robert Dadashi, Matthieu Geist, Olivier Pietquin: Show me the Way: Intrinsic Motivation from Demonstrations. CoRR abs/2006.12917 (2020)
- [i37] Sarah Perrin, Julien Pérolat, Mathieu Laurière, Matthieu Geist, Romuald Elie, Olivier Pietquin: Fictitious Play for Mean Field Games: Continuous Time Analysis and Applications. CoRR abs/2007.03458 (2020)
- [i36] Alice Martin, Charles Ollion, Florian Strub, Sylvain Le Corff, Olivier Pietquin: The Monte Carlo Transformer: a stochastic self-attention model for sequence prediction. CoRR abs/2007.08620 (2020)
- [i35] Nino Vieillard, Olivier Pietquin, Matthieu Geist: Munchausen Reinforcement Learning. CoRR abs/2007.14430 (2020)
- [i34] Mathieu Seurin, Florian Strub, Philippe Preux, Olivier Pietquin: A Machine of Few Words - Interactive Speaker Recognition with Reinforcement Learning. CoRR abs/2008.03127 (2020)
- [i33] Yuchen Lu, Soumye Singhal, Florian Strub, Olivier Pietquin, Aaron C. Courville: Supervised Seeded Iterated Learning for Interactive Language Learning. CoRR abs/2010.02975 (2020)
- [i32] Aaqib Saeed, David Grangier, Olivier Pietquin, Neil Zeghidour: Learning from Heterogeneous EEG Signals with Differentiable Channel Reordering. CoRR abs/2010.13694 (2020)
- [i31] Johan Ferret, Olivier Pietquin, Matthieu Geist: Self-Imitation Advantage Learning. CoRR abs/2012.11989 (2020)
2010 – 2019
- 2019
- [c104] Diana Borsa, Nicolas Heess, Bilal Piot, Siqi Liu, Leonard Hasenclever, Rémi Munos, Olivier Pietquin: Observational Learning by Reinforcement Learning. AAMAS 2019: 1117-1124
- [c103] Matthieu Geist, Bruno Scherrer, Olivier Pietquin: A Theory of Regularized Markov Decision Processes. ICML 2019: 2160-2169
- [c102] Alexis Jacq, Matthieu Geist, Ana Paiva, Olivier Pietquin: Learning from a Learner. ICML 2019: 2990-2999
- [c101] Nicolas Carrara, Edouard Leurent, Romain Laroche, Tanguy Urvoy, Odalric-Ambrym Maillard, Olivier Pietquin: Budgeted Reinforcement Learning in Continuous State Space. NeurIPS 2019: 9295-9305
- [i30]