Yaodong Yang 0001
Yaodong Yang 0001 – 杨耀东
Person information
- unicode name: 杨耀东
- affiliation: Peking University, Institute for AI, Beijing, China
- affiliation (former): King's College London, UK
- affiliation (former): Huawei Technologies, Noah's Ark Lab, UK
- affiliation (PhD): University College London, UK
Other persons with the same name
- Yaodong Yang — disambiguation page
- Yaodong Yang 0002 — Chinese University of Hong Kong, Hong Kong (and 2 more)
- Yaodong Yang 0003 — University of Nebraska - Lincoln, USA
- Yaodong Yang 0004 — University of Science and Technology Beijing, Beijing, China
- Yaodong Yang 0005 — Hefei University of Technology, School of Mathematics, China
- Adam X. Yang (aka: Adam Yang 0002) — University of Bristol, Department of Computer Science, UK
- Adam Yang 0003 — University of Maryland, Department of Computer Science, USA
2020 – today
- 2024
- [j20]Yifan Zhong, Jakub Grudzien Kuba, Xidong Feng, Siyi Hu, Jiaming Ji, Yaodong Yang:
Heterogeneous-Agent Reinforcement Learning. J. Mach. Learn. Res. 25: 32:1-32:67 (2024) - [j19]Dongzi Wang, Fangwei Zhong, Minglong Li, Muning Wen, Yuanxi Peng, Teng Li, Yaodong Yang:
RoMAT: Role-based multi-agent transformer for generalizable heterogeneous cooperation. Neural Networks 174: 106129 (2024) - [j18]Jie Liu, Yinmin Zhang, Chuming Li, Yaodong Yang, Yu Liu, Wanli Ouyang:
Adaptive pessimism via target Q-value for offline reinforcement learning. Neural Networks 180: 106588 (2024) - [j17]Yuanpei Chen, Yiran Geng, Fangwei Zhong, Jiaming Ji, Jiechuang Jiang, Zongqing Lu, Hao Dong, Yaodong Yang:
Bi-DexHands: Towards Human-Level Bimanual Dexterous Manipulation. IEEE Trans. Pattern Anal. Mach. Intell. 46(5): 2804-2818 (2024) - [j16]Chenguang Wang, Zhouliang Yu, Stephen McAleer, Tianshu Yu, Yaodong Yang:
ASP: Learn a Universal Neural Solver! IEEE Trans. Pattern Anal. Mach. Intell. 46(6): 4102-4114 (2024) - [j15]Yuyang Li, Bo Liu, Yiran Geng, Puhao Li, Yaodong Yang, Yixin Zhu, Tengyu Liu, Siyuan Huang:
Grasp Multiple Objects With One Hand. IEEE Robotics Autom. Lett. 9(5): 4027-4034 (2024) - [j14]Yang Li, Fanglei Sun, Jingchen Hu, Chang Liu, Fan Wu, Kai Li, Ying Wen, Zheng Tian, Yaodong Yang, Jiangcheng Zhu, Zhifeng Chen, Jun Wang, Yang Yang:
Self-Supervised MAFENN for Classifying Low-Labeled Distorted Images Over Mobile Fading Channels. IEEE Trans. Mob. Comput. 23(8): 8077-8091 (2024) - [c68]Yinmin Zhang, Jie Liu, Chuming Li, Yazhe Niu, Yaodong Yang, Yu Liu, Wanli Ouyang:
A Perspective of Q-value Estimation on Offline-to-Online Reinforcement Learning. AAAI 2024: 16908-16916 - [c67]Sirui Chen, Zhaowei Zhang, Yaodong Yang, Yali Du:
STAS: Spatial-Temporal Return Decomposition for Solving Sparse Rewards Problems in Multi-agent Reinforcement Learning. AAAI 2024: 17337-17345 - [c66]Ceyao Zhang, Kaijie Yang, Siyi Hu, Zihao Wang, Guanghe Li, Yihang Sun, Cheng Zhang, Zhaowei Zhang, Anji Liu, Song-Chun Zhu, Xiaojun Chang, Junge Zhang, Feng Yin, Yitao Liang, Yaodong Yang:
ProAgent: Building Proactive Cooperative Agents with Large Language Models. AAAI 2024: 17591-17599 - [c65]Shaoting Feng, Qinya Li, Yaodong Yang, Fan Wu, Guihai Chen:
GIPUT: Maximizing Photo Coverage Efficiency for UAV Trajectory. APWeb/WAIM (1) 2024: 391-406 - [c64]Le Cong Dinh, David Henry Mguni, Long Tran-Thanh, Jun Wang, Yaodong Yang:
A Summary of Online Markov Decision Processes with Non-oblivious Strategic Adversary. AAMAS 2024: 2830-2832 - [c63]Jieming Cui, Tengyu Liu, Nian Liu, Yaodong Yang, Yixin Zhu, Siyuan Huang:
AnySkill: Learning Open-Vocabulary Physical Skill for Interactive Agents. CVPR 2024: 852-862 - [c62]Weidong Huang, Jiaming Ji, Chunhe Xia, Borong Zhang, Yaodong Yang:
SafeDreamer: Safe Reinforcement Learning with World Models. ICLR 2024 - [c61]Josef Dai, Xuehai Pan, Ruiyang Sun, Jiaming Ji, Xinbo Xu, Mickel Liu, Yizhou Wang, Yaodong Yang:
Safe RLHF: Safe Reinforcement Learning from Human Feedback. ICLR 2024 - [c60]Simin Li, Jun Guo, Jingqiao Xiu, Ruixiao Xu, Xin Yu, Jiakai Wang, Aishan Liu, Yaodong Yang, Xianglong Liu:
Byzantine Robust Cooperative Multi-Agent Reinforcement Learning as a Bayesian Game. ICLR 2024 - [c59]Jiarong Liu, Yifan Zhong, Siyi Hu, Haobo Fu, Qiang Fu, Xiaojun Chang, Yaodong Yang:
Maximum Entropy Heterogeneous-Agent Reinforcement Learning. ICLR 2024 - [c58]Siyuan Qi, Shuo Chen, Yexin Li, Xiangyu Kong, Junqi Wang, Bangcheng Yang, Pring Wong, Yifan Zhong, Xiaoyuan Zhang, Zhaowei Zhang, Nian Liu, Yaodong Yang, Song-Chun Zhu:
CivRealm: A Learning and Reasoning Odyssey in Civilization for Decision-Making Agents. ICLR 2024 - [c57]Juntao Dai, Yaodong Yang, Qian Zheng, Gang Pan:
Safe Reinforcement Learning using Finite-Horizon Gradient-based Estimation. ICML 2024 - [c56]Yizhe Huang, Anji Liu, Fanqi Kong, Yaodong Yang, Song-Chun Zhu, Xue Feng:
Efficient Adaptation in Mixed-Motive Environments via Hierarchical Opponent Modeling and Planning. ICML 2024 - [c55]Ruiqing Chen, Xiaoyuan Zhang, Yali Du, Yifan Zhong, Zheng Tian, Fanglei Sun, Yaodong Yang:
Off-Agent Trust Region Policy Optimization. IJCAI 2024: 3798-3806 - [c54]Yue Zhang, Yaodong Yang, Zhenbo Lu, Wengang Zhou, Houqiang Li:
Remember the Past for Better Future: Memory-Augmented Offline RL. IJCNN 2024: 1-8 - [i106]Siyuan Qi, Shuo Chen, Yexin Li, Xiangyu Kong, Junqi Wang, Bangcheng Yang, Pring Wong, Yifan Zhong, Xiaoyuan Zhang, Zhaowei Zhang, Nian Liu, Wei Wang, Yaodong Yang, Song-Chun Zhu:
CivRealm: A Learning and Reasoning Odyssey in Civilization for Decision-Making Agents. CoRR abs/2401.10568 (2024) - [i105]Yifan Zhong, Chengdong Ma, Xiaoyuan Zhang, Ziran Yang, Qingfu Zhang, Siyuan Qi, Yaodong Yang:
Panacea: Pareto Alignment via Preference Adaptation for LLMs. CoRR abs/2402.02030 (2024) - [i104]Jiaming Ji, Boyuan Chen, Hantao Lou, Donghai Hong, Borong Zhang, Xuehai Pan, Juntao Dai, Yaodong Yang:
Aligner: Achieving Efficient Alignment through Weak-to-Strong Correction. CoRR abs/2402.02416 (2024) - [i103]Tianyi Qiu, Fanzhi Zeng, Jiaming Ji, Dong Yan, Kaile Wang, Jiayi Zhou, Han Yang, Josef Dai, Xuehai Pan, Yaodong Yang:
Rethinking Information Structures in RLHF: Reward Generalization from a Graph Theory Perspective. CoRR abs/2402.10184 (2024) - [i102]Zhaowei Zhang, Fengshuo Bai, Mingzhi Wang, Haoyang Ye, Chengdong Ma, Yaodong Yang:
Incentive Compatibility for AI Alignment in Sociotechnical Systems: Positions and Prospects. CoRR abs/2402.12907 (2024) - [i101]Naming Liu, Mingzhi Wang, Youzhi Zhang, Yaodong Yang, Bo An, Ying Wen:
Leveraging Team Correlation for Approximating Equilibrium in Two-Team Zero-Sum Games. CoRR abs/2403.00255 (2024) - [i100]Tianhao Wu, Yunchong Gan, Mingdong Wu, Jingbo Cheng, Yaodong Yang, Yixin Zhu, Hao Dong:
UniDexFPM: Universal Dexterous Functional Pre-grasp Manipulation Via Diffusion Policy. CoRR abs/2403.12421 (2024) - [i99]Jieming Cui, Tengyu Liu, Nian Liu, Yaodong Yang, Yixin Zhu, Siyuan Huang:
AnySkill: Learning Open-Vocabulary Physical Skill for Interactive Agents. CoRR abs/2403.12835 (2024) - [i98]Zhiyu Zhao, Ning Yang, Xue Yan, Haifeng Zhang, Jun Wang, Yaodong Yang:
Correlated Mean Field Imitation Learning. CoRR abs/2404.09324 (2024) - [i97]Fengshuo Bai, Rui Zhao, Hongming Zhang, Sijia Cui, Ying Wen, Yaodong Yang, Bo Xu, Lei Han:
Efficient Preference-based Reinforcement Learning via Aligned Experience Estimation. CoRR abs/2405.18688 (2024) - [i96]Fengshuo Bai, Mingzhi Wang, Zhaowei Zhang, Boyuan Chen, Yinda Xu, Ying Wen, Yaodong Yang:
Efficient Model-agnostic Alignment via Bayesian Persuasion. CoRR abs/2405.18718 (2024) - [i95]Jiesong Lian, Yucong Huang, Mingzhi Wang, Chengdong Ma, Yixue Hao, Ying Wen, Yaodong Yang:
Fusion-PSRO: Nash Policy Fusion for Policy Space Response Oracles. CoRR abs/2405.21027 (2024) - [i94]Jiaming Ji, Kaile Wang, Tianyi Qiu, Boyuan Chen, Jiayi Zhou, Changye Li, Hantao Lou, Yaodong Yang:
Language Models Resist Alignment. CoRR abs/2406.06144 (2024) - [i93]Yizhe Huang, Anji Liu, Fanqi Kong, Yaodong Yang, Song-Chun Zhu, Xue Feng:
Efficient Adaptation in Mixed-Motive Environments via Hierarchical Opponent Modeling and Planning. CoRR abs/2406.08002 (2024) - [i92]Josef Dai, Tianle Chen, Xuyao Wang, Ziran Yang, Taiye Chen, Jiaming Ji, Yaodong Yang:
SafeSora: Towards Safety Alignment of Text2Video Generation via a Human Preference Dataset. CoRR abs/2406.14477 (2024) - [i91]Jiaming Ji, Donghai Hong, Borong Zhang, Boyuan Chen, Josef Dai, Boren Zheng, Tianyi Qiu, Boxun Li, Yaodong Yang:
PKU-SafeRLHF: A Safety Alignment Preference Dataset for Llama Family Models. CoRR abs/2406.15513 (2024) - [i90]Tianyi Qiu, Yang Zhang, Xuchuan Huang, Jasmine Xinze Li, Jiaming Ji, Yaodong Yang:
ProgressGym: Alignment with a Millennium of Moral Progress. CoRR abs/2406.20087 (2024) - [i89]Ruize Zhang, Zelai Xu, Chengdong Ma, Chao Yu, Wei-Wei Tu, Shiyu Huang, Deheng Ye, Wenbo Ding, Yaodong Yang, Yu Wang:
A Survey on Self-play Methods in Reinforcement Learning. CoRR abs/2408.01072 (2024) - [i88]Jiayi Zhou, Jiaming Ji, Juntao Dai, Yaodong Yang:
Sequence to Sequence Reward Modeling: Improving RLHF by Language Feedback. CoRR abs/2409.00162 (2024) - [i87]Naming Liu, Mingzhi Wang, Xihuai Wang, Weinan Zhang, Yaodong Yang, Youzhi Zhang, Bo An, Ying Wen:
Computing Ex Ante Equilibrium in Heterogeneous Zero-Sum Team Games. CoRR abs/2410.01575 (2024) - [i86]Qianxu Wang, Congyue Deng, Tyler Ga Wei Lum, Yuanpei Chen, Yaodong Yang, Jeannette Bohg, Yixin Zhu, Leonidas J. Guibas:
Neural Attention Field: Emerging Point Relevance in 3D Scenes for One-Shot Dexterous Grasping. CoRR abs/2410.23039 (2024) - 2023
- [j13]Le Cong Dinh, David Henry Mguni, Long Tran-Thanh, Jun Wang, Yaodong Yang:
Online Markov decision processes with non-oblivious strategic adversary. Auton. Agents Multi Agent Syst. 37(1): 15 (2023) - [j12]Shangding Gu, Jakub Grudzien Kuba, Yuanpei Chen, Yali Du, Long Yang, Alois C. Knoll, Yaodong Yang:
Safe multi-agent reinforcement learning for multi-robot control. Artif. Intell. 319: 103905 (2023) - [j11]Muning Wen, Runji Lin, Hanjing Wang, Yaodong Yang, Ying Wen, Luo Mai, Jun Wang, Hai-Feng Zhang, Weinan Zhang:
Large sequence models for sequential decision-making: a survey. Frontiers Comput. Sci. 17(6): 176349 (2023) - [j10]Linghui Meng, Muning Wen, Chenyang Le, Xiyun Li, Dengpeng Xing, Weinan Zhang, Ying Wen, Haifeng Zhang, Jun Wang, Yaodong Yang, Bo Xu:
Offline Pre-trained Multi-agent Decision Transformer. Mach. Intell. Res. 20(2): 233-248 (2023) - [j9]Ming Zhou, Ziyu Wan, Hanjing Wang, Muning Wen, Runzhe Wu, Ying Wen, Yaodong Yang, Yong Yu, Jun Wang, Weinan Zhang:
MALib: A Parallel Framework for Population-based Multi-agent Reinforcement Learning. J. Mach. Learn. Res. 24: 150:1-150:12 (2023) - [j8]Siyi Hu, Yifan Zhong, Minquan Gao, Weixun Wang, Hao Dong, Xiaodan Liang, Zhihui Li, Xiaojun Chang, Yaodong Yang:
MARLlib: A Scalable and Efficient Multi-agent Reinforcement Learning Library. J. Mach. Learn. Res. 24: 315:1-315:23 (2023) - [j7]Jie Ren, Xidong Feng, Bo Liu, Xuehai Pan, Yao Fu, Luo Mai, Yaodong Yang:
TorchOpt: An Efficient Library for Differentiable Optimization. J. Mach. Learn. Res. 24: 367:1-367:14 (2023) - [j6]Yang Li, Kun Xiong, Yingping Zhang, Jiangcheng Zhu, Stephen Marcus McAleer, Wei Pan, Jun Wang, Zonghong Dai, Yaodong Yang:
JiangJun: Mastering Xiangqi by Tackling Non-Transitivity in Two-Player Zero-Sum Games. Trans. Mach. Learn. Res. 2023 (2023) - [c53]Chuming Li, Jie Liu, Yinmin Zhang, Yuhong Wei, Yazhe Niu, Yaodong Yang, Yu Liu, Wanli Ouyang:
ACE: Cooperative Multi-Agent Q-learning with Bidirectional Action-Dependency. AAAI 2023: 8536-8544 - [c52]David Mguni, Taher Jafferjee, Jianhong Wang, Nicolas Perez Nieves, Wenbin Song, Feifei Tong, Matthew E. Taylor, Tianpei Yang, Zipeng Dai, Hui Chen, Jiangcheng Zhu, Kun Shao, Jun Wang, Yaodong Yang:
Learning to Shape Rewards Using a Game of Two Partners. AAAI 2023: 11604-11612 - [c51]Pei Xu, Junge Zhang, Qiyue Yin, Chao Yu, Yaodong Yang, Kaiqi Huang:
Subspace-Aware Exploration for Sparse-Reward Multi-Agent Tasks. AAAI 2023: 11717-11725 - [c50]Zhijian Duan, Wenhan Huang, Dinghuai Zhang, Yali Du, Jun Wang, Yaodong Yang, Xiaotie Deng:
Is Nash Equilibrium Approximator Learnable? AAMAS 2023: 233-241 - [c49]Binghao Huang, Yuanpei Chen, Tianyu Wang, Yuzhe Qin, Yaodong Yang, Nikolay Atanasov, Xiaolong Wang:
Dynamic Handover: Throw and Catch with Bimanual Hands. CoRL 2023: 1887-1902 - [c48]Chuming Li, Ruonan Jia, Jie Liu, Yinmin Zhang, Yazhe Niu, Yaodong Yang, Yu Liu, Wanli Ouyang:
Theoretically Guaranteed Policy Improvement Distilled from Model-Based Planning. ECAI 2023: 1381-1388 - [c47]Weikang Wan, Haoran Geng, Yun Liu, Zikang Shan, Yaodong Yang, Li Yi, He Wang:
UniDexGrasp++: Improving Dexterous Grasping Policy Learning via Geometry-aware Curriculum and Iterative Generalist-Specialist Learning. ICCV 2023: 3868-3879 - [c46]Shuang Wu, Jian Yao, Haobo Fu, Ye Tian, Chao Qian, Yaodong Yang, Qiang Fu, Wei Yang:
Quality-Similar Diversity via Population Based Reinforcement Learning. ICLR 2023 - [c45]David Henry Mguni, Haojun Chen, Taher Jafferjee, Jianhong Wang, Longfei Yue, Xidong Feng, Stephen Marcus McAleer, Feifei Tong, Jun Wang, Yaodong Yang:
MANSA: Learning Fast and Slow in Multi-Agent Systems. ICML 2023: 24631-24658 - [c44]Oliver Slumbers, David Henry Mguni, Stefano B. Blumberg, Stephen Marcus McAleer, Yaodong Yang, Jun Wang:
A Game-Theoretic Framework for Managing Risk in Multi-Agent Systems. ICML 2023: 32059-32087 - [c43]Xiaohang Tang, Le Cong Dinh, Stephen Marcus McAleer, Yaodong Yang:
Regret-Minimizing Double Oracle for Extensive-Form Games. ICML 2023: 33599-33615 - [c42]Hanjing Wang, Man-Kit Sit, Congjie He, Ying Wen, Weinan Zhang, Jun Wang, Yaodong Yang, Luo Mai:
GEAR: A GPU-Centric Experience Replay System for Large Reinforcement Learning Models. ICML 2023: 36380-36390 - [c41]Yiran Geng, Boshi An, Haoran Geng, Yuanpei Chen, Yaodong Yang, Hao Dong:
RLAfford: End-to-End Affordance Learning for Robotic Manipulation. ICRA 2023: 5880-5886 - [c40]Puhao Li, Tengyu Liu, Yuyang Li, Yiran Geng, Yixin Zhu, Yaodong Yang, Siyuan Huang:
GenDexGrasp: Generalizable Dexterous Grasping. ICRA 2023: 8068-8074 - [c39]Jiaming Ji, Mickel Liu, Josef Dai, Xuehai Pan, Chi Zhang, Ce Bian, Boyuan Chen, Ruiyang Sun, Yizhou Wang, Yaodong Yang:
BeaverTails: Towards Improved Safety Alignment of LLM via a Human-Preference Dataset. NeurIPS 2023 - [c38]Jiaming Ji, Borong Zhang, Jiayi Zhou, Xuehai Pan, Weidong Huang, Ruiyang Sun, Yiran Geng, Yifan Zhong, Josef Dai, Yaodong Yang:
Safety Gymnasium: A Unified Safe Reinforcement Learning Benchmark. NeurIPS 2023 - [c37]Stephen McAleer, Gabriele Farina, Gaoyue Zhou, Mingzhi Wang, Yaodong Yang, Tuomas Sandholm:
Team-PSRO for Learning Approximate TMECor in Large Team Games via Cooperative Reinforcement Learning. NeurIPS 2023 - [c36]Mingyu Yang, Yaodong Yang, Zhenbo Lu, Wengang Zhou, Houqiang Li:
Hierarchical Multi-Agent Skill Discovery. NeurIPS 2023 - [c35]Jian Yao, Weiming Liu, Haobo Fu, Yaodong Yang, Stephen McAleer, Qiang Fu, Wei Yang:
Policy Space Diversity for Non-Transitive Games. NeurIPS 2023 - [c34]Youpeng Zhao, Yaodong Yang, Zhenbo Lu, Wengang Zhou, Houqiang Li:
Multi-Agent First Order Constrained Optimization in Policy Space. NeurIPS 2023 - [c33]Huanzhou Zhu, Bo Zhao, Gang Chen, Weifeng Chen, Yijie Chen, Liang Shi, Yaodong Yang, Peter R. Pietzuch, Lei Chen:
MSRL: Distributed Reinforcement Learning with Dataflow Fragments. USENIX ATC 2023: 977-993 - [i85]David Mguni, Taher Jafferjee, Haojun Chen, Jianhong Wang, Long Fei, Xidong Feng, Stephen McAleer, Feifei Tong, Jun Wang, Yaodong Yang:
MANSA: Learning Fast and Slow in Multi-Agent Systems. CoRR abs/2302.05910 (2023) - [i84]Shangding Gu, Alap Kshirsagar, Yali Du, Guang Chen, Yaodong Yang, Jan Peters, Alois C. Knoll:
A Human-Centered Safe Robot Reinforcement Learning Framework with Interactive Behaviors. CoRR abs/2302.13137 (2023) - [i83]Chenguang Wang, Zhouliang Yu, Stephen McAleer, Tianshu Yu, Yaodong Yang:
ASP: Learn a Universal Neural Solver! CoRR abs/2303.00466 (2023) - [i82]Weikang Wan, Haoran Geng, Yun Liu, Zikang Shan, Yaodong Yang, Li Yi, He Wang:
UniDexGrasp++: Improving Dexterous Grasping Policy Learning via Geometry-aware Curriculum and Iterative Generalist-Specialist Learning. CoRR abs/2304.00464 (2023) - [i81]Sirui Chen, Zhaowei Zhang, Yali Du, Yaodong Yang:
STAS: Spatial-Temporal Return Decomposition for Multi-agent Reinforcement Learning. CoRR abs/2304.07520 (2023) - [i80]Yifan Zhong, Jakub Grudzien Kuba, Siyi Hu, Jiaming Ji, Yaodong Yang:
Heterogeneous-Agent Reinforcement Learning. CoRR abs/2304.09870 (2023) - [i79]Xiaohang Tang, Le Cong Dinh, Stephen Marcus McAleer, Yaodong Yang:
Regret-Minimizing Double Oracle for Extensive-Form Games. CoRR abs/2304.10498 (2023) - [i78]Jiaming Ji, Jiayi Zhou, Borong Zhang, Juntao Dai, Xuehai Pan, Ruiyang Sun, Weidong Huang, Yiran Geng, Mickel Liu, Yaodong Yang:
OmniSafe: An Infrastructure for Accelerating Safe Reinforcement Learning Research. CoRR abs/2305.09304 (2023) - [i77]Simin Li, Jun Guo, Jingqiao Xiu, Xini Yu, Jiakai Wang, Aishan Liu, Yaodong Yang, Xianglong Liu:
Byzantine Robust Cooperative Multi-Agent Reinforcement Learning as a Bayesian Game. CoRR abs/2305.12872 (2023) - [i76]Zhaowei Zhang, Nian Liu, Siyuan Qi, Ceyao Zhang, Ziqi Rong, Song-Chun Zhu, Shuguang Cui, Yaodong Yang:
Heterogeneous Value Evaluation for Large Language Models. CoRR abs/2305.17147 (2023) - [i75]Yonggang Jin, Chenxu Wang, Liuyu Xiang, Yaodong Yang, Jie Fu, Zhaofeng He:
Deep Reinforcement Learning with Multitask Episodic Memory Based on Task-Conditioned Hypernetwork. CoRR abs/2306.10698 (2023) - [i74]Jiarong Liu, Yifan Zhong, Siyi Hu, Haobo Fu, Qiang Fu, Xiaojun Chang, Yaodong Yang:
Maximum Entropy Heterogeneous-Agent Mirror Learning. CoRR abs/2306.10715 (2023) - [i73]Muning Wen, Runji Lin, Hanjing Wang, Yaodong Yang, Ying Wen, Luo Mai, Jun Wang, Haifeng Zhang, Weinan Zhang:
Large Sequence Models for Sequential Decision-Making: A Survey. CoRR abs/2306.13945 (2023) - [i72]Jian Yao, Weiming Liu, Haobo Fu, Yaodong Yang, Stephen McAleer, Qiang Fu, Wei Yang:
Policy Space Diversity for Non-Transitive Games. CoRR abs/2306.16884 (2023) - [i71]Jiaming Ji, Mickel Liu, Juntao Dai, Xuehai Pan, Chi Zhang, Ce Bian, Boyuan Zhang, Ruiyang Sun, Yizhou Wang, Yaodong Yang:
BeaverTails: Towards Improved Safety Alignment of LLM via a Human-Preference Dataset. CoRR abs/2307.04657 (2023) - [i70]Weidong Huang, Jiaming Ji, Borong Zhang, Chunhe Xia, Yaodong Yang:
Safe DreamerV3: Safe Reinforcement Learning with World Models. CoRR abs/2307.07176 (2023) - [i69]Chuming Li, Ruonan Jia, Jie Liu, Yinmin Zhang, Yazhe Niu, Yaodong Yang, Yu Liu, Wanli Ouyang:
Theoretically Guaranteed Policy Improvement Distilled from Model-Based Planning. CoRR abs/2307.12933 (2023) - [i68]Yang Li, Kun Xiong, Yingping Zhang, Jiangcheng Zhu, Stephen McAleer, Wei Pan, Jun Wang, Zonghong Dai, Yaodong Yang:
JiangJun: Mastering Xiangqi by Tackling Non-Transitivity in Two-Player Zero-Sum Games. CoRR abs/2308.04719 (2023) - [i67]Ceyao Zhang, Kaijie Yang, Siyi Hu, Zihao Wang, Guanghe Li, Yihang Sun, Cheng Zhang, Zhaowei Zhang, Anji Liu, Song-Chun Zhu, Xiaojun Chang, Junge Zhang, Feng Yin, Yitao Liang, Yaodong Yang:
ProAgent: Building Proactive Cooperative AI with Large Language Models. CoRR abs/2308.11339 (2023) - [i66]Jingbang Chen, Yian Wang, Xingwei Qu, Shuangjia Zheng, Yaodong Yang, Hao Dong, Jie Fu:
Mixup-Augmented Meta-Learning for Sample-Efficient Fine-Tuning of Protein Simulators. CoRR abs/2308.15116 (2023) - [i65]Binghao Huang, Yuanpei Chen, Tianyu Wang, Yuzhe Qin, Yaodong Yang, Nikolay Atanasov, Xiaolong Wang:
Dynamic Handover: Throw and Catch with Bimanual Hands. CoRR abs/2309.05655 (2023) - [i64]Chengdong Ma, Ziran Yang, Minquan Gao, Hai Ci, Jun Gao, Xuehai Pan, Yaodong Yang:
Red Teaming Game: A Game-Theoretic Framework for Red Teaming Language Models. CoRR abs/2310.00322 (2023) - [i63]Zhaowei Zhang, Fengshuo Bai, Jun Gao, Yaodong Yang:
Measuring Value Understanding in Language Models through Discriminator-Critique Gap. CoRR abs/2310.00378 (2023) - [i62]Hanjing Wang, Man-Kit Sit, Congjie He, Ying Wen, Weinan Zhang, Jun Wang, Yaodong Yang, Luo Mai:
GEAR: A GPU-Centric Experience Replay System for Large Reinforcement Learning Models. CoRR abs/2310.05205 (2023) - [i61]Simin Li, Ruixiao Xu, Jun Guo, Pu Feng, Jiakai Wang, Aishan Liu, Yaodong Yang, Xianglong Liu, Weifeng Lv:
MIR2: Towards Provably Robust Multi-Agent Reinforcement Learning by Mutual Information Regularization. CoRR abs/2310.09833 (2023) - [i60]Jie Liu, Yinmin Zhang, Chuming Li, Chao Yang, Yaodong Yang, Yu Liu, Wanli Ouyang:
Masked Pretraining for Multi-Agent Decision Making. CoRR abs/2310.11846 (2023) - [i59]Jiaming Ji, Borong Zhang, Jiayi Zhou, Xuehai Pan, Weidong Huang, Ruiyang Sun, Yiran Geng, Yifan Zhong, Juntao Dai, Yaodong Yang:
Safety-Gymnasium: A Unified Safe Reinforcement Learning Benchmark. CoRR abs/2310.12567 (2023) - [i58]Josef Dai, Xuehai Pan, Ruiyang Sun, Jiaming Ji, Xinbo Xu, Mickel Liu, Yizhou Wang, Yaodong Yang:
Safe RLHF: Safe Reinforcement Learning from Human Feedback. CoRR abs/2310.12773 (2023) - [i57]Yuyang Li, Bo Liu, Yiran Geng, Puhao Li, Yaodong Yang, Yixin Zhu, Tengyu Liu, Siyuan Huang:
Grasp Multiple Objects with One Hand. CoRR abs/2310.15599 (2023) - [i56]Jiaming Ji, Tianyi Qiu, Boyuan Chen, Borong Zhang, Hantao Lou, Kaile Wang, Yawen Duan, Zhonghao He, Jiayi Zhou, Zhaowei Zhang, Fanzhi Zeng, Kwan Yee Ng, Juntao Dai, Xuehai Pan, Aidan O'Gara, Yingshan Lei, Hua Xu, Brian Tse, Jie Fu, Stephen McAleer, Yaodong Yang, Yizhou Wang, Song-Chun Zhu, Yike Guo, Wen Gao:
AI Alignment: A Comprehensive Survey. CoRR abs/2310.19852 (2023) - [i55]Zihao Wang, Shaofei Cai, Anji Liu, Yonggang Jin, Jinbing Hou, Bowei Zhang, Haowei Lin, Zhaofeng He, Zilong Zheng, Yaodong Yang, Xiaojian Ma, Yitao Liang:
JARVIS-1: Open-World Multi-task Agents with Memory-Augmented Multimodal Language Models. CoRR abs/2311.05997 (2023) - [i54]Yinmin Zhang, Jie Liu, Chuming Li, Yazhe Niu, Yaodong Yang, Yu Liu, Wanli Ouyang:
A Perspective of Q-value Estimation on Offline-to-Online Reinforcement Learning. CoRR abs/2312.07685 (2023) - 2022
- [j5]Ricky Sanjaya, Jun Wang, Yaodong Yang:
Measuring the Non-Transitivity in Chess. Algorithms 15(5): 152 (2022) - [j4]Qingduo Zeng, Qiang Zhang, Shancun Liu, Yaodong Yang:
Illiquidity Comovement and Market Crisis. J. Syst. Sci. Complex. 35(5): 1863-1874 (2022) - [j3]Le Cong Dinh, Stephen Marcus McAleer, Zheng Tian, Nicolas Perez Nieves,