Zhaoran Wang 0001
Person information
- affiliation: Northwestern University, Evanston, IL, USA
Other persons with the same name
- Zhaoran Wang 0002 — Inner Mongolia University, Hohhot, China
- Zhaoran Wang 0003 — Tsinghua University, Beijing, China
- Zhaoran Wang 0004 — Shanghai University, School of Mechatronic Engineering and Automation, China
2020 – today
- 2024
- [j19]Xiao-Yang Liu, Ziyi Xia, Hongyang Yang, Jiechao Gao, Daochen Zha, Ming Zhu, Christina Dan Wang, Zhaoran Wang, Jian Guo:
Dynamic datasets and market environments for financial reinforcement learning. Mach. Learn. 113(5): 2795-2839 (2024) - [j18]Qi Cai, Zhuoran Yang, Jason D. Lee, Zhaoran Wang:
Neural Temporal Difference and Q Learning Provably Converge to Global Optima. Math. Oper. Res. 49(1): 619-651 (2024) - [j17]Zhihong Deng, Zuyue Fu, Lingxiao Wang, Zhuoran Yang, Chenjia Bai, Tianyi Zhou, Zhaoran Wang, Jing Jiang:
False Correlation Reduction for Offline Reinforcement Learning. IEEE Trans. Pattern Anal. Mach. Intell. 46(2): 1199-1211 (2024) - [j16]Chenjia Bai, Ting Xiao, Zhoufan Zhu, Lingxiao Wang, Fan Zhou, Animesh Garg, Bin He, Peng Liu, Zhaoran Wang:
Monotonic Quantile Network for Worst-Case Offline Reinforcement Learning. IEEE Trans. Neural Networks Learn. Syst. 35(7): 8954-8968 (2024) - [c129]Chau Pham, Boyi Liu, Yingxiang Yang, Zhengyu Chen, Tianyi Liu, Jianbo Yuan, Bryan A. Plummer, Zhaoran Wang, Hongxia Yang:
Let Models Speak Ciphers: Multiagent Debate through Embeddings. ICLR 2024 - [c128]Nuoya Xiong, Zhihan Liu, Zhaoran Wang, Zhuoran Yang:
Sample-Efficient Multi-Agent RL: An Optimization Perspective. ICLR 2024 - [c127]Feng Gao, Liangzhi Shi, Shenao Zhang, Zhaoran Wang, Yi Wu:
Adaptive-Gradient Policy Optimization: Enhancing Policy Learning in Non-Smooth Differentiable Simulations. ICML 2024 - [c126]Zhihan Liu, Hao Hu, Shenao Zhang, Hongyi Guo, Shuqi Ke, Boyi Liu, Zhaoran Wang:
Reason for Future, Act for Now: A Principled Architecture for Autonomous LLM Agents. ICML 2024 - [c125]Nuoya Xiong, Zhaoran Wang, Zhuoran Yang:
A General Framework for Sequential Decision-Making under Adaptivity Constraints. ICML 2024 - [c124]Sirui Zheng, Chenjia Bai, Zhuoran Yang, Zhaoran Wang:
How Does Goal Relabeling Improve Sample Efficiency? ICML 2024 - [i145]Hongyi Guo, Yuanshun Yao, Wei Shen, Jiaheng Wei, Xiaoying Zhang, Zhaoran Wang, Yang Liu:
Human-Instruction-Free LLM Self-Alignment with Limited Samples. CoRR abs/2401.06785 (2024) - [i144]Zihao Li, Boyi Liu, Zhuoran Yang, Zhaoran Wang, Mengdi Wang:
Double Duality: Variational Primal-Dual Policy Optimization for Constrained Reinforcement Learning. CoRR abs/2402.10810 (2024) - [i143]Shenao Zhang, Sirui Zheng, Shuqi Ke, Zhihan Liu, Wanxin Jin, Jianbo Yuan, Yingxiang Yang, Hongxia Yang, Zhaoran Wang:
How Can LLM Guide RL? A Value-Based Approach. CoRR abs/2402.16181 (2024) - [i142]Hongyi Guo, Zhihan Liu, Yufeng Zhang, Zhaoran Wang:
Can Large Language Models Play Games? A Case Study of A Self-Play Approach. CoRR abs/2403.05632 (2024) - [i141]Mengying Lin, Yaran Chen, Dongbin Zhao, Zhaoran Wang:
Advancing Object Goal Navigation Through LLM-enhanced Object Affinities Transfer. CoRR abs/2403.09971 (2024) - [i140]Yuchen Zhu, Yufeng Zhang, Zhaoran Wang, Zhuoran Yang, Xiaohong Chen:
A Mean-Field Analysis of Neural Gradient Descent-Ascent: Applications to Functional Conditional Moment Equations. CoRR abs/2404.12312 (2024) - [i139]Zhihan Liu, Miao Lu, Shenao Zhang, Boyi Liu, Hongyi Guo, Yingxiang Yang, Jose H. Blanchet, Zhaoran Wang:
Provably Mitigating Overoptimization in RLHF: Your SFT Loss is Implicitly an Adversarial Regularizer. CoRR abs/2405.16436 (2024) - [i138]Shenao Zhang, Donghan Yu, Hiteshi Sharma, Ziyi Yang, Shuohang Wang, Hany Hassan, Zhaoran Wang:
Self-Exploring Language Models: Active Preference Elicitation for Online Alignment. CoRR abs/2405.19332 (2024) - [i137]Rui Zheng, Hongyi Guo, Zhihan Liu, Xiaoying Zhang, Yuanshun Yao, Xiaojun Xu, Zhaoran Wang, Zhiheng Xi, Tao Gui, Qi Zhang, Xuanjing Huang, Hang Li, Yang Liu:
Toward Optimal LLM Alignments Using Two-Player Games. CoRR abs/2406.10977 (2024) - [i136]Zhixian Xie, Wenlong Zhang, Yi Ren, Zhaoran Wang, George J. Pappas, Wanxin Jin:
Safe MPC Alignment with Human Directional Feedback. CoRR abs/2407.04216 (2024)
- 2023
- [j15]Han Zhong, Zhuoran Yang, Zhaoran Wang, Michael I. Jordan:
Can Reinforcement Learning Find Stackelberg-Nash Equilibria in General-Sum Markov Games with Myopically Rational Followers? J. Mach. Learn. Res. 24: 35:1-35:52 (2023) - [j14]Zihao Li, Boyi Liu, Zhuoran Yang, Zhaoran Wang, Mengdi Wang:
Double Duality: Variational Primal-Dual Policy Optimization for Constrained Reinforcement Learning. J. Mach. Learn. Res. 24: 385:1-385:43 (2023) - [j13]Qiaomin Xie, Yudong Chen, Zhaoran Wang, Zhuoran Yang:
Learning Zero-Sum Simultaneous-Move Markov Games Using Function Approximation and Correlated Equilibrium. Math. Oper. Res. 48(1): 433-462 (2023) - [j12]Chi Jin, Zhuoran Yang, Zhaoran Wang, Michael I. Jordan:
Provably Efficient Reinforcement Learning with Linear Function Approximation. Math. Oper. Res. 48(3): 1496-1521 (2023) - [j11]Mingyi Hong, Hoi-To Wai, Zhaoran Wang, Zhuoran Yang:
A Two-Timescale Stochastic Algorithm Framework for Bilevel Optimization: Complexity Analysis and Application to Actor-Critic. SIAM J. Optim. 33(1): 147-180 (2023) - [j10]Chenjia Bai, Lingxiao Wang, Yixin Wang, Zhaoran Wang, Rui Zhao, Chenyao Bai, Peng Liu:
Addressing Hindsight Bias in Multigoal Reinforcement Learning. IEEE Trans. Cybern. 53(1): 392-405 (2023) - [j9]Chenjia Bai, Peng Liu, Kaiyu Liu, Lingxiao Wang, Yingnan Zhao, Lei Han, Zhaoran Wang:
Variational Dynamic for Self-Supervised Exploration in Deep Reinforcement Learning. IEEE Trans. Neural Networks Learn. Syst. 34(8): 4776-4790 (2023) - [c123]Ruitu Xu, Yifei Min, Tianhao Wang, Michael I. Jordan, Zhaoran Wang, Zhuoran Yang:
Finding Regularized Competitive Equilibria of Heterogeneous Agent Macroeconomic Models via Reinforcement Learning. AISTATS 2023: 375-407 - [c122]Jing Wang, Meichen Song, Feng Gao, Boyi Liu, Zhaoran Wang, Yi Wu:
Differentiable Arbitrating in Zero-sum Markov Games. AAMAS 2023: 1034-1043 - [c121]Yixuan Wang, Simon Sinong Zhan, Zhilu Wang, Chao Huang, Zhaoran Wang, Zhuoran Yang, Qi Zhu:
Joint Differentiable Optimization and Verification for Certified Reinforcement Learning. ICCPS 2023: 132-141 - [c120]Lingxiao Wang, Qi Cai, Zhuoran Yang, Zhaoran Wang:
Represent to Control Partially Observed Systems: Representation Learning with Provable Sample Efficiency. ICLR 2023 - [c119]Miao Lu, Yifei Min, Zhaoran Wang, Zhuoran Yang:
Pessimism in the Face of Confounders: Provably Efficient Offline Reinforcement Learning in Partially Observable Markov Decision Processes. ICLR 2023 - [c118]Tongzheng Ren, Chenjun Xiao, Tianjun Zhang, Na Li, Zhaoran Wang, Sujay Sanghavi, Dale Schuurmans, Bo Dai:
Latent Variable Representation for Reinforcement Learning. ICLR 2023 - [c117]Haoran Xu, Li Jiang, Jianxiong Li, Zhuoran Yang, Zhaoran Wang, Wai Kin Victor Chan, Xianyuan Zhan:
Offline RL with No OOD Actions: In-Sample Learning via Implicit Value Regularization. ICLR 2023 - [c116]Sirui Zheng, Lingxiao Wang, Shuang Qiu, Zuyue Fu, Zhuoran Yang, Csaba Szepesvári, Zhaoran Wang:
Optimistic Exploration with Learned Features Provably Solves Markov Decision Processes with Neural Dynamics. ICLR 2023 - [c115]Jiayang Li, Jing Yu, Boyi Liu, Yu Nie, Zhaoran Wang:
Achieving Hierarchy-Free Approximation for Bilevel Programs with Equilibrium Constraints. ICML 2023: 20312-20335 - [c114]Yixuan Wang, Simon Sinong Zhan, Ruochen Jiao, Zhilu Wang, Wanxin Jin, Zhuoran Yang, Zhaoran Wang, Chao Huang, Qi Zhu:
Enforcing Hard Constraints with Soft Barriers: Safe Reinforcement Learning in Unknown Stochastic Environments. ICML 2023: 36593-36604 - [c113]Shenao Zhang, Wanxin Jin, Zhaoran Wang:
Adaptive Barrier Smoothing for First-Order Policy Gradient with Contact Dynamics. ICML 2023: 41219-41243 - [c112]Yulai Zhao, Zhuoran Yang, Zhaoran Wang, Jason D. Lee:
Local Optimization Achieves Global Optimality in Multi-Agent Reinforcement Learning. ICML 2023: 42200-42226 - [c111]Dongsheng Ding, Xiaohan Wei, Zhuoran Yang, Zhaoran Wang, Mihailo R. Jovanovic:
Provably Efficient Generalized Lagrangian Policy Optimization for Safe Multi-Agent Reinforcement Learning. L4DC 2023: 315-332 - [c110]Zhihan Liu, Miao Lu, Wei Xiong, Han Zhong, Hao Hu, Shenao Zhang, Sirui Zheng, Zhuoran Yang, Zhaoran Wang:
Maximize to Explore: One Objective Function Fusing Estimation, Planning, and Exploration. NeurIPS 2023 - [c109]Shuang Qiu, Ziyu Dai, Han Zhong, Zhaoran Wang, Zhuoran Yang, Tong Zhang:
Posterior Sampling for Competitive RL: Function Approximation and Partial Observation. NeurIPS 2023 - [c108]Shenao Zhang, Boyi Liu, Zhaoran Wang, Tuo Zhao:
Model-Based Reparameterization Policy Gradient Methods: Theory and Practical Algorithms. NeurIPS 2023 - [c107]Fengzhuo Zhang, Vincent Y. F. Tan, Zhaoran Wang, Zhuoran Yang:
Learning Regularized Monotone Graphon Mean-Field Games. NeurIPS 2023 - [i135]Jiayang Li, Jing Yu, Boyi Liu, Zhaoran Wang, Yu Marco Nie:
Achieving Hierarchy-Free Approximation for Bilevel Programs With Equilibrium Constraints. CoRR abs/2302.09734 (2023) - [i134]Jing Wang, Meichen Song, Feng Gao, Boyi Liu, Zhaoran Wang, Yi Wu:
Differentiable Arbitrating in Zero-sum Markov Games. CoRR abs/2302.10058 (2023) - [i133]Ruitu Xu, Yifei Min, Tianhao Wang, Zhaoran Wang, Michael I. Jordan, Zhuoran Yang:
Finding Regularized Competitive Equilibria of Heterogeneous Agent Macroeconomic Models with Reinforcement Learning. CoRR abs/2303.04833 (2023) - [i132]Siyu Chen, Yitan Wang, Zhaoran Wang, Zhuoran Yang:
A Unified Framework of Policy Learning for Contextual Bandit with Confounding Bias and Missing Observations. CoRR abs/2303.11187 (2023) - [i131]Haoran Xu, Li Jiang, Jianxiong Li, Zhuoran Yang, Zhaoran Wang, Wai Kin Victor Chan, Xianyuan Zhan:
Offline RL with No OOD Actions: In-Sample Learning via Implicit Value Regularization. CoRR abs/2303.15810 (2023) - [i130]Jiayang Li, Zhaoran Wang, Yu Marco Nie:
Wardrop Equilibrium Can Be Boundedly Rational: A New Behavioral Theory of Route Choice. CoRR abs/2304.02500 (2023) - [i129]Xiao-Yang Liu, Ziyi Xia, Hongyang Yang, Jiechao Gao, Daochen Zha, Ming Zhu, Christina Dan Wang, Zhaoran Wang, Jian Guo:
Dynamic Datasets and Market Environments for Financial Reinforcement Learning. CoRR abs/2304.13174 (2023) - [i128]Yulai Zhao, Zhuoran Yang, Zhaoran Wang, Jason D. Lee:
Local Optimization Achieves Global Optimality in Multi-Agent Reinforcement Learning. CoRR abs/2305.04819 (2023) - [i127]Zhihan Liu, Miao Lu, Wei Xiong, Han Zhong, Hao Hu, Shenao Zhang, Sirui Zheng, Zhuoran Yang, Zhaoran Wang:
One Objective to Rule Them All: A Maximization Objective Fusing Estimation and Planning for Exploration. CoRR abs/2305.18258 (2023) - [i126]Yufeng Zhang, Fengzhuo Zhang, Zhuoran Yang, Zhaoran Wang:
What and How does In-Context Learning Learn? Bayesian Model Averaging, Parameterization, and Generalization. CoRR abs/2305.19420 (2023) - [i125]Dongsheng Ding, Xiaohan Wei, Zhuoran Yang, Zhaoran Wang, Mihailo R. Jovanovic:
Provably Efficient Generalized Lagrangian Policy Optimization for Safe Multi-Agent Reinforcement Learning. CoRR abs/2306.00212 (2023) - [i124]Nuoya Xiong, Zhaoran Wang, Zhuoran Yang:
A General Framework for Sequential Decision-Making under Adaptivity Constraints. CoRR abs/2306.14468 (2023) - [i123]Pangpang Liu, Zhuoran Yang, Zhaoran Wang, Will Wei Sun:
Contextual Dynamic Pricing with Strategic Buyers. CoRR abs/2307.04055 (2023) - [i122]Zhihan Liu, Hao Hu, Shenao Zhang, Hongyi Guo, Shuqi Ke, Boyi Liu, Zhaoran Wang:
Reason for Future, Act for Now: A Principled Framework for Autonomous LLM Agents with Provable Sample Efficiency. CoRR abs/2309.17382 (2023) - [i121]Nuoya Xiong, Zhihan Liu, Zhaoran Wang, Zhuoran Yang:
Sample-Efficient Multi-Agent RL: An Optimization Perspective. CoRR abs/2310.06243 (2023) - [i120]Chau Pham, Boyi Liu, Yingxiang Yang, Zhengyu Chen, Tianyi Liu, Jianbo Yuan, Bryan A. Plummer, Zhaoran Wang, Hongxia Yang:
Let Models Speak Ciphers: Multiagent Debate through Embeddings. CoRR abs/2310.06272 (2023) - [i119]Fengzhuo Zhang, Vincent Y. F. Tan, Zhaoran Wang, Zhuoran Yang:
Learning Regularized Monotone Graphon Mean-Field Games. CoRR abs/2310.08089 (2023) - [i118]Fengzhuo Zhang, Vincent Y. F. Tan, Zhaoran Wang, Zhuoran Yang:
Learning Regularized Graphon Mean-Field Games with Unknown Graphons. CoRR abs/2310.17531 (2023) - [i117]Shuang Qiu, Ziyu Dai, Han Zhong, Zhaoran Wang, Zhuoran Yang, Tong Zhang:
Posterior Sampling for Competitive RL: Function Approximation and Partial Observation. CoRR abs/2310.19861 (2023) - [i116]Shenao Zhang, Boyi Liu, Zhaoran Wang, Tuo Zhao:
Model-Based Reparameterization Policy Gradient Methods: Theory and Practical Algorithms. CoRR abs/2310.19927 (2023) - [i115]Saizhuo Wang, Zhihan Liu, Zhaoran Wang, Jian Guo:
A Principled Framework for Knowledge-enhanced Large Language Model. CoRR abs/2311.11135 (2023) - [i114]Jianqing Fan, Zhaoran Wang, Zhuoran Yang, Chenlu Ye:
Provably Efficient High-Dimensional Bandit Learning with Batched Feedbacks. CoRR abs/2311.13180 (2023) - [i113]Yixuan Wang, Ruochen Jiao, Chengtian Lang, Simon Sinong Zhan, Chao Huang, Zhaoran Wang, Zhuoran Yang, Qi Zhu:
Empowering Autonomous Driving with Large Language Models: A Safety Perspective. CoRR abs/2312.00812 (2023) - [i112]Quanquan Gu, Zhaoran Wang, Han Liu:
Sparse PCA with Oracle Property. CoRR abs/2312.16793 (2023)
- 2022
- [c106]Zehao Dou, Zhuoran Yang, Zhaoran Wang, Simon S. Du:
Gap-Dependent Bounds for Two-Player Markov Games. AISTATS 2022: 432-455 - [c105]Yixuan Wang, Chao Huang, Zhaoran Wang, Zhilu Wang, Qi Zhu:
Design-while-verify: correct-by-construction control learning with verification in the loop. DAC 2022: 925-930 - [c104]Chenjia Bai, Lingxiao Wang, Zhuoran Yang, Zhi-Hong Deng, Animesh Garg, Peng Liu, Zhaoran Wang:
Pessimistic Bootstrapping for Uncertainty-Driven Offline Reinforcement Learning. ICLR 2022 - [c103]Baihe Huang, Jason D. Lee, Zhaoran Wang, Zhuoran Yang:
Towards General Function Approximation in Zero-Sum Markov Games. ICLR 2022 - [c102]Qi Cai, Zhuoran Yang, Zhaoran Wang:
Reinforcement Learning from Partial Observation: Linear Function Approximation with Provable Sample Efficiency. ICML 2022: 2485-2522 - [c101]Siyu Chen, Donglin Yang, Jiayang Li, Senmiao Wang, Zhuoran Yang, Zhaoran Wang:
Adaptive Model Design for Markov Decision Process. ICML 2022: 3679-3700 - [c100]Xiaoyu Chen, Han Zhong, Zhuoran Yang, Zhaoran Wang, Liwei Wang:
Human-in-the-loop: Provably Efficient Preference-based Reinforcement Learning with General Function Approximation. ICML 2022: 3773-3793 - [c99]Hongyi Guo, Qi Cai, Yufeng Zhang, Zhuoran Yang, Zhaoran Wang:
Provably Efficient Offline Reinforcement Learning for Partially Observable Markov Decision Processes. ICML 2022: 8016-8038 - [c98]Zhihan Liu, Miao Lu, Zhaoran Wang, Michael I. Jordan, Zhuoran Yang:
Welfare Maximization in Competitive Equilibrium: Reinforcement Learning for Markov Exchange Economy. ICML 2022: 13870-13911 - [c97]Zhihan Liu, Yufeng Zhang, Zuyue Fu, Zhuoran Yang, Zhaoran Wang:
Learning from Demonstration: Provably Efficient Adversarial Policy Imitation with Linear Function Approximation. ICML 2022: 14094-14138 - [c96]Boxiang Lyu, Zhaoran Wang, Mladen Kolar, Zhuoran Yang:
Pessimism meets VCG: Learning Dynamic Mechanism Design via Offline Reinforcement Learning. ICML 2022: 14601-14638 - [c95]Shuang Qiu, Lingxiao Wang, Chenjia Bai, Zhuoran Yang, Zhaoran Wang:
Contrastive UCB: Provably Efficient Contrastive Self-Supervised Learning in Online Reinforcement Learning. ICML 2022: 18168-18210 - [c94]Han Zhong, Wei Xiong, Jiyuan Tan, Liwei Wang, Tong Zhang, Zhaoran Wang, Zhuoran Yang:
Pessimistic Minimax Value Iteration: Provably Efficient Equilibrium Learning from Offline Datasets. ICML 2022: 27117-27142 - [c93]Gene Li, Junbo Li, Anmol Kabra, Nati Srebro, Zhaoran Wang, Zhuoran Yang:
Exponential Family Model-Based Reinforcement Learning via Score Matching. NeurIPS 2022 - [c92]Boyi Liu, Jiayang Li, Zhuoran Yang, Hoi-To Wai, Mingyi Hong, Yu Nie, Zhaoran Wang:
Inducing Equilibria via Incentives: Simultaneous Design-and-Play Ensures Global Convergence. NeurIPS 2022 - [c91]Xiao-Yang Liu, Ziyi Xia, Jingyang Rui, Jiechao Gao, Hongyang Yang, Ming Zhu, Christina Dan Wang, Zhaoran Wang, Jian Guo:
FinRL-Meta: Market Environments and Benchmarks for Data-Driven Financial Reinforcement Learning. NeurIPS 2022 - [c90]Yifei Min, Tianhao Wang, Ruitu Xu, Zhaoran Wang, Michael I. Jordan, Zhuoran Yang:
Learn to Match with No Regret: Reinforcement Learning in Markov Matching Markets. NeurIPS 2022 - [c89]Tengyu Xu, Zhuoran Yang, Zhaoran Wang, Yingbin Liang:
A Unifying Framework of Off-Policy General Value Function Evaluation. NeurIPS 2022 - [c88]Rui Yang, Chenjia Bai, Xiaoteng Ma, Zhaoran Wang, Chongjie Zhang, Lei Han:
RORL: Robust Offline Reinforcement Learning via Conservative Smoothing. NeurIPS 2022 - [c87]Fengzhuo Zhang, Boyi Liu, Kaixin Wang, Vincent Y. F. Tan, Zhuoran Yang, Zhaoran Wang:
Relational Reasoning via Set Transformers: Provable Efficiency and Applications to MARL. NeurIPS 2022 - [c86]Shichao Xu, Yangyang Fu, Yixuan Wang, Zhuoran Yang, Zheng O'Neill, Zhaoran Wang, Qi Zhu:
Accelerate online reinforcement learning for building HVAC control with heterogeneous expert guidances. BuildSys@SenSys 2022: 89-98 - [c85]Jibang Wu, Zixuan Zhang, Zhe Feng, Zhaoran Wang, Zhuoran Yang, Michael I. Jordan, Haifeng Xu:
Sequential Information Design: Markov Persuasion Process and Its Efficient Reinforcement Learning. EC 2022: 471-472 - [i111]Yixuan Wang, Chao Huang, Zhaoran Wang, Zhuoran Yang, Qi Zhu:
Joint Differentiable Optimization and Verification for Certified Reinforcement Learning. CoRR abs/2201.12243 (2022) - [i110]Han Zhong, Wei Xiong, Jiyuan Tan, Liwei Wang, Tong Zhang, Zhaoran Wang, Zhuoran Yang:
Pessimistic Minimax Value Iteration: Provably Efficient Equilibrium Learning from Offline Datasets. CoRR abs/2202.07511 (2022) - [i109]Jibang Wu, Zixuan Zhang, Zhe Feng, Zhaoran Wang, Zhuoran Yang, Michael I. Jordan, Haifeng Xu:
Sequential Information Design: Markov Persuasion Process and Its Efficient Reinforcement Learning. CoRR abs/2202.10678 (2022) - [i108]Chenjia Bai, Lingxiao Wang, Zhuoran Yang, Zhihong Deng, Animesh Garg, Peng Liu, Zhaoran Wang:
Pessimistic Bootstrapping for Uncertainty-Driven Offline Reinforcement Learning. CoRR abs/2202.11566 (2022) - [i107]Boxiang Lyu, Qinglin Meng, Shuang Qiu, Zhaoran Wang, Zhuoran Yang, Michael I. Jordan:
Learning Dynamic Mechanisms in Unknown Environments: A Reinforcement Learning Approach. CoRR abs/2202.12797 (2022) - [i106]Yifei Min, Tianhao Wang, Ruitu Xu, Zhaoran Wang, Michael I. Jordan, Zhuoran Yang:
Learn to Match with No Regret: Reinforcement Learning in Markov Matching Markets. CoRR abs/2203.03684 (2022) - [i105]Qi Cai, Zhuoran Yang, Zhaoran Wang:
Sample-Efficient Reinforcement Learning for POMDPs with Linear Function Approximations. CoRR abs/2204.09787 (2022) - [i104]Boxiang Lyu, Zhaoran Wang, Mladen Kolar, Zhuoran Yang:
Pessimism meets VCG: Learning Dynamic Mechanism Design via Offline Reinforcement Learning. CoRR abs/2205.02450 (2022) - [i103]Xiaoyu Chen, Han Zhong, Zhuoran Yang, Zhaoran Wang, Liwei Wang:
Human-in-the-loop: Provably Efficient Preference-based Reinforcement Learning with General Function Approximation. CoRR abs/2205.11140 (2022) - [i102]Lingxiao Wang, Qi Cai, Zhuoran Yang, Zhaoran Wang:
Embed to Control Partially Observed Systems: Representation Learning with Provable Sample Efficiency. CoRR abs/2205.13476 (2022) - [i101]Miao Lu, Yifei Min, Zhaoran Wang, Zhuoran Yang:
Pessimism in the Face of Confounders: Provably Efficient Offline Reinforcement Learning in Partially Observable Markov Decision Processes. CoRR abs/2205.13589 (2022) - [i100]Rui Yang, Chenjia Bai, Xiaoteng Ma, Zhaoran Wang, Chongjie Zhang, Lei Han:
RORL: Robust Offline Reinforcement Learning via Conservative Smoothing. CoRR abs/2206.02829 (2022) - [i99]Doudou Zhou, Yufeng Zhang, Aaron Sonabend W., Zhaoran Wang, Junwei Lu, Tianxi Cai:
Federated Offline Reinforcement Learning. CoRR abs/2206.05581 (2022) - [i98]Shuang Qiu, Xiaohan Wei, Jieping Ye, Zhaoran Wang, Zhuoran Yang:
Provably Efficient Fictitious Play Policy Optimization for Zero-Sum Markov Games with Structured Transitions. CoRR abs/2207.12463 (2022) - [i97]Shuang Qiu, Lingxiao Wang, Chenjia Bai, Zhuoran Yang, Zhaoran Wang:
Contrastive UCB: Provably Efficient Contrastive Self-Supervised Learning in Online Reinforcement Learning. CoRR abs/2207.14800 (2022) - [i96]Jiayang Li, Jing Yu, Qianni Wang, Boyi Liu, Zhaoran Wang, Yu Marco Nie:
Differentiable Bilevel Programming for Stackelberg Congestion Games. CoRR abs/2209.07618 (2022) - [i95]Zuyue Fu, Zhengling Qi, Zhaoran Wang, Zhuoran Yang, Yanxun Xu, Michael R. Kosorok:
Offline Reinforcement Learning with Instrumental Variables in Confounded Markov Decision Processes. CoRR abs/2209.08666 (2022) - [i94]Fengzhuo Zhang, Boyi Liu, Kaixin Wang, Vincent Y. F. Tan, Zhuoran Yang, Zhaoran Wang:
Relational Reasoning via Set Transformers: Provable Efficiency and Applications to MARL. CoRR abs/2209.09845 (2022) - [i93]Yixuan Wang, Simon Sinong Zhan, Ruochen Jiao, Zhilu Wang, Wanxin Jin, Zhuoran Yang, Zhaoran Wang, Chao Huang, Qi Zhu:
Enforcing Hard Constraints with Soft Barriers: Safe Reinforcement Learning in Unknown Stochastic Environments. CoRR abs/2209.15090 (2022) - [i92]Rui Ai, Boxiang Lyu, Zhaoran Wang, Zhuoran Yang, Michael I. Jordan:
A Reinforcement Learning Approach in Multi-Phase Second-Price Auction Design. CoRR abs/2210.10278 (2022) - [i91]Han Zhong, Wei Xiong, Sirui Zheng, Liwei Wang, Zhaoran Wang, Zhuoran Yang, Tong Zhang:
GEC: A Unified Framework for Interactive Decision Making in MDP, POMDP, and Beyond. CoRR abs/2211.01962 (2022) - [i90]