"Off-Policy Evaluation and Learning from Logged Bandit Feedback: Error ..."
Yuan Xie et al. (2019)
- Yuan Xie, Boyi Liu, Qiang Liu, Zhaoran Wang, Yuan Zhou, Jian Peng: Off-Policy Evaluation and Learning from Logged Bandit Feedback: Error Reduction via Surrogate Policy. ICLR (Poster) 2019