"On Average Versus Discounted Reward Temporal-Difference Learning."

John N. Tsitsiklis, Benjamin Van Roy (2002)


DOI: 10.1023/A:1017980312899

access: closed

type: Journal Article

metadata version: 2020-03-02
