Search results: 1-7. Found 7 records related to "Management regret". Query time: 0.125 seconds.
Regret Bounds for Reinforcement Learning with Policy Advice
Regret Bounds Reinforcement Learning Policy Advice
2013/6/13
In some reinforcement learning problems an agent may be provided with a set of input policies, perhaps learned from prior experience or provided by advisors. We present a reinforcement learning with p...
Further Optimal Regret Bounds for Thompson Sampling
Further Optimal Regret Bounds Thompson Sampling
2012/11/23
Thompson Sampling is one of the oldest heuristics for multi-armed bandit problems. It is a randomized algorithm based on Bayesian ideas, and has recently generated significant interest after several s...
We consider the restless Markov bandit problem, in which the state of each arm evolves according to a Markov process independently of the learner's actions. We suggest an algorithm that after $T$ step...
Robust approachability and regret minimization in games with partial monitoring
Robust approachability regret games partial monitoring
2011/6/20
Approachability has become a standard tool in analyzing learning algorithms in the adversarial online learning setup. We develop a variant of approachability for games where there is ambiguity in th...
No-Regret Reductions for Imitation Learning and Structured Prediction
No-Regret Reductions for Imitation Learning Structured Prediction
2010/11/9
Sequential prediction problems such as imitation learning, where future observations depend on previous predictions (actions), violate the common i.i.d. assumptions made in statistical learning.
Some Bayesian Credibility Premiums Obtained by Using Posterior Regret Gamma-Minimax Methodology
Classes of distributions Credibility Minimax Premium Robustness
2009/9/24
In this paper, following the robust Bayesian paradigm, a procedure based on the posterior regret-minimax principle is applied to derive, in a straightforward way, new credibility formulas, making use of ...
A Stochastic View of Optimal Regret through Minimax Duality
Stochastic View Optimal Regret Minimax Duality
2010/3/19
We study the regret of optimal strategies for online convex optimization games. Using von Neumann's minimax theorem, we show that the optimal regret in this adversarial setting is closely related to t...