Fanglue Subject Navigation (方略学科导航)

Search results: 1-7 of 7 records found for "管理学 regret" (Management Science, regret). Query time: 0.125 seconds

Regret Bounds for Reinforcement Learning with Policy Advice Regret Bounds Reinforcement Learning Policy Advice 2013/6/13

In some reinforcement learning problems an agent may be provided with a set of input policies, perhaps learned from prior experience or provided by advisors. We present a reinforcement learning with p...

Further Optimal Regret Bounds for Thompson Sampling Further Optimal Regret Bounds Thompson Sampling 2012/11/23

Thompson Sampling is one of the oldest heuristics for multi-armed bandit problems. It is a randomized algorithm based on Bayesian ideas, and has recently generated significant interest after several s...

Regret Bounds for Restless Markov Bandits Regret Bounds Restless Markov Bandits 2012/11/23

We consider the restless Markov bandit problem, in which the state of each arm evolves according to a Markov process independently of the learner's actions. We suggest an algorithm that after $T$ step...

Robust approachability and regret minimization in games with partial monitoring Robust approachability regret games partial monitoring 2011/6/20

Approachability has become a standard tool in analyzing learning algorithms in the adversarial online learning setup. We develop a variant of approachability for games where there is ambiguity in th...

No-Regret Reductions for Imitation Learning and Structured Prediction No-Regret Reductions for Imitation Learning Structured Prediction 2010/11/9

Sequential prediction problems such as imitation learning, where future observations depend on previous predictions (actions), violate the common i.i.d. assumptions made in statistical learning.

Some Bayesian Credibility Premiums Obtained by Using Posterior Regret Gamma-Minimax Methodology Classes of distributions Credibility Minimax Premium Robustness 2009/9/24

In this paper, following the robust Bayesian paradigm, a procedure based on the posterior regret-minimax principle is applied to derive, in a straightforward way, new credibility formula, making use of ...

A Stochastic View of Optimal Regret through Minimax Duality Stochastic View Optimal Regret Minimax Duality 2010/3/19

We study the regret of optimal strategies for online convex optimization games. Using von Neumann's minimax theorem, we show that the optimal regret in this adversarial setting is closely related to t...
