Markov Theory based Planning and Sensing under Uncertainty (基于马尔科夫理论的不确定性规划和感知问题研究)


University of Science and Technology of China
A dissertation for doctor's degree

Markov Theory based Planning and Sensing under Uncertainty

Author: Aijun Bai
Speciality: Computer Science
Supervisor: Prof. Xiaoping Chen
Finished Time: September, 2014



ABSTRACT

In the research of Artificial Intelligence, the agent-based paradigm aims to provide a unifying framework for conceptualizing, designing, and implementing intelligent systems that sense, act and learn autonomously in dynamic and/or stochastic environments, in order to solve a growing number of complex problems. Agents, particularly various kinds of robots, are playing increasingly important roles in the world economy and in people's everyday life, from satellites to smartphones. Generally speaking, perceptual inputs from sensors contain inevitable noise and errors. The effects of actuators are likewise unpredictable, subject to noise or even failures. There may also exist different levels of hidden information that cannot be observed directly. Such uncertainties pose huge challenges to the problem of agent planning and sensing. Markov decision processes (MDPs) and partially observable Markov decision processes (POMDPs) provide an important theoretical and algorithmic basis for optimal planning and sensing under uncertainty. However, solving large MDPs and POMDPs exactly is usually intractable due to the curse of dimensionality: the state space grows exponentially with the number of state variables. To address this challenge in practice, researchers usually resort to approximation techniques such as online planning, hierarchical planning, Monte-Carlo simulation, and particle filtering. Following the theories of MDPs and POMDPs, this thesis focuses on developing efficient approximate algorithms for large MDPs and POMDPs. Specifically, we propose MAXQ-OP, a MAXQ hierarchical decomposition based online planning algorithm; we develop the DNG-MCTS and D²NG-POMCP algorithms, which apply the idea of Thompson sampling to Monte-Carlo planning in MDPs and POMDPs; and we develop a particle filtering over sets (PFS) approach to the multi-human tracking problem.

The proposed hierarchical online planning algorithm, namely MAXQ-OP, is a novel algorithm that combines the advantages of both online planning and hierarchical planning. It provides a more sophisticated solution for programming autonomous agents in large stochastic domains. Specifically, we perform online decision-making by following the MAXQ value function decomposition. We empirically evaluate our algorithm on the Taxi problem, a common benchmark for MDPs. The experimental results show that MAXQ-OP is able to find a near-optimal policy online, with far less computation time compared to traditional online planning algorithms. The RoboCup soccer simulation 2D domain is a very large test-bed for artificial intelligence research. The key challenge lies in the fact that it is a fully distributed, multi-agent stochastic system with continuous state, action and observation spaces.

We have conducted a long-term case study in the RoboCup 2D domain and developed a team named WrightEagle, which has won multiple world championships and national championships in the annual RoboCup competitions. The results of our case study confirm MAXQ-OP's important potential of scaling up to very large domains.

Monte-Carlo tree search (MCTS) has been drawing great interest recently in domains of planning and learning under uncertainty. One of the key challenges is the trade-off between exploration and exploitation. We develop novel approaches to MCTS, namely DNG-MCTS and D²NG-POMCP, which use posterior action sampling to select actions for online planning in MDPs and POMDPs. Specifically, we treat the cumulative reward obtained by taking an action in a search node of the Monte-Carlo search tree and following a tree policy thereafter as a random variable following an unknown distribution. We parametrize the distribution by introducing necessary hidden parameters, and infer the posterior in a Bayesian setting. Thompson sampling is then used to exploit and explore the search tree, by selecting an action according to its posterior probability of being optimal. Experimental results confirm that the proposed algorithms outperform state-of-the-art approaches on several benchmark problems, showing the potential of successfully applying them to very large real-world problems.

The ability of an autonomous robot to detect, track and identify potentially multiple humans is essential for socialized human-robot interaction in dynamic environments. The online multi-object tracking problem is equivalent to real-time belief update in a complex POMDP. The main challenge is that, without knowing the actual number of humans, the robot needs to estimate each human's state in real time from sequentially ambiguous observations, including inevitable false and missing detections, while both the robot and the humans are constantly moving. In this thesis, we propose a novel particle filtering over sets (PFS) approach to address this challenge. We define joint states and observations both as finite sets, and develop motion and observation functions accordingly. The target identification problem is then solved by the expectation-maximization (EM) method, given the updated particles. The set formulation enables us to avoid performing explicit observation-to-target association, leading to high fault tolerance and robustness in complex dynamic environments with frequent detection noise and errors. The overall PFS algorithm outperforms the state of the art, in terms of the CLEAR MOT metrics, on the PETS2009 dataset. We also demonstrate the effectiveness of PFS on a real robot, namely CoBot.

Keywords: Markov Decision Process, Partially Observable Markov Decision Process, Decision-Theoretic Planning, Hierarchical Online Planning, Monte-Carlo Tree Search, Multi-Object Tracking


23 Artificial Intelligence Intelligence WrightEagle KeJia Agent Markov Property 1.1 / [1] Intelligent Robots Web Crawlers Siri Mars Rover Information State Sufficient Statistics State Belief State 1.1 [2] Observation State Estimation / Action 1

24 / Hidden Information [3] Belief Update Information Fusion Sensor Fusion [4] [5] Odometry Bayesian Method 2

25 Data Association [6] Multi-Object Tracking MOT Simultaneous Localization and Mapping, SLAM Landmark Decision Making Sequential Decision-Making Problem Planning [1] Scheduling [7 9] [10] Control Policy Utility Function Classical Planning Shortest Path Problem Action Cost Depth-First Search Breadth-First Search Backward Chaining Heuristic Search 3

26 (a) (b) 1.2 Wikipedia [11] Reinforcement Learning [12] Model Based Reinforcement Learning Bayesian Reinforcement Learning BRL [13 18] Exploitation Exploration [19 22] Markov Decision Process, MDP [23] Partially Observable Markov Decision Process, POMDP [24] Markov Decision Theory 4

27 1.2 Stochastic Process [25] S X S {X t t T} T = {0, 1, 2,... } Time Index Differential Equation X 0 Markov Process Markov Chain Andrey Markov 1906 [26] Pr(X t+1 X 0, X 1,..., X t ) = Pr(X t+1 X t ). (1.1) 1.2(a) A 40% E 60% A E 70% A 30% E Partially Observable Hidden Markov Model, HMM [27, 28] (b) X y a b Belief Space MDP MDP MDP POMDP POMDP MDP 1.1 MDP MDP s a 5
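To make the Markov property of Eq. (1.1) concrete, here is a minimal simulation sketch of a two-state weather chain in the spirit of the example in Figure 1.2(a); since parts of that example are garbled in this copy, the state names and the exact transition probabilities below are illustrative assumptions rather than values taken from the thesis.

```python
import random

# Two-state Markov chain with states "A" and "E".  The next state depends
# only on the current state, never on the earlier history -- Eq. (1.1).
TRANSITIONS = {
    "A": {"A": 0.4, "E": 0.6},   # from A: stay with prob. 0.4, switch with 0.6
    "E": {"A": 0.7, "E": 0.3},   # from E: switch with prob. 0.7, stay with 0.3
}

def step(state):
    """Sample the next state from Pr(X_{t+1} | X_t)."""
    r, acc = random.random(), 0.0
    for nxt, p in TRANSITIONS[state].items():
        acc += p
        if r < acc:
            return nxt
    return nxt  # numerical fallback

def simulate(x0, horizon):
    """Return a trajectory X_0, X_1, ..., X_horizon."""
    traj = [x0]
    for _ in range(horizon):
        traj.append(step(traj[-1]))
    return traj

print(simulate("A", 10))
```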

28 1.1 Markov Chain HMM MDP POMDP 1.3 Wikipedia s Immediate Reward R(s, a, s ) Pr(s s, a) MDP s a s Planning Horizon Cumulative Reward MDP Optimization Problem [29] MDP Dynamic Programming Monte-Carlo [30] MDP P [31] MDP 1.3 MDP S a +5 1 POMDP MDP 6

29 POMDP System Dynamics MDP MDP POMDP MDP MDP Belief MDP MDP MDP MDP MDP POMDP POMDP POMDP PSPACE POMDP POMDP POMDP 100% POMDP / MDP POMDP Multi-Agent Systems [32, 33] Joint Action MDP POMDP 7

30 1.4 Nested Belief A B B A B A B A B POMDP I-POMDP [34] I-POMDP Decentralized MDP, DEC-MDP Decentralized POMDP, DEC-POMDP [35] DEC-POMDP Policy Space Decision Tree [36] a o DEC-POMDP 8

31 DEC-POMDP NEXP DEC-POMDP 1.3 MDP POMDP MAXQ [37] MDP MAXQ-OP [38 41] Thompson [42] MDP POMDP DNG-MCTS D²NG-POMCP [43, 44] PFS Particle Filtering over Sets PFS [45] RoboCup 2D [46] * MDP POMDP MDP MAXQ-OP MAXQ-OP MAXQ MAXQ-OP Monte- Carlo Tree Search, MCTS DNG-MCTS D²NG-POMCP Dirichlet-NormalGamma MDP DNG- MCTS MDP UCT Thompson DNG-MCTS POMDP * RoboCup 2D Wiki 9

32 Dirichlet-Dirichlet-NormalGamma POMDP D²NG-POMCP POMDP POMCP POMDP Thompson PFS Expectation-Maximization, EM Observation Likelihood PETS2009 [47] CoBot PFS 10

33 MDP POMDP MDP POMDP QMDP MDP 2.1 MDP MDP ( ). S, A, T, R, S State Space A Action Space T : S A S [0, 1] Transition Function T(s s, a) = Pr(s s, a) s a s R : S A R Reward Function R(s, a) s a MDP S A S A Flat Representation Factored Representation [48] s 1, s 2, s 3,... Tabular Method s a s T(s s, a) S A S s = [x 1, x 2,..., x n ] n 11

34 2.1 DBN x i (1 i n) [x, y, ẋ, ẏ] (x, y) (ẋ, ẏ) s a s T(s s, a) Dynamic Bayesian Network, DBN [49] DBN DBN s s u i s x i T(s s, a) = Pr(x i u i, a). (2.1) 1 i n 2.1 DBN (x, y) (v_x, v_y) π π : S A [0, 1] π(s, a) s a π π s π : S A {0, 1} π : S A π(s) s 12

35 π s 0 Follow π H π s 0 U(π s 0 ) H [ ] U(π s 0 ) = E R(s t, π(s t )), (2.2) 0 t<h R(s t, π(s t )) t γ (0, 1] U(π s 0 ) [ ] U(π s 0 ) = E γ t R(s t, π(s t )). (2.3) t γ γ = 1 γ 1 R max /(1 γ) R max MDP Optimal Policy [50, 51] π π = argmax U(π s 0 ). (2.4) π π s π s t π t t H t s H t π = {π H, π H 1,..., π 1 } π = {π H, π H 1,..., π 1 } t s π Value Function V π t (s) V π 1 (s) = R(s, π 1(s)) 13

36 t s a π π Q π t (s, a) Q π t (s, a) = R(s, a) + γ T(s s, π t (s))vt 1(s π ). (2.5) s S s a π R(s, a) t 1 T(s s, π t (s)) s Vt π (s) = Q π t (s, π t (s)). (2.6) V π = { VH π, V H 1 π,..., } Vπ 1 VH π π s 0 U(π s 0 ) = VH(s π 0 ). (2.7) π s V π (s) s π s a π Q π (s, a) Q π (s, a) = R(s, a) + γ T(s s, π t (s))v π (s ). (2.8) s S V π (s) V π (s) = Q π (s, π(s)). (2.9) 2.9 V π (s) V π π s 0 U(π s 0 ) = V π (s 0 ). (2.10) V Q π Bellman Optimality Equation [52] V Q : 14 V t (s) = max a A Q t(s, a), (2.11)

37 V t (s) = max a A { V (s) = max a A Q (s, a). (2.12) R(s, a) + γ s S T(s s, a)v t 1(s ) }, (2.13) V (s) = max a A { R(s, a) + γ s S T(s s, a)v (s ) }. (2.14) V V π π t(s) = argmax Vt (s), (2.15) a A π (s) = argmax V (s). (2.16) a A MDP MDP Near-Optimal Policy π s V (s) V π (s) ϵ π ϵ- V π 1 Vπ 2 V π 3 V π H H V H V MDP Multi- Armed Bandit MAB MDP MDP s R(s, a) := X a X a f Xa (x) t a X at MAB MAB Cumulative Regret CR [ T ] R T = E (X a X at ), (2.17) t=1 15

38 a Simple Regret SR r n = E [X a Xā], (2.18) ā = argmax a A X a MDP c(s, a) R(s, a) min max MDP MDP max min MDP MDP 2.2 MDP Stochastic Optimal Control Dynamic Programming The Curse of Dimensionality MDP Offline Online Policy Iteration Value Iteration Policy Evaluation Policy Improvement π V π v 2.9 v v V π = v π V π 16

39 Input: An MDP S, A, T, R, and a small positive number ϵ Output: A near-optimal policy π 1 Initialize π(s) A arbitrarily for all s S 2 repeat 3 repeat foreach s S do 6 V (s) R(s, π(s)) + λ s S T(s s, a)v(s ) 7 max {, V(s) V (s) } 8 V(s) V (s) 9 end 10 until < ϵ 11 converged T rue 12 foreach s S do 13 π (s) argmax a A { R(s, a) + λ s S T(s s, a)v(s ) } 14 if π (s) π(s) then 15 converged False 16 end 17 π(s) π (s) 18 end 19 until converged = T rue 20 return π 2.1: MDP π Greedy Algorithm π (s) = argmax a A Q π (s, a) Q π π V π (s) V π (s) s π π 0 PE V π 0 PI π 1 PE V π 1 PI π 2 PE... PI π PE V, (2.19) PE PI MDP S A [53] 2.1 MDP ϵ

40 Input: An MDP S, A, T, R, and a small positive number ϵ Output: A near-optimal policy π 1 Let V(s) 0 for all s S 2 repeat foreach s S do 5 foreach a A do 6 Q(s, a) R(s, a) + λ s S T(s s, a)v(s ) 7 end 8 π(s) argmax a A Q(s, a) 9 max {, V(s) Q(s, π(s)) } 10 V(s) Q(s, π(s)) 11 end 12 until < ϵ 13 return π 2.2: MDP Backup Operation { } V t+1 (s) = max a A R(s, a) + λ s S T(s s, a)v t (s ), (2.20) V t t 2.20 Asynchronous Dynamic Programming V t V t MDP 2.2 2ϵ γ 1 γ - [24] Sweep
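Algorithm 2.2 above is given in pseudocode; the following is one possible concrete rendering of value iteration for a small tabular MDP (the dictionary-based model representation and the function name are ours, not part of the thesis).

```python
def value_iteration(S, A, T, R, gamma=0.95, eps=1e-6):
    """Tabular value iteration.

    S: list of states; A: list of actions.
    T[(s, a)]: dict mapping next state s2 to Pr(s2 | s, a).
    R[(s, a)]: immediate reward of taking a in s.
    Returns (V, pi) with V the value function and pi a greedy policy.
    """
    V = {s: 0.0 for s in S}
    while True:
        delta = 0.0
        for s in S:
            # Bellman backup, Eq. (2.20):
            # V(s) <- max_a [ R(s,a) + gamma * sum_s2 T(s2|s,a) V(s2) ]
            q = {a: R[(s, a)] + gamma * sum(p * V[s2] for s2, p in T[(s, a)].items())
                 for a in A}
            best = max(q.values())
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < eps:
            break
    # Extract a greedy (near-optimal) policy from the converged values.
    pi = {s: max(A, key=lambda a: R[(s, a)] + gamma *
                 sum(p * V[s2] for s2, p in T[(s, a)].items()))
          for s in S}
    return V, pi
```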

41 (a) (b) MAXQ 2.2 MAXQ AND/OR Tree Search [54] Real-Time Dynamic Programming, RTDP [55] Monte-Carlo Tree Search [56] AND/OR Tree [54] 2.2(a) s 0 s 0 a 1 a 2 Pr(s 1 s 0, a 1 ) = p Pr(s 2 s 0, a 1 ) = 1 p Pr(s 3 s 0, a 2 ) = q Pr(s 4 s 0, a 2 ) = 1 q AO* Best-First Search G 1. G G 2. G G G AO* 2.3 AO* [57] RTDP Trial-based Search MDP 19

42 Input: An MDP S, A, T, R, graph G initially empty, heuristic function h, current state s 0, and planing horizon H Output: An action a 1 Let G G (s 0, H) 2 Let V(s 0, H) h(s 0, H) 3 Initialize best partial graph to G 4 while True do 5 Let (s, d) non-terminal tip node in best partial graph 6 if (s, d) is null then 7 break 8 end 9 foreach a A do 10 Add node (a, s, d) as child of s, d 11 foreach s S do 12 if T(s s, a) > 0 then 13 Add node s, d 1 as child of (a, s, d) 14 if d 1 = 0 then 15 V(s, d 1) 0 16 end 17 else 18 V(s, d 1) h(s, d 1) 19 end 20 end 21 end 22 end 23 foreach s G in a bottom-up way do 24 Q(s, a, d) R(s, a) + γ s S T(s s, a)v(s, d 1) 25 V(s, d) max a A Q(s, a, d) 26 if V(s, d) = Q(s, a, d) then 27 Mark state s and action a 28 end 29 end 30 Recompute best partial graph to G by following marked actions 31 end 32 return marked action for state s 0 in best partial graph to G 2.3: MDP AO* RTDP AO* RTDP RTDP RTDP 2.4 MDP RTDP [58, 59] 20

43 Input: An MDP S, A, T, R, heuristic function h, current state s 0, and planing horizon H Output: An action a 1 foreach s S do 2 Initialize V(s) h(s) 3 end 4 repeat 5 Let s s 0 6 Let d 0 7 while True do 8 d d foreach a A do 10 Q(s, a) R(s, a) + γ s S T(s s, a)v(s ) 11 end 12 Let a argmax a A Q(s, a) 13 Update V(s) Q(s, a ) 14 Sample s T(s s, a ) 15 if s is goal state or d > H then 16 break 17 end 18 Let s s 19 end 20 until resource budgets reached 21 return a argmax a A Q(s 0, a) 2.4: MDP RTDP MCTS [60] MCTS MDP MDP Generative Model / Simulator MCTS MCTS Rollout 2.3 MCTS [61] MCTS MCTS 2.4 [62] MCTS Anytime MCTS MCTS MCTS 21

44 UCT MCTS [56] UCT MAB UCB [63] UCB(s, a) = Q(s, a) + c log N(s) N(s, a), (2.21) Q(s, a) a s N(s, a) a s N(s) = a A N(s, a) s c c UCT MDP UCT [57] 22
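The UCB1 rule of Eq. (2.21), with exploration bonus c * sqrt(ln N(s) / N(s, a)), is the action-selection step inside the UCT tree policy that follows; a minimal sketch of that step is shown below (the function name and data layout are ours, and untried actions receive an infinite bonus, as in the pseudocode).

```python
import math

def ucb1_select(Q, N_s, N_sa, actions, c=1.0):
    """Return argmax_a Q(s,a) + c * sqrt(ln N(s) / N(s,a)); untried actions first."""
    best_a, best_v = None, float("-inf")
    for a in actions:
        if N_sa[a] == 0:
            return a                                   # infinite exploration bonus
        v = Q[a] + c * math.sqrt(math.log(N_s) / N_sa[a])
        if v > best_v:
            best_a, best_v = a, v
    return best_a
```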

45 Input: An MDP simulator sim, current state s 0, search graph G initially empty, rollout policy π, planning horizon H, exploration constant C Output: An action a 1 UCT(s : state, d : depth, sim : simulator, G : graph, π : policy, H : horizon, C : constant) 2 if d H or s is terminal then 3 return 0 4 end 5 if node (s, d) / G then 6 Add node (s, d) to graph G 7 Initialize N(s, d) 0 and N(s, a, d) 0 for all a A 8 Initialize Q(s, a, d) 0 for all a A 9 Play rollout policy π from s for H d steps according to simulator sim 10 Let r the sampled cumulative discounted reward 11 return r 12 end 13 else 14 foreach a A do 15 if N(s, a, d) > 0 then 16 Let Bonus(a) C log N(s, d)/n(s, a, d) 17 end 18 else 19 Let Bonus(a) 20 end 21 end 22 Select a argmax a A {Q(s, a, d) + Bonus(a)} 23 Sample s T(s s, a) according to simulator sim 24 Let nv R(s, a) + γuct(s, d + 1, sim, G, π, H, C) 25 Increment N(s, d) and N(s, a, d) 26 Update Q(s, a, d) Q(s, a, d) + (nv Q(s, a, d))/n(s, a, d) 27 return nv 28 end 29 repeat 30 UCT(s 0, 0, sim, G, π, H, C) 31 until resource budgets reached 32 return a argmax a A Q(s 0, a, 0) 2.5: MDP UCT 2.3 MDP [64] Option [65] Hierarchies of Abstract Machines [66] MAXQ MAXQ Hierarchical Decomposition [37] Option Macro Action Option 23

46 Option o I o S π o β o : S [0, 1] β o (s) s Option o Option MDP MAXQ MDP MDP MDP MDP Semi-Markov Decision Process, SMDP SMDP MAXQ SMDP MDP MDP MDP SMDP τ N + s a SMDP T(s, τ s, a) s a τ s R(s, a) s a SMDP { V (s) = max a A R(s, a) + s S,τ N + γ τ T(s, τ s, a)v (s ) }. (2.22) SMDP SMDP MAXQ MAXQ MDP M MDP {M 0, M 1,, M n } MDP M 0 M 0 MDP M M i T i, A i, R i T i M i Active States S i Terminal States G i 24

47 A i M i R i M i s S i g G i MAXQ Task Graph 2.2(b) MAXQ Root Task M 0 M 1 M 2 M 3 M 1 M 2 M 3 M i 4 i 8 MAXQ Hierarchical Policy π π = {π 0, π 1,, π n } π i : S i A i M i Projected Value Function V π (i, s) s π = {π 0, π 1,, π n } M i g G i Q π (i, s, a) M a π = {π 0, π 1,, π n } M i M a V π (a, s) = R(s, a) π Q π (i, s, a) = V π (a, s) + C π (i, s, a), (2.23) V π (i, s) = { R(s, i), Mi Q π (i, s, π(s)), (2.24) C π (i, s, a) Completion Function M i M a M a M i π C π (i, s, a) = γ N Pr(s, N s, a)v π (i, s ), (2.25) s,n Pr(s, N s, a) s M a N s Recursively Optimal Policy π Q (i, s, a) = V (a, s) + C (i, s, a), (2.26) V (i, s) = { R(s, i), Mi max a Ai Q (i, s, a), (2.27) 25

48 C (i, s, a) = C π (i, s, a) π M i π i(s) = argmax Q (i, s, a). (2.28) a A i 2.4 MDP POMDP MDP [67] ( ). S, A, O, T, Ω, R, S, A, T, R O Observation Space Ω : S A O [0, 1] Observation Function Ω(o s, a) a s o POMDP MDP b b(s) s b(s) = Pr(s b) b 0 h = (a 0, o 1, a 1, o 2,... a t 1, o t ) b (s ) = ηω(o s, a) s S T(s s, a)b(s), (2.29) η = 1/P(o b, a) η = 1 s S Ω(o s, a) s S T(s s, a)b(s). (2.30) 2.29 b a o b = ζ(b, a, o) ζ Bayesian Filter B POMDP π π : B A MDP POMDP π POMDP MDP MDP Bayesian-Adaptive 26

49 2.5 POMDP MDP, BAMDP B, A, T +, r B BAMDP A r(b, a) = s S b(s)r(s, a) T + T + (b b, a) = o O 1[b = ζ(b, a, o)]ω(o b, a), (2.31) 1 Indicator Function BAMDP { } V (b) = max a A r(b, a) + γ o O Ω(o b, a)v (ζ(b, a, o)) V π { π (b) = argmax a A r(b, a) + γ o O Ω(o b, a)v (ζ(b, a, o)) }. (2.32). (2.33) BAMDP POMDP BAMDP BAMDP MDP BAMDP 2.5 MDP POMDP POMDP 27
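The exact belief update b' = ζ(b, a, o) of Eq. (2.29), with normalizer η = 1 / Pr(o | b, a) as in Eq. (2.30), can be implemented directly for finite state spaces; the sketch below is such an illustration, using a dictionary-based model representation of our own choosing.

```python
def belief_update(b, a, o, S, T, Omega):
    """Return b'(s2) = eta * Omega(o | s2, a) * sum_s T(s2 | s, a) * b(s)."""
    b_next = {}
    for s2 in S:
        b_next[s2] = Omega[(o, s2, a)] * sum(T[(s2, s, a)] * b[s] for s in S)
    eta = sum(b_next.values())          # eta = Pr(o | b, a), Eq. (2.30)
    if eta == 0.0:
        raise ValueError("observation has zero probability under this belief")
    return {s2: p / eta for s2, p in b_next.items()}
```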

50 POMDP Piecewise-Linear and Convex [68] t V t S Γ t = {α 0, α 1,..., α m } α- a A b α- V t (b) = max α Γ t s S α(s)b(s). (2.34) POMDP [24] 3 POMDP MDP POMDP V t 1 V t α- α- Γ t = O( A Γ t 1 O ) t POMDP O( A Z S 2 Γ t 1 Z ) Γ t One-Pass [69] Witness [70] [71] QMDP QMDP POMDP MDP QMDP QMDP MDP POMDP ˆQ(b, a) = Q MDP (s, a)b(s), (2.35) s S Q MDP (s, a) MDP ˆV(b) = max a A ˆQ(b, a). (2.36) MDP POMDP QMDP 28

51 2.6 MDP [72] α- B t α- Γ t α a (s) = R(s, a), { Γt a,o = α a,o i αa,o i (s) = γ } T(s s, a)ω(o s, a)α i(s ), α i Γ t 1, s S { Γt b = α a b αa b = α a b + } argmax α(s)b(s), a A, (2.37) o O α Γt a,o s S { } Γ t = α b α b = argmax b(s)α(s), b B. α Γt b s S Γ 0 α- α 0 (s) = 1 1 γ min s S,a A R(s, a) Γ t 1 B O( A O S B ( S + B )) PBVI [73] Perseus [74] 29

52 Input: Current belief b 0, AND-OR search tree T, planning horizon H, lower bound on L, upper bound U 1 Let b b 0 2 Initialize T to contain only b at the root 3 while not ExectionTerminated() do 4 while not PlanningTermiated() do 5 Let b ChooseNextNodeToExpand() 6 Expand(b, H) 7 UpdateAncestors(b ) 8 end 9 Execute best action a for b 10 Perceive a new observation o 11 Update b ζ(b, a, o) 12 Update tree T so that b is the new root 13 end 2.6: POMDP HSVI [75] SARSOP [76] POMDP MDP POMDP α MDP POMDP 2.6 POMDP [2] s a o 2.29 s s Expectation Maximization 2.6 POMDP [77] 30

53 Input: Current belief b 0, initial approximate value function V 0, hashtable of beliefs and approximate values V, discretization resolution k 1 Initialize b to the b 0 and V to an empty hashtable 2 while not ExectionTerminated() do 3 foreach a A do 4 Evaluate Q(b, a) r(b, a) + γ o O Pr(o b, a)v(discretize(ζ(b, a, o), k)) 5 end 6 Select a argmax a A Q(b, a) 7 Execute best action a for b 8 V(Discretize(b, k)) Q(b, a ) 9 Perceive a new observation o 10 Let b ζ(b, a, o) 11 end 2.7: POMDP RTDP-Bel Satia-Lave [78] BI-POMDP [79] AEMS [80] RTDP-Bel MDP POMDP [58] RTDP-Bel RTDP-Bel ˆQ(b, a) = r(b, a) + γ o O Ω(o b, a)v(ζ(b, a, o)), (2.38) V(b) b RTDP-Bel 2.7 RTDP-Bel [77] Discretize(b, k) b b b (s) = round(kb(s))/k O((k + 1) S ) 31

54 Input: An MDP simulator sim, current history h 0, search tree T initially empty, rollout policy π, termination condition ϵ, exploration constant C Output: An action a 1 Rollout(s : state, h : history, d : depth) 2 if γ d < ϵ then 3 return 0 4 end 5 Select a π(h, ) 6 Sample (s, o, r) sim(s, a) 7 return r + γrollout(s, hao, d + 1) 8 Simulate(s : state, h : history, d : depth) 9 if γ d < ϵ then 10 return 0 11 end 12 if h / T then 13 foreach a A do 14 T(ha) (N init (ha), V init (hs), ) 15 end 16 return Rollout(s, h, d) 17 end 18 Select a argmax a A {V(ha) + C } log N(h)/N(ha) 19 Sample (s, o, r) sim(s, a ) 20 R r + γsimulate(s, ha o, d + 1) 21 B(h) B(h) {s} 22 Increment N(h) and N(ha ) 23 Update V(ha ) V(ha ) + R V(ha ) N(ha ) 24 return R 25 repeat 26 Sample s B(h) 27 Simulate(s, h, 0) 28 until resource budgets reached 29 return a argmax a A V(ha) 2.8: POMDP POMCP MDP POMDP POMCP UCT POMDP [81] POMCP h POMCP b(h) Root Sampling MCTS MDP 32

55 POMDP POMCP POMCP UCB UCB(h, a) = Q(h, a) + c log N(h) N(h, a), (2.39) Q(h, a) h a N(h, a) h a N(h) = a A N(h, a) h c POMCP [82, 83] c POMCP 1 POMCP [84 86] 2.8 POMCP [81] MDP MDP MDP 2. MDP Option MAXQ 3. POMDP MDP POMDP MDP POMDP POMDP POMDP 33

56 4. UCB UCB MAB MAXQ MAXQ-OP DNG-MCTS D²NG-POMCP PFS 34

57 MAXQ MAXQ MDP MAXQ MDP MDP MAXQ-OP MAXQ MAXQ-OP MAXQ MAXQ-OP MDP MAXQ-OP MAXQ-OP 2D RoboCup 4 2 RoboCup 2D MAXQ-OP 3.1 MAXQ-OP MAXQ-OP RoboCup 2D [23] MDP [31] RoboCup 2D

58 MAXQ RoboCup 2D RoboCup 2D 6000 [55] LAO* [87] UCT [56] DNG-MCTS [43] RoboCup 2D RoboCup 2D 100 MDP [64] MAXQ MDP [37] MAXQ Temporal Abstraction State Abstraction Subtask Sharing MAXQ MDP RoboCup 2D RoboCup 2D MAXQ MAXQ-OP MAXQ-OP MDP MAXQ-OP RoboCup 2D 36

59 MAXQ moving-to-target MAXQ-OP MAXQ-OP MAXQ MAXQ MAXQ MDP S, A, T, R G S g G a A Pr(g g, a) = 1 R(g, a) = 0 MDP MDP Undiscounted Negative-Reward Goal-directed MDP [88] MDP Stochastic Shortest Path [89] MAXQ [39, 41] Terminating Distribution MDP MAXQ-OP MAXQ-OP WrightEagle RoboCup RoboCup 2D MAXQ-OP 3.2 MDP RTDP [55, 90 92] 37

60 MAXQ AO* [57, 87, 93] MDP MCTS [43, 56, 94 97] Trial-based Heuristic Tree Search, THTS THTS MAXQ-OP Hierarchical Reinforcement Learning, HRL MDP [64] State Abstraction [98 104] Sutton Option HRL SMDP [65] Option Dietterich MAXQ [37] MAXQ HRL MDP SMDP [105, 106] MDP Hauskrecht MDP Abstract MDP MDP Variable Influence Structure Analysis, VISA MDP DBN Causal Graph [107] Barry DetH* MDP DetH* MDP 3.3 MAXQ MAXQ M = {M 0, M 1,..., M n } MAXQ- 38

61 MAXQ OP V (i, s) Q (i, s, a) MAXQ M 0 a A 0 a p A a p MAXQ-OP 2.27 MAXQ-OP MAXQ-OP M a M i 2.25 π C (i, s, a) = γ N Pr(s, N s, a)v (i, s ), (3.1) s,n Pr(s, N s, a) = s,s 1,...,s N 1 Pr(s 1 s, π a(s)) Pr(s 2 s 1, π a(s 1 ))... Pr(s s N 1, π a(s N 1 )) Pr(N s, a). (3.2) s, s 1,..., s N 1 π a π s s N π s, a s C (i, s, a) π MAXQ 1 γ = γ N 1 C (i, s, a) = Pr(s s, a)v (i, s ), (3.3) s Pr(s s, a) = N Pr(s, N s, a) M i Pr(s s, a) 39

62 MAXQ Input: an MDP model with its MAXQ hierarchical structure Output: the accumulated reward r after reaching a goal 1 r 0 2 s GetInitState() 3 while s G 0 do 4 v, a p EvaluateState(0, s, [0, 0,..., 0]) 5 r r+ ExecuteAction(a p, s) 6 s GetNextState() 7 end 8 return r 3.1: MAXQ-OP OnlinePlanning 2.27 { V (i, s) max a A i V (a, s) + s Pr(s s, a)v (i, s ) }. (3.4) MAXQ-OP d D d[i] D[i] M i M i H 3.4 H(i, s) if d[i] D[i] V(i, s, d) max a Ai {V(a, s, d)+ s Pr(s s, a)v(i, s, d[i] d[i] + 1)} (3.5) 3.5 MAXQ-OP MAXQ V(0, s, [0, 0,..., 0]) s M 0 MAXQ-OP s a Pr(s s, a) G s,a = {s s Pr(s s, a)} C (i, s, a) C (i, s, a) 1 G s,a s G s,a V (i, s ). (3.6) 3.5 H(i, s) if d[i] D[i] V(i, s, d) max a Ai {V(a, s, d)+ 1 s G V(i, s,a G s,a s, d[i] d[i] + 1)} 40 (3.7)

63 MAXQ Input: subtask M i, state s and depth array d Output: V (i, s), a primitive action a p 1 if M i is primitive then return R(s, M i ), M i 2 else if s S i and s G i then return, nil 3 else if s G i then return 0, nil 4 else if d[i] D[i] then return HeuristicValue(i, s), nil 5 else 6 v, a p, nil 7 for M k Subtasks(M i ) do 8 if M k is primitive or s G k then 9 v, a p EvaluateState(k, s, d) 10 v v+ EvaluateCompletion(i, s, k, d) 11 if v > v then 12 v, a p v, a p ; 13 end 14 end 15 end 16 return v, a p 17 end 3.2: MAXQ-OP EvaluateState(i, s, d) MAXQ-OP 3.1 MAXQ-OP OnlinePlanning s GetInitState GetNextState ExecuteAction g G 0 MAXQ-OP EvaluateState EvaluateState s s [54, 87] 3.2 MAXQ-OP MAXQ-OP M i s d[i] D[i] 41
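Algorithms 3.2 and 3.3 are mutually recursive; the compact sketch below shows how EvaluateState and EvaluateCompletion fit together. The hierarchy interface H (is_primitive, subtasks, heuristic_value, sample_terminal_states, and so on) is a placeholder we introduce for illustration, not an API defined in the thesis.

```python
def evaluate_state(task, s, depth, H):
    """Estimate V*(task, s) and a best primitive action, as in Algorithm 3.2."""
    if H.is_primitive(task):
        return H.reward(s, task), task
    if H.is_terminated(s, task):
        return H.terminal_reward(s, task), None
    if depth[task] >= H.max_depth[task]:
        return H.heuristic_value(task, s), None          # depth cut-off
    best_v, best_a = float("-inf"), None
    for sub in H.subtasks(task):
        if H.is_primitive(sub) or not H.is_terminated(s, sub):
            v, a = evaluate_state(sub, s, depth, H)
            v += evaluate_completion(task, s, sub, depth, H)
            if v > best_v:
                best_v, best_a = v, a
    return best_v, best_a

def evaluate_completion(task, s, sub, depth, H):
    """Approximate the completion value C*(task, s, sub) by averaging over
    sampled terminal states of the subtask, as in Algorithm 3.3 / Eq. (3.6)."""
    terminal_states = H.sample_terminal_states(s, sub)
    total = 0.0
    for s2 in terminal_states:
        deeper = dict(depth)
        deeper[task] += 1                                # deepen only this subtask
        v, _ = evaluate_state(task, s2, deeper, H)
        total += v
    return total / max(1, len(terminal_states))
```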

64 MAXQ Input: subtask M i, state s, action M a and depth array d Output: estimated C (i, s, a) 1 G s,a {s s Pr(s s, a)} 2 v 0 3 for s G s,a do 4 d d 5 d [i] d [i] v v+ EvaluateState(i, s, d ) 7 end 8 v v G s,a 9 return v 3.3: MAXQ-OP EvaluateCompletion(i, s, a, d) d[i] D[i] M i nil nil EvaluateState MAXQ MAXQ-OP NextAction Subtasks NextAction A* 42

65 MAXQ Input: subtask index i and state s Output: selected action a 1 if SearchStopped(i, s) then 2 return nil 3 end 4 else 5 a argmax a Ai H i [s, a] + c 6 N i [s] N i [s] N i [s, a ] N i [s, a ] return a 9 end ln Ni [s] N i [s,a] 3.4: MAXQ-OP NextAction(i, s) (a) (b) MAXQ 3.1 MAXQ 3.4 UCB [63] NextAction N i [s] N i [s, a] s (s, a) M i, c SearchStopped H i [s, a] M i s a 3.4 MDP [37] 3.1(a) R G Y B 4 x y pl dl pl taxi dl 4 pl dl pl dl

66 MAXQ 3.1 Root pl = dl Get Put 2 Get pl taxi pl = taxi Nav(t) Pickup 2 Put pl = taxi pl = dl Nav(t) Putdown 2 Nav(t) (x, y) = t North, South, East West MAXQ-OP ± ± 0.16 ms LRTDP ± ± 3.71 ms AOT ± ± 2.37 ms UCT ± ± 4.24 ms DNG-MCTS ± ± 4.75 ms R-MAXQ ± ± 50 - MAXQ-Q ± ± [106] North South East West 2. Pickup 3. Putdown Pickup Putdown -10 MAXQ-OP [37] MAXQ 3.1(b) Nav(t) t t R G Y B 3.1 EvaluateCompletion Root Get Put Nav(t) 1 North South East West T(s s, a) HeuristicValue Manhattan Get 44

67 MAXQ Manhattan((x, y), pl) 1 Manhattan((x 1, y 1 ), (x 2, y 2 )) (x 1, y 1 ) (x 2, y 2 ) Manhattan x 1 x 2 + y 1 y 2 s M i 0 d[i] = 0 v, a p cache[i, hash(i, s)] v, a p, cache hash(i, s) s M i s 0.9 MAXQ-OP 3.2 LRTDP [90] AOT [57] UCT [56] DNG-MCTS [43] Anytime LRTDP AOT min-min [90] UCT DNG-MCTS min-min Rollout UCT DNG-MCTS MDP R-MAXQ MAXQ-Q [106] Linux 3.8 CPU 2.90 GHz 8GB MAXQ-OP 3.93 ± ± 0.15 MAXQ-OP MAXQ-OP 3.5 RoboCup 2D RoboCup 2D [108, 109] * RoboCup RoboCup 2D RoboCup 2D Simulator * Peter Stone 45

68 MAXQ 3.2 RoboCup D WrightEagle Helios [110] RoboCup D WrightEagle Helios 3.2 [111] RoboCup 2D Tile-Coding Sarsa(λ) Keepaway [112] [113] MAXQ-OP RoboCup MAXQ-OP 2D 2009 RoboCup 4 2 MAXQ-OP RoboCup 2D RoboCup 2D RoboCup 2D Server

69 MAXQ RoboCup 2D RoboCup 2D MDP RoboCup 2D RoboCup 2D MDP RoboCup 2D s = (s 0, s 2,..., s 22 ) 23 s u, u [1, 11] {s 1,..., s 11 } s u {s 12,..., s 22 } s 0 s = (x, y, ẋ, ẏ, α, β) (x, y) (ẋ, ẏ) α β s = (x, y, ẋ, ẏ) RoboCup 2D dash kick tackle turn turn_neck dash kick tackle kick turn turn_neck dash dash power [0, 1] angle [0, 2π) power = p angle = θ (ẍ, ÿ) = (pa cos θ, pa sin θ) + r a A = 1.0m/s 2 r a (x, y) (x, y)+(ẋ, ẏ)+(ẍ, ÿ) (ẋ, ẏ) (ẋ, ẏ)ω + (ẍ, ÿ)ω ω = 0.4 MDP 47
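The dash dynamics sketched above (acceleration (p·a·cosθ, p·a·sinθ) plus noise, followed by the position update and velocity decay with ω = 0.4) can be written as a one-step transition function. The constants ultimately come from the RoboCup 2D soccer server, so the values in this sketch should be read as illustrative assumptions.

```python
import math
import random

ACCEL = 1.0   # maximal acceleration a (m/s^2)
DECAY = 0.4   # velocity decay factor omega

def dash_step(x, y, vx, vy, power, angle, noise=0.0):
    """One simulation cycle of a dash(power, angle) command.

    power in [0, 1], angle in [0, 2*pi).  Returns the next (x, y, vx, vy).
    """
    ax = power * ACCEL * math.cos(angle) + random.gauss(0.0, noise)
    ay = power * ACCEL * math.sin(angle) + random.gauss(0.0, noise)
    x, y = x + vx + ax, y + vy + ay                   # position update
    vx, vy = (vx + ax) * DECAY, (vy + ay) * DECAY     # velocity decays each cycle
    return x, y, vx, vy
```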

70 MAXQ RoboCup 2D MAXQ-OP dash turn RoboCup MDP WrightEagle [24] b b(s) s b(s) b(s) = b i (s[i]), (3.8) 0 i 22 s s[i] i b i (s[i]) s[i] m i b i b i (s[i]) {x ij, w ij } j=1...mi, (3.9) x ij i w ij 1 j m i w ij = 1 RoboCup 2D Motion Model Sensor Model [114, 115] s s[i] = w ij x ij. (3.10) 1 j m i 3.3 neck_dir RoboCup 2D turn_neck 3.3 WrightEagle 48

71 MAXQ 3.3 x (m) y (m) ẋ (m/s) ẏ (m/s) α (Deg) β (Deg) e MAXQ-OP MAXQ-OP RoboCup MAXQ kick turn dash tackle -1 KickTo TackleTo NavTo KickTo TackleTo NavTo KickTo kick turn NavTo dash turn turn / kick KickTo TackleTo NavTo 49

72 MAXQ 3.4 MAXQ Shoot Dribble Pass Position Intercept Block Trap Mark Formation 1. Shoot 2. Dribble 3. Pass 4. Position 5. Intercept 6. Block 7. Trap 8. Mark 9. Formation Shoot Dribble Pass Shoot Dribble Pass Intercept Position Attack Defense Attack Defense Attack Defense Root Root Attack Attack Defense MAXQ 3.4 Attack Pass Intercept KickTo kick s Q (Root, s, Attack) = V (Attack, s)+ s Pr(s s, Attack)V (Root, s ), (3.11) 50 V (Root, s) = max{q (Root, s, Attack), Q (Root, s, Defense)}, (3.12)

73 MAXQ V (Attack, s) = max{q (Attack, s, Pass), Q (Attack, s, Dribble), Q (Attack, s, Shoot), Q (Attack, s, Intercept), Q (Attack, s, Position)}, (3.13) Q (Attack, s, Pass) = V (Pass, s)+ s Pr(s s, Pass)V (Attack, s ), (3.14) Q (Attack, s, Intercept) = V (Intercept, s)+ s Pr(s s, Intercept)V (Attack, s ), (3.15) V (Pass, s) = max position p Q (Pass, s, KickTo(p)), (3.16) V (Intercept, s) = max position p Q (Intercept, s, NavTo(p)), (3.17) Q (Pass, s, KickTo(p)) = V (KickTo(p), s)+ s Pr(s s, KickTo(p))V (Pass, s ), (3.18) Q (Intercept, s, NavTo(p)) = V (NavTo(p), s)+ s Pr(s s, NavTo(p))V (Intercept, s ), (3.19) V (KickTo(p), s) = V (NavTo(p), s) = max power a, angle θ Q (KickTo(p), s, kick(a, θ)), (3.20) max power a, angle θ Q (NavTo(p), s, dash(a, θ)), (3.21) Q (KickTo(p), s, kick(a, θ)) = R(s, kick(a, θ))+ s Pr(s s, kick(a, θ))v (KickTo(p), s ), (3.22) Q (NavTo(p), s, dash(a, θ)) = R(s, dash(a, θ))+ s Pr(s s, dash(a, θ))v (NavTo(p), s ). (3.23) R(s, kick(a, θ)) = 1 Pr(s s, kick(a, θ)) KickTo(p) p 3.20 p Pass 51

74 MAXQ Attack Attack Intercept Position NavTo(p) p R(s, dash(a, θ)) = 1 Pr(s s, dash(a, θ)) 3.21 p 3.17 Attack Root Root Defense Defense p b = (b x, b y, bẋ, bẏ) p = (p x, p y, pẋ, pẏ, p α, p β ) p Pr(p b b, p) p Pr(p b b, p) = max {Pr(p b, t b, p)} Pr(p b, t b, p) p t Pr(p b, t b, p) = g(t f(p, b t )) b t t f(p, b t ) (p x, p y ) t g(δ) b t δ 3.5 g(δ) Pr(s s, Attack) = 1 (1 Pr(o b b, o)), (3.24) Pr(s s, Defense) = 1 opponent o teammate t (1 Pr(t b b, t)), (3.25) Pr(s s, Intercept) = 1[ player i : i b] Pr(i b b, i) (1 Pr(p b b, p)), (3.26) player p i Pr(s s, Position) = 1[ non-teammate i : i b] Pr(i b b, i) (1 Pr(p b b, p)), (3.27) player p i b = s[0] MAXQ kick dash MAXQ-OP Pass KickTo NavTo A* dash 52

75 MAXQ 1 Intercepting Probability 0.8 Probability Cycle Difference 3.5 turn MAXQ-OP Attack Impelling Speed V (Attack, s t ) s t t s s impelling_speed(s, s, α) = dist(s, s, α) + pre_dist(s, α), (3.28) step(s, s ) + pre_step(s ) α Aim Angle dist(s, s, α) α s s step(s, s ) pre_dist(s ) s α pre_step(s ) s α aim_angle(s) V (Attack, s) V (Attack, s t ) = impelling_speed(s 0, s t, aim_angle(s 0 )), (3.29) s 0 impelling_speed(s 0, s t, aim_angle(s 0 )) s t MAXQ-OP Full: MAXQ-OP Random: Full Attack Pass Dribble 53

76 MAXQ 3.6 RoboCup D Hand-coded: Random Pass Dribble 3 Full Random Hand-coded Attack Pass Dribble Full MAXQ-OP EvaluateState(Pass,, ) EvaluateState(Dribble,, ) Random Hand-coded Pass-Dribble, Shoot Pass Dribble Intercept Full RoboCup 2D Trainer 100 Helios2011 RoboCup 2011 Episode RoboCup WrightEagle success 2. x failure timeout 3.6 RoboCup 2011 # RoboCup #

77 MAXQ 3.4 WrightEagle Success Failure Timeout Full Random Hand-coded WrightEagle BrainsStomers : : ± 7.5% Helios : : ± 5.0% Helios : : ± 8.8% Oxsy : : ± 5.6% 3.4 Full Random Hand-coded 86.7% and 64.7% Random Hand-coded Pass-Dribble Full Pass Dribble Attack MAXQ-OP Pass Dribble Attack Defense Shoot Pass MAXQ-OP WrightEagle Full RoboCup 2D 4 4 BrainsStomers08 Helios10 Helios11 Oxsy11 BrainStormers08 Helios10 RoboCup 2008 RoboCup WrightEagle 3.5 p = n/n n N WrightEagle 82.0% 93.0% 83.0% 91.0% BrainsStomers08 Helios10 Helios11 Oxsy RoboCup 2D WrightEagle MAXQ MAXQ-OP RoboCup 2D 55

78 MAXQ 3.6 RoboCup 2D RoboCup : : 0.84 RoboCup : : 0.43 RoboCup : : 0.64 RoboCup : : 1.13 RoboCup : : 1.21 RoboCup : : 0.54 RoboCup : : 0.25 RoboCup : : 0.86 RoboCup : : MAXQ-OP 2. RoboCup 2D 3.6 MAXQ-OP MAXQ-OP MAXQ-OP MAXQ MAXQ-OP MAXQ-OP MDP RoboCup 2D MAXQ-OP MDP MAXQ-OP 56

79 MCTS MCTS MCTS Thompson MDP POMDP Dirichlet-NormalGamma Dirichlet-NormalGamma based Monte-Carlo Tree Search DNG-MCTS Dirichlet-Dirichlet-NormalGamma Dirichlet-Dirichlet-NormalGamma based Partially Observable Monte-Carlo Planning D²NG-POMCP MDP MDP [23] POMDP MDP [24] MDP POMDP MCTS [60] MCTS MCTS MCTS / [13, 14] UCB [63, 116] UCB 57

80 MAB [117] MAB UCB UCB UCB Auer UCB MAB [63] Thompson MAB Thompson Randomized Probability Matching [42] UCB Thompson [118] MAB Cumulative Regret Simple Regret [119] [120] Thompson UCB [121] Thompson MAB [ ] [97] [119] Bubeck MCTS [126] Thompson Thompson Thompson MDP POMDP Thompson MCTS Thompson MDP POMDP [43, 44] MDP POMDP Dirichlet-NormalGamma Dirichlet-NormalGamma based Monte-Carlo Tree Search DNG-MCTS Dirichlet-Dirichlet-NormalGamma Dirichlet-Dirichlet-NormalGamma based Partially Observable Monte-Carlo Planning D²NG-POMCP DNG-MCTS MCTS MDP 58

81 Dirichlet NormalGamma Thompson DNG-MCTS MDP POMDP POMDP POMDP D²NG-POMCP Dirichlet NormalGamma Thompson 4.2 DNG-MCTS [127] 4.3 [128] MCTS UCT UCT MAX/MIN UCB MAX/MIN MAB Thompson POMDP [77] 3 Branch-and-Bound Pruning [ ] [75, 76, 79, 80, 133, 134] [81, 83, ] 59

82 4.3 MDP MDP DNG-MCTS MDP [138, 139] ( ). X = {x 0, x 1,... } X w f X µ = E w [f] = X w(x)f(x) dx σ = Var w (f(x 0 )) + 2 i=1 Cov w (f(x 0 ), f(x i )) N(0, σ 2 ) x 0 n ( ) 1 n n f(x t ) µ N(0, σ 2 ). (4.1) n t= n 1 n n t=0 f(s t) N(µ, σ 2 /n) n 1 n n t=0 f(s t) n t=0 f(s t) n MDP π X s,π s π X s,a,π s a π X s,π X s,a,π π MDP S {s t } Pr(s s) = T (s s, π(s)) {s t } MDP H γ = 1 X s0,π = H t=0 R(s t, π(s t )) f(s t ) = R(s t, π(s t )) H X s0,π s 0 S γ 1 H γ 1 X s0,π π X s,π DNG-MCTS X s,π 60

83 s a π X s,a,π = R(s, a) + γx s,π, (4.2) s T(s s, a) Y s,a,π Y s,a,π = 1 γ (X s,a,π R(s, a)). (4.3) Y s,a,π s X s,π f Ys,a,π (x) = s S T(s s, a)f Xs,π (x). (4.4) s X s,π Y s,a,π X s,a,π Y s,a,π X s,a,π ( ). X θ L(x θ) θ Pr(θ) X Z = {x 1, x 2,... } θ Pr(θ Z) = η Pr(Z θ) Pr(θ) = η i L(x i θ) Pr(θ), (4.5) η = 1/ Pr(Z) N(µ s, 1/τ s ) X s,π µ s τ s τ = 1/σ 2 NormalGamma [140] (NormalGamma ). NormalGamma Hyper Parameters µ 0, λ, α, β λ > 0 α 1 β 0 Γ( ) Gamma (µ, τ) NormalGamma NormalGamma(µ 0, λ, α, β) (µ, τ) f(µ, τ µ 0, λ, α, β) = βα λ Γ(α) 2π τα 1 2 e βτ e λτ(µ µ 0 )2 2. (4.6) τ Gamma τ Gamma(α, β) τ µ µ N (µ 0, 1/(λτ)) 61

84 4.3.3 (NormalGamma ). X µ τ X N(µ, 1/τ) (µ, τ) NormalGamma (µ, τ) NormalGamma(µ 0, λ 0, α 0, β 0 ) n X {x 1, x 2,..., x n } x = n i=1 x i/n s = n i=1 (x i x) 2 /n (µ, τ) NormalGamma (µ, τ) NormalGamma(µ n, λ n, α n, β n ) µ n = λ 0µ 0 + n x λ 0 + n, (4.7) λ n = λ 0 + n, (4.8) α n = α 0 + n 2, (4.9) β n = β (ns + λ ) 0n( x µ 0 ) 2. (4.10) 2 λ 0 + n Y s,a,π Y s,a,π s S w s,a,s N(µ s, 1 τ s ), (4.11) w s,a,s = T(s s, a) w s,a,s 0 s S w s,a,s = 1 w s,a,s Dirichlet Dirichlet [140] s a Dirichlet Dirichlet(ρ s,a ) ρ s,a = (ρ s,a,s1, ρ s,a,s2,... ) Dirichlet (s, a) s ρ s,a,s 1 T(s s, a) (s, a) s T(s s, a) Dirichlet ρ s,a,s ρ s,a,s + 1. (4.12) X s,π X s,a,π MCTS s a µ s,0, λ s, α s, β s ρ s,a Thompson (Thompson ). Thompson a [ ] Pr(a) = 1 a = argmax E [X a θ a ] P a (θ a Z) dθ, (4.13) a a θ a a θ = (θ a1, θ a2,... ) E[X a θ a ] = xl a (x θ a ) dx θ a a 62

85 1 OnlinePlanning(s : state, T : tree) 2 Initialize H maximal planning horizon 3 repeat 4 DNG-MCTS(s, T, 0) 5 until resource budgets reached 6 return ThompsonSampling(s, 0, False) 7 DNG-MCTS(s : state, T : tree, d : depth) 8 if d H or s is terminal then 9 return 0 10 end 11 else if node s, d is not in tree T then 12 Initialize (µ s,0, λ s, α s, β s ), and ρ s,a for a A 13 Add node s, d to T 14 Play rollout policy by simulation for H d steps 15 Get the cumulative reward r 16 return r 17 end 18 else 19 a ThompsonSampling(s, d, T rue) 20 Execute a by simulation 21 Observe next state s and reward R(s, a) 22 r R(s, a) + γdng-mcts(s, T, d + 1) 23 α s α s β s β s + (λ s (r µ s,0 ) 2 /(λ s + 1))/2 25 µ s,0 (λ s µ s,0 + r)/(λ s + 1) 26 λ s λ s ρ s,a,s ρ s,a,s return r 29 end 4.1: Dirichlet-NormalGamma Thompson a A P a (θ a Z) θ a a = argmax E[X a θ a ]. (4.14) a DNG-MCTS Thompson s NormalGamma(µ s,0, λ s, α s, β s ) Dirichlet(ρ s,a ) s µ s w s,a,s Q(s, a) Q(s, a) = R(s, a) + γ s S w s,a,s µ s. (4.15) 63
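Putting Theorem 4.3.3 and Thompson sampling together, the sketch below shows the two node-level operations that DNG-MCTS relies on: updating a NormalGamma posterior with a single newly observed return (Eqs. 4.7-4.10 with n = 1, in the same update order as Algorithm 4.1) and drawing a sampled mean for action selection. The class layout is ours.

```python
import random

class NormalGamma:
    """Conjugate prior over the mean and precision of a Normal return distribution."""
    def __init__(self, mu0=0.0, lam=0.01, alpha=1.0, beta=100.0):
        self.mu0, self.lam, self.alpha, self.beta = mu0, lam, alpha, beta

    def update(self, r):
        """Posterior update with one observed return r (Eqs. 4.7-4.10, n = 1)."""
        self.alpha += 0.5
        self.beta += 0.5 * self.lam * (r - self.mu0) ** 2 / (self.lam + 1.0)
        self.mu0 = (self.lam * self.mu0 + r) / (self.lam + 1.0)
        self.lam += 1.0

    def sample_mean(self):
        """Draw (mu, tau) ~ NormalGamma and return mu, as used by Thompson sampling."""
        tau = random.gammavariate(self.alpha, 1.0 / self.beta)   # rate -> scale
        return random.gauss(self.mu0, 1.0 / (self.lam * tau) ** 0.5)

def thompson_select(q_sampler, actions):
    """Pick argmax_a of one posterior sample of Q(s, a)."""
    return max(actions, key=q_sampler)
```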

86 1 ThompsonSampling(s : state, d : depth, sampling : boolean) 2 foreach a A do 3 q a QValue(s, a, d, sampling) 4 end 5 return argmax a q a 6 QValue(s : state, a : action, d : depth, sampling : boolean) 7 r 0 8 foreach s S do 9 if sampling = True then 10 Sample w s Dirichlet(ρ s,a ) 11 end 12 else 13 w s ρ s,a,s / s S ρ s,a,s 14 end 15 r r + w s Value(s, d + 1, sampling) 16 end 17 r R(s, a) + γr 18 return r 19 Value(s : state, d : depth, sampling : boolean) 20 if d H or s is terminal then 21 return 0 22 end 23 else 24 if sampling = T rue then 25 Sample (µ, τ) NormalGamma(µ s,0, λ s, α s, β s ) 26 return µ 27 end 28 else 29 return µ s,0 30 end 31 end 4.2: DNG-MCTS Thompson DNG-MCTS DNG-MCTS ThompsonSampling sampling sampling Thompson Q(s, a) Q(s, a) = R(s, a) + γ s S ρ s,a,s s S ρ s,a,s µ s,0. (4.16) DNG-MCTS Thompson T 64

87 Rollout OnlinePlanning s T OnlinePlanning DNG-MCTS Rollout Z n = POMDP POMDP D²NG- POMCP POMDP π POMDP s, b s b J = S B S B J { s t, b t } P ( s, b s, b ) = T (s s, π(b)) T + (b b, π(b)). (4.17) X b,a b a X s,b,π s, b π X b,π b π X b,a X s,b,π X b,π POMDP I I = {r 1, r 2,..., r k } r i = R(s, a) s a X b,a Multinomial Distribution Multinomial(p 1, p 2,..., p k ) k i=1 p i = 1 p i = s S 1[R(s, a) = r i]b(s) X b,a = r i [141]. POMDP b 0-65

88 s 0, b 0 H POMDP γ = 1 X s0,b 0,π = H t=0 R(s t, π(b t )) f(s t, b t ) = R(s t, π(b t )) H s 0, b 0 J X s0,b 0,π γ 1 1 H X s0,b 0,π b π X b,π = X s,b,π s b(s) X b,π X s,b,π f Xb,π (x) = s S b(s)f Xs,b,π (x). (4.18) s, b J X s,b,π X b,π π X b,π X b,π X b,a X b,a Multinomial(p 1, p 2,..., p k ) Dirichlet b a p i Dirichlet Dirichlet(ψ b,a ) ψ b,a = (ψ b,a,r1, ψ b,a,r2,..., ψ b,a,rk ) r Dirichlet ψ b,a,r ψ b,a,r + 1. (4.19) X s,b,π N(µ s,b, 1/τ s,b ) µ s,b τ s,b NormalGamma (µ s,b, τ s,b ) NormalGamma (µ s,b, τ s,b ) NormalGamma(µ s,b,0, λ s,b, α s,b, β s,b ) µ s,b,0 λ s,b α s,b β s,b X b,π b(s) and X s,b,π s b a π X b,a,π 66 X b,a,π = X b,a + γx b,π, (4.20)

89 b T + (b b, a) X b,a,π E[X b,a,π ] = E[X b,a ] + γ b B E[X b,π]t + (b b, a) = E[X b,a ] + γ o O 1[b = ζ(b, a, o)]ω(o b, a)e[x b,π]. (4.21) E[X b,a,π ] Q π (b, a) = r(b, a) + γ o O Ω(o b, a)v π (ζ(b, a, o)). (4.22) X b,a X b,π Ω( b, a) Ω( b, a) Dirichlet Dirichlet(ρ b,a ) ρ b,a = (ρ b,a,o1, ρ b,a,o2,... ) (b, a) o Ω( b, a) ρ b,a,o ρ b,a,o + 1. (4.23) X b,a,π MCTS b s a µ s,b,0, λ s,b, α s,b, β s,b ψ b,a ρ b,a Thompson D²NG-POMCP Thompson s X b,a,π Dirichlet(ρ b,a ) o O w b,a,o Dirichlet(ψ b,a ) r I w b,a,r NormalGamma(µ s,b,0, λ s,b, α s,b, β s,b ) s, b J µ s,b b = ζ(b, a, o) b a o Q(b, a) Q(b, a) = r I w b,a,r r + γ o O D²NG-POMCP 1[b = ζ(b, a, o)]w b,a,o µ s,b b (s ). (4.24) D²NG-POMCP h s S 67

90 1 OnlinePlanning(h : history, T : tree) 2 repeat 3 Sample s P(h) 4 D²NG-POMCP(s, h, T, 0) 5 until resource budgets reached 6 return ThompsonSampling(h, 0, False) 7 Agent(b 0 : initial belief) 8 Initialize H maximal planning horizon 9 Initialize I {possible immediate rewards} 10 Initialize h 11 Initialize P(h) b 0 12 repeat 13 a OnlinePlanning(h, ) 14 Execute a and get observation o 15 h hao 16 P(h) ParticleFilter(P(h), a, o) 17 until terminating conditions 18 D²NG-POMCP(s : state, h : history, T : tree, d : depth) 19 if d H or s is terminal then 20 return 0 21 end 22 else if node h is not in tree T then 23 Initialize (µ s,h,0, λ s,h, α s,h, β s,h ) for s S, and ρ h,a and ψ h,a for a A 24 Add node h to T 25 Play rollout policy for H d steps 26 Get cumulative reward r 27 return r 28 end 29 else 30 a ThompsonSampling(h, d, T rue) 31 Execute a by simulation 32 Get state s, observation o and reward i 33 h hao 34 P(h ) P(h ) s 35 r i + γd²ng-pomcp(s, h, T, d + 1) 36 α s,h α s,h β s,h β s,h + (λ s,h (r µ s,h,0 ) 2 /(λ s,h + 1))/2 38 µ s,h,0 (λ s,h µ s,h,0 + r)/(λ s,h + 1) 39 λ s,h λ s,h ρ h,a,o ρ h,a,o ψ h,a,i ψ h,a,i return r 43 end 4.3: Dirichlet-Dirichlet-NormalGamma POMCP 68

91 1 ThompsonSampling(h : history, d : depth, sampling : boolean) 2 foreach a A do 3 q a QValue(h, a, d, sampling) 4 end 5 return argmax a q a 6 QValue(h : history, a : action, d : depth, sampling : boolean) 7 r 0 8 foreach o O do 9 if sampling = True then 10 Sample w o Dirichlet(ρ h,a ) 11 end 12 else 13 w o ρ h,a,o / o O ρ h,a,o 14 end 15 h hao 16 r r + w o Value(h, d + 1, sampling) 17 end 18 r γr 19 foreach i I do 20 if sampling = T rue then 21 Sample w i Dirichlet(ψ h,a ) 22 end 23 else 24 w i ψ h,a,i / i I ψ h,a,i 25 end 26 r r + w i i 27 end 28 return r 29 Value(h : history, d : depth, sampling : boolean) 30 if d H then 31 return 0 32 end 33 else 34 if sampling = T rue then 35 Sample (µ s, τ s ) NormalGamma(µ s,h,0, λ s,h, α s,h, β s,h ) 36 for s P(h) 1 37 return P(h) s P(h) µ s 38 end 39 else 1 40 return P(h) s P(h) µ s,h,0 41 end 42 end 4.4: D²NG-POMCP Thompson 69

92 MCTS h s a µ s,h,0, λ s,h, α s,h, β s,h ψ h,a ρ h,a P(h) [142, 143] 4.24 Q(h, a) = r I w h,a,r r + γ o O w h,a,o s P(hao) µ s,hao, (4.25) w h,a,r w h,a,o µ s,hao Dirichlet(ψ h,a ) Dirichlet(ρ h,a ) NormalGamma(µ s,hao,0, λ s,hao, α s,hao, β s,hao) Q(h, a) Q(h, a) = r I ψ h,a,r r I ψ h,a,r r + γ o O ρ h,a,o o O ρ h,a,o s P(hao) µ s,hao,0. (4.26) D²NG-POMCP T Rollout OnlinePlanning h T P(h) D²NG-POMCP Agent OnlinePlanning ParticleFilter Uninformative Priors [144] Principle of Indifference NormalGamma τ µ N(µ 0, 1/(λτ)) 1/(λτ) λτ 0 τ E[τ] = α/β Gamma Gamma(α, β) λα/β 0 70

93 λ > 0, α 1, β 0 λ α = 1 β µ 0 = 0 β Dirichlet Informative Priors DNG-MCTS DNG-MCTS NormalGamma λ µ 0 2α α/β µ τ NormalGamma(µ 0, λ, α, β) [140] MAB Thompson [124] Thompson MAB 1 DNG-MCTS X s,π π Q Q X s,π H Thompson 1 MAB Rollout H 1 MAB Thompson H 1 DNG-MCTS Rollout D²NG-POMCP DNG-MCTS D²NG- POMCP H Rollout 4.6 Thompson DNG-MCTS D²NG-POMCP Linux GHz 8G 71

94 Simple Regret e-05 RoundRobin Randomized 0.5-Greedy UCB1 ThompsonSampling Number of Action Pulls Simple Regret e-05 RoundRobin Randomized 0.5-Greedy UCB1 ThompsonSampling Number of Action Pulls (a) 8 arms. (b) 32 arms Simple Regret e-05 RoundRobin Randomized 0.5-Greedy UCB1 ThompsonSampling Number of Action Pulls Simple Regret e-05 RoundRobin Randomized 0.5-Greedy UCB1 ThompsonSampling Number of Action Pulls (c) 128 arms. (d) 512 arms. 4.1 MAB Thompson Simple Regret MAB Thompson RoundRobin, Randomized, 0.5-Greedy UCB RoundRobin [97] Randomized 0.5-Greedy [126] UCB UCB Bernoulli UCB 2 Thompson Beta (α = 1, β = 1) MAB Thompson Thompson Thompson MCTS 72
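For the Bernoulli bandit setting used in this experiment, Thompson sampling with a Beta(1, 1) prior per arm takes only a few lines; the following sketch of that baseline is our own illustration and is not the evaluation code behind the reported results.

```python
import random

def thompson_bandit(true_means, pulls):
    """Thompson sampling on a Bernoulli multi-armed bandit with Beta(1,1) priors."""
    k = len(true_means)
    alpha = [1.0] * k          # 1 + number of observed successes per arm
    beta = [1.0] * k           # 1 + number of observed failures per arm
    for _ in range(pulls):
        # Sample a mean for each arm from its Beta posterior and pull the best.
        arm = max(range(k), key=lambda i: random.betavariate(alpha[i], beta[i]))
        reward = 1 if random.random() < true_means[arm] else 0
        alpha[arm] += reward
        beta[arm] += 1 - reward
    # Recommend the arm with the highest posterior mean (relevant for simple regret).
    return max(range(k), key=lambda i: alpha[i] / (alpha[i] + beta[i]))
```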

95 4.6.2 MDP DNG-MCTS UCT Canadian Traveler Problem, CTP Racetrack Problem Sailing Problem c(s, a) R(s, a) min max MDP min max MDP MDP DNG-MCTS Thompson min MDP-engine * MDP s S, a A s S (µ s,0, λ s, α s, β s ) (0, 0.01, 1, 100) ρ s,a,s 0.01 [57] UCT Q(s, a, d) CTP [145] CTP POMDP MDP n 3 m n m γ = 1 Anytime AO* AOT [57] UCTB UCTO [146] UCTB UCTO UCT CTP Rollout CTP DNG-MCTS UCT [57] Rollout 4.1 [57] UCTB UCTO MDP Rollout DNG-MCTS UCT Rollout UCT * MDP-engine 73

96 CTP UCT Rollout Rollout UCTB UCTO UCT DNG UCT DNG ± ± ± ± ± ± ± ± ± ± ± ± ± ± ± ± ± ± ± ± ± ± ± ± ± ± ± ± ± ± ± ± ± ± ± ± ± ± ± ± ± ± ± ± ± ± ± ± ± ± ± ± ± ± ± ± ± ± ± ±3 total Avg. Accumulated Cost UCT DNG-MCTS Avg. Accumulated Cost UCT DNG-MCTS Number of Iterations Number of Iterations (a) Barto-Big (b) DNG-MCTS UCTO UCT [55] DNG-MCTS UCT Rollout H = 100 Barto-Big s = γ = (a) DNG-MCTS UCT [56] 74

97 Avg. Returns Avg. Time Usage (ms) RTDP AOT -90 UCT DNG-MCTS Grid Width RTDP AOT UCT DNG-MCTS Grid Width (a) (b) 4.3 etaxi γ = 0.95 H = DNG-MCTS UCT Rollout (b) DNG-MCTS UCT Taxi Taxi n etaxi[n] n n R G Y B (0, 0) (0, n 1) (n 2, 0) (n 1, n 1) n 1 (0, 0) (1, 0) (1, n 1) (2, n 1) 2 (n 3, 0) (n 2, 0) Taxi n = 5 etaxi[n] Taxi DNG-MCTS LRTDP [90] AOT [57] UCT LRTDP AOT MDP Min-Min [90] LRTDP AOT Min-Min UCT DNG-MCTS Rollout etaxi 4.3(a) 4.3(b) 1000 etaxi[5] 4.2 DNG-MCTS UCT LRTDP AOT DNG-MCTS DNG-MCTS 75

98 4.2 etaxi[5] (ms) LRTDP ± ± 3.71 AOT ± ± 2.37 UCT ± ± 4.24 DNG-MCTS ± ± POMDP RockSample Problem PocMan Problem D²NG-POMCP POMCP D²NG-POMCP POMCP * POMCP s S a A r I o O h (µ s,h,0, λ s,h, α s,h, β s,h ) (0, 0.01, 1, 100) ψ h,a,r 0.01 ρ h,a,o 0.01 D²NG-POMCP POMCP [81] Rollout POMCP RockSample(n, k) k n n γ = OnlinePlanning D²NG-POMCP OnlinePlanning D²NG-POMCP POMCP POMCP 4.3 D²NG-POMCP AEMS2 [80] HSVI-BFS [75, 77] SARSOP [76] POMCP [81] AEMS2 HSVI-BFS * POMCP 76

99 Avg. Discounted Return POMCP -5 D 2 NG-POMCP e+06 Number of Iterations (a) RockSample(7,8) Avg. Discounted Return POMCP -5 D 2 NG-POMCP 1e Avg. Time Per Action (Seconds) (b) RockSample(7,8) Avg. Discounted Return POMCP -5 D 2 NG-POMCP e+06 Number of Iterations (c) RockSample(11,11) Avg. Discounted Return POMCP -5 D 2 NG-POMCP 1e Avg. Time Per Action (Seconds) (d) RockSample(11,11) Avg. Discounted Return POMCP -5 D 2 NG-POMCP Number of Iterations (e) RockSample(15,15) Avg. Discounted Return POMCP -5 D 2 NG-POMCP Avg. Time Per Action (Seconds) (f) RockSample(15,15) 4.4 RockSample D²NG-POMCP SARSOP POMDP AEMS2 HSVI- BFS PBVI [73] [77] SARSOP 1000 [76] POMCP D²NG-POMCP Rollout POMCP [81] D²NG-POMCP RockSample(7, 8) RockSample(11, 11) RockSample(15, 15) POMCP 77

100 4.3 RockSample D²NG-POMCP RockSample [7, 8] [11,11] [15,15] s 12, ,808 7,372,800 AEMS ± 0.22 N/A N/A HSVI-BFS ± 0.22 N/A N/A SARSOP ± ± 0.11 N/A POMCP ± ± ± 0.28 D²NG-POMCP ± ± ± 0.24 Avg. Discounted Return POMCP D 2 NG-POMCP Number of Iterations (a) Pocman Avg. Discounted Return POMCP D 2 NG-POMCP 1e Avg. Time Per Action (Seconds) (b) Pocman 4.5 PocMan D²NG-POMCP Avg. Undiscounted Return POMCP -50 D 2 NG-POMCP Number of Iterations (a) Pocman Avg. Undiscounted Return POMCP -50 D 2 NG-POMCP 1e Avg. Time Per Action (Seconds) (b) Pocman 4.6 PocMan D²NG-POMCP PocMan [81] PocMan PocMan PocMan 78

101 γ = [81] 4.6 D²NG-POMCP POMCP DNG-MCTS D²NG-POMCP width depth width depth UCT POMCP 3D MCTS DNG-MCTS D²NG-POMCP PocMan 4.7 DNG-MCTS D²NG-POMCP DNG-MCTS D²NG- POMCP MDP POMDP Thompson Thompson MDP UCT DNG-MCTS CTP RaceTrack Sailing POMDP D²NG-POMCP RockSample PocMan AEMS2 HSVI-BFS SARSOP POMCP 79

102

103 Particle Filter over Sets PFS Target Identification EM PFS PETS2009 CoBot [147, 148] CoBot [149, 150] CoBot CoBot CoBot Multi-Object Tracking MOT Tracking-by-Detection [151, 152] Object Detector Tracker [151, 153, 154] Global Nearest Neighbor GNN 81

104 [155] Joint Probabilistic Data- Association JPDA Association Probability [153] Multiple Hypothesis Tracking MHT [154] Particle Filter over Sets PFS PFS EM PFS PETS2009 PFS PFS CoBot PFS 5.2 Joint Multi-Target Probability Density JMPD [156] JMPD JMPD [157] Markov Random Field MRF Monte-Carlo Markov Chain MCMC [158] Interacting 82

105 [159] Rao-Blackwellized Random Finite Set RFS [ ] Finite Set Statistics FISST [165] FISST 5.3 PFS PFS S n S = {X i } i=1:n n S = {x i } i=1:n Pr(S) = σ A n Pr(X 1 = x σ(1), X 2 = x σ(2),..., X n = x σ(n) ) A n {i} i=1:n Pr(X 1 = x σ(1), X 2 = x σ(2),..., X n = x σ(n) ) X 1 = x σ(1) X 2 = x σ(2) X n = x σ(n). S = {x i } i=1:n x {x i } i=1:n S X {X i } i=1:n S ψ ψ : {X i } i=1:n {x i } i=1:n S O = {o i } i=1:n n k S = {o (i) } i=1:k S Pr(S) = 1 k! = 1 n(n 1) (n k+1) ( n k) ( ) n k = n! k!(n k)! X f X (x) S = {X i } i=1:n S X n S = {x i } i=1:n Pr(S) = n! 1 i n f X(x i ) S = {X i } i=1:n n S 3 {Head} {Head, Tail} {Tail} Pr({Head, Tail}) = 1/ Head Tail 83

106 5.3.2 PFS S = {s i } i=1: S s = (x, y, ẋ, ẏ) (x, y) (ẋ, ẏ) (x, y) (x, y) + (ẋ, ẏ)τ + 1(ẍ, 2 ÿ)τ2 (ẋ, ẏ) (ẋ, ẏ) + (ẍ, ÿ)τ τ (ẍ, ÿ) (ẍ, ÿ) = (p cos θ, p sin θ) p N(0, σ 2 p) Dash Power θ U(0, 2π) Dash Direction N U σ 2 p Filed of View FOV Birth-Death Process λ S µ O = {o i } i=1: O o = (x, y, c) (x, y) c [0, 1] Human Detector Support Vector Machine SVM [166] 0.5 s = (x, y, ẋ, ẏ) Pr(o s) o = (x, y, c) Pr(o s) = Pr(c 1) Pr(x, y x, y) Pr(c 1) c Pr(x, y x, y) (x, y) (x, y ) Beta Pr(c 1) = Beta(c 2, 1) Pr(x, y x, y) = N(x, y x, y, Σ) Σ False Detection Pr(o ) o = (x, y, c) Pr(o ) = Pr(c 0)f b (x, y ) Pr(c 0) c f b (x, y ) (x, y ) Beta Pr(c 0) = Beta(c 1, 2) f b F O M S Missing Detection F M 84

107 O F = S M O S = { F i, M i } i=1: O S F-M O S = ( O )( S ) ( 0 i min{ O, S } i i = O + S ) O ν S ξ τ Pr(O S) = F,M O S Pr(O F S M) (ντ) F e ντ o F P(o ) ( S ξτ) M e S ξτ M! 1 ), (5.1) ( S M Pr(O F S M) f F (F) = (ντ) F e ντ o F P(o ) f M(M) = ( S ξτ) M e S ξτ 1 Ψ O F S M M! ( M ) S S M O F Pr(O F S M) = ψ Ψ O F s S M S M Pr(ψ(s) s). (5.2) ( O )( S ) 0 i min{ O, S } i i i! = Ω(( max{ O, S } ) min{ O, S } ) e PFS m! m = S M = O F m > 2 Pr(o s) c(s, o) = log(pr(o s)) Murty [167] Murty N N in O(kN 3 ) N N k Top k [168] F-M ( ) O + S O F-M F, M f F (F)f M (M) 85
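In practice the dominant contribution to Pr(O-F | S-M) in Eq. (5.2) comes from the best assignment ψ*, which can be obtained from the cost matrix c(s, o) = -log Pr(o | s) with an optimal-assignment solver. The snippet below illustrates that single step using scipy's Hungarian-algorithm implementation; it is a simplification of, not a substitute for, the Murty-based top-k enumeration described above.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def best_assignment_loglik(log_lik):
    """Given log_lik[i, j] = log Pr(o_j | s_i) for the non-missing humans and
    non-false detections, return the assignment maximizing the total
    log-likelihood, together with that log-likelihood."""
    cost = -log_lik                        # Hungarian algorithm minimizes cost
    rows, cols = linear_sum_assignment(cost)
    return list(zip(rows, cols)), float(log_lik[rows, cols].sum())

# Example: 2 humans versus 2 detections.
ll = np.log(np.array([[0.8, 0.1],
                      [0.2, 0.7]]))
print(best_assignment_loglik(ll))
```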

108 Input: A set of detections O, and a set of humans S Output: Probability of observing O given S 1 Let Q a descending priority queue initially empty 2 Let F a list of all possible false detections F 3 Let M a list of all possible missing detections M 4 Sort F according to f F ( ) in descending order 5 Sort M according to f M ( ) in descending order 6 Add (1, 1) to Q with priority f F (F[1])f M (M[1]) 7 Let p 0 8 repeat 9 Let (i, j) Pop(Q) 10 Let q f F (F[i])f M (M[j]) 11 if F[i] = M[j] then 12 p p + q Murty(F[i], M[j]) 13 end 14 if i + 1 F then 15 Add (i + 1, j) to Q with priority f F (F[i + 1])f M (M[j]) 16 end 17 if j + 1 M then 18 Add (i, j + 1) to Q with priority f F (F[i])f M (M[j + 1]) 19 end 20 until q < threshold or Q is empty 21 return p 5.1: 5.1 Murty 5.2. f FM (i, j) = f F (F[i])f M (M[j]) k 1 k F M Q k Pop (i k, j k ) (i k, j k ) = argmax (i,j) Qk f FM (i, j) Q k+1 (i k, j k ) = Q k 1[i k +1 F ](i k +1, j k ) 1[j k +1 M ](i k, j k +1) f FM (i k +1, j k ) f FM (i k, j k ) f FM (i k, j k + 1) f FM (i k, j k ) f FM (i k+1, j k+1 ) f FM (i k, j k ) for 1 k F M F-M X = {s i } i=1: X t Pr(S t O 1, O 2,..., O t ) P t = { X (i) t, w (i) t } i=1:n N i=1 w = 1 Proposal Distribution π( X t 1, O t ) i N ˆX (i) t π( X (i) t 1, O t),

109 2. 1 i N, (a) Motion Weight m (i) t (b) Observation Weight o (i) t (c) Proposal Weight p (i) t (d) w (i) t 3. 1 i N ŵ (i) t = = w (i) t 1 w (i) t 1 j N w(j) t = Pr( ˆX (i) t X (i) t 1 ), = Pr(O t ˆX (i) t ), = π( ˆX (i) t m (i) t. o(i) t p (i) t. X (i) t 1, O t), 4. Resample 1 i N 1 i N ŵ(i) t { ˆX (i) t, ŵ (i) t } i=1:n X (i) t P t = { X (i) t, 1 } N i=1:n. δ ˆX (i) t (X (i) t ) w t w t 1 o t o = (x, y, c) Pr(1) = Pr(c 1) Pr(1) Pr(0) = 0.5 o Pr(1 c) = = c Pr(c 1) Pr(1)+Pr(c 0) Pr(0) o o s = (x, y,, ) Pr(s o) = η Pr(o s) Pr(s) = N(x, y x, y, Σ) o O c Pr(s o) s π s ( o) o = (x, y, c) s = (x, y,, ) π s ( o) = 1 c π s (s o) = cn(x, y x, y, Σ) s X o O X PFS X O O X O 3 φ = F, M, ψ F X M O ψ Ψ O F X M X M O F 5.1 Pr(O X) = φ Pr(O, φ X) φ = argmax φ Pr(O, φ X) Observation Likelihood φ = F, M, ψ F X π r ( X t 1, O t ) 87

110 1. X t Pr( X t 1 ), 2. φ = F, M, ψ given X t, 3. X = {s s π s ( o), o F }, 4. X t X t X, 5. ˆX t argmax X {X t,x t } Pr(O t X) Acceptance Test O P = {X i } i=1:n P P = {X X Pr( X), X P} P P = {X X π r ( X ), X P } [169] P P Probability Density Estimation Pr(X X) Pr(X P ) π r (X X) Pr(X P ) X X f s Pr(X P) = n! Pr( X = n P) s X f s(s P) γ γ Gamma (α 0, β 0 ) γ Gamma (α = α 0 + X P X, β = β 0 + N) Posterior Predictive Negative Binomial Distribution r = α p = 1 1+β Pr( X = n P) = NB(n; r, p) = ( ) n+r 1 n p n (1 p) r. s = (x, y, ẋ, ẏ) (ẋ, ẏ) x y Multivariate Kernel Density Estimator f s (s P) f f {x i } i=1:n f f ˆf(x {x i } i=1:n ) = 1 i n K(x x i) K( ) Kernel Function 1 n 88
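Abstracting away the set-specific details, the update described above follows the usual propose / weight / resample pattern of sequential importance resampling; a generic sketch of one such step is given below, where the callbacks propose, motion_prob, obs_prob and proposal_prob stand in for the PFS-specific functions and are placeholders of ours.

```python
import random

def pf_step(particles, obs, propose, motion_prob, obs_prob, proposal_prob):
    """One sequential importance resampling step over weighted particles.

    particles: list of (X, w) pairs, where X is a (set-valued) joint state.
    Returns an equally weighted, resampled particle set.
    """
    proposed, weights = [], []
    for X_prev, w_prev in particles:
        X_new = propose(X_prev, obs)                       # draw from the proposal
        w = w_prev * motion_prob(X_new, X_prev) * obs_prob(obs, X_new) \
            / max(proposal_prob(X_new, X_prev, obs), 1e-300)
        proposed.append(X_new)
        weights.append(w)
    total = sum(weights)
    weights = [w / total for w in weights]                 # normalize weights
    n = len(proposed)
    resampled = random.choices(proposed, weights=weights, k=n)  # multinomial resampling
    return [(X, 1.0 / n) for X in resampled]
```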

111 H(P) = {s s X, X P} P f s f s (s P) 1 H(P) s H(P) ϕ(x x )ϕ(y y ) s = (x, y,, ) ϕ H(P) H(P) P t Human Identification Identified Human Identity 3 h = (s, c, ρ) s c [0, 1] ρ ID h H(h) H(P t ) s = 1 H(h) s H(h) s c = H(h) N State Pool L t = {h i } i=1: Lt t L 0 = o O t h o H(h o ) L Ot = {h o o O t } t C t = L t 1 L Ot h C t H(h) s Labeling H(h) = {s l(s) = h, s H(P t )} 1. s H(P t )!h C t : l(s) = h 2. X P t s 1 X s 2 X s 1 s 2 l(s 1 ) l(s 2 ) f h h C t P = {f h h C t } P P = argmax P max l Pr(P t, l P) EM Maximum a Posterior MAP K- K-Means [170] E step: l (k) = argmax l Pr(P t, l P (k 1) ), M step: P (k) = argmax P Pr(P t, l (k 1) P). E f h P (k 1) l s H(P t ) f l (s)(s) X P t s X f l (s)(s) N. 0 X P 0 s X s o O 0 X O 0 C 0 = L O0 = O 0 89

112 X P 0 : X C 0 0 l 0 t T T l T L T = {l T (s) s X, X P T } X P T : X L T T + 1 C T+1 = L T L OT +1 L T L OT+1 = C T+1 = L T + L OT+1 X P T +1 X = (X _ X ) X O X _ P T X X X _ X X _ X O T + 1 O O T+1 X _ X O = X = X _ X + X O X _ L T X O = O O T+1 = L OT+1 X C T+1 X P T+1 l T+1 t 0 E l t M O t Maximal Likelihood Estimation MLE o O t H(o) H(P t ) X P t 1. X F, M, ψ = φ = argmax φ Pr(O t, φ X) 2. s X H(ψ (s)) H(ψ (s)) s H(ψ (s)) h C t f (k) h f (k) h (s) = o O t f h (s, o) + f h (s, ) = o O t Pr(s o)f h (o) + f h (s, ) = o O t 1[s H(o)]f h (o) + 1[ o : s / H(o)]f h ( ) f h (o) o h f h ( ) h f h (o) = Pr(o h) Pr(h) 1 H(o) H(h) f N h( ) = Pr( h) Pr(h) 1 N H(h) o O t H(o) H(h), P (k) = {f (k) h h C t} M l (0) P (k) l (k+1) L t L t = {l(s) s H(P t )} L t C t L t 5.2 FindBestAssignment ApproximateHuman h L t H(h) 5.4 PETS2009 PFS CoBot PFS 90

113 Input: Identities L t 1, state pools H(h) for h L t 1, particles P t, observation O t, and maximal EM steps EM Output: Identities L t, and state pools H(h) for h L t 1 Let L t, L Ot 2 foreach o O t do 3 Let H o 4 Propose h o as a potential new identity from o 5 L Ot L Ot h o 6 Let H(h o ) 7 end 8 Let C t L t 1 L Ot 9 foreach X P t do 10 Let F, M, ψ = φ = argmax φ Pr(O t, φ X) 11 H(ψ (s)) H(ψ (s)) s for each s X 12 end 13 foreach h L t 1 do 14 H(h) H(h) taking account particle filtering from P t 1 to P t 15 end 16 foreach s H(P t ) do 17 if h C t : s H(h) then 18 l(s) h 19 end 20 end 21 Let n 0 22 repeat 23 foreach X P t do 24 n n Let converged T rue 26 Let c(s, h) log(f h (s)) for s X, h C t 27 Let l FindBestAssignment(c) 28 foreach s X do 29 if l(s) l (s) then 30 converged False 31 l(s) l (s) 32 end 33 end 34 foreach h C t do 35 H(h) = {s l(s) = h, s H(P t )} 36 end 37 end 38 until converged = T rue or n > EM 39 L t L t l(s) for s H(P t ) 40 h ApproximateHuman(H(h)) for each h L t 41 return L t, {H(h) h L t } 5.2: Human Identification 91

114 Avg. Relative error (%) Avg. Relative Error Avg. Time Usage Avg. Time Usage (µs) Avg. Relative error (%) Avg. Relative Error Avg. Time Usage Avg. Time Usage (ms) e Assignments pruning ratio threshold e False-missing pruning threshold (a) (b) T a = 0.1 T fm = ± ± ± ± % 97.95% 0.026% 3.30% λ s = 0.06/s µ s = 0.02 ν s = 0.5 ξ s = v v v v T v T T T PFS (a) T T 1 2 2% T 0.1 T PFS 5.1(b) T

115 5.2 PFS Parameter PETS2009 Real Robot λ (1/s) µ (1/s) σ p (m 2 /s) ν (1/s) ξ (1/s) τ (s) T T Σ 0.5I 0.3I α 0 Gamma α β 0 Gamma β A (m 2 ) A (m 2 ) R N H EM PETS2009 [47] S2L1 PFS 7fps [171] m Bounding Box * 0 PFS CLEAR MOT [177] PFS 1 Multiple Object Tracking Accuracy MOTA False Positive False Negative ID Identity Switch Multiple Object Tracking Precision MOTP d (1 d) 100% n t t * [172] 93

Table 5.3: Quantitative results on the PETS2009 S2L1 dataset

Algorithm                     MOTA    MOTP    IDS   MT   FM
PFS^1 (proposed)              93.1%   76.1%
PFS^1,2 (proposed)            90.6%   74.5%
Milan [172]                   90.6%   80.2%
Milan et al. [173]            90.3%   74.3%
Segal et al. [174]            92%     75%
Segal et al. [174]^2          90%     75%
Zamir et al.^2 [175]          90.3%   69.0%
Andriyenko et al. [171]       81.4%   76.1%
Breitenstein et al. [176]^2   56.3%   79.7%

^1 Average over repeated runs.
^2 Evaluation over the whole range.

Figure 5.2: Example tracking results of the PFS algorithm on the PETS2009 S2L1 dataset.

Here n_t is the number of matches found by solving the optimal assignment problem at frame t: a ground-truth state and an estimated state are considered matched if the distance between them is less than 1 meter. Let d_t^(i) be the distance between a ground-truth state and the estimated state matched to it; MOTP is then computed as

MOTP = (1 - (sum_t sum_{1<=i<=n_t} d_t^(i)) / (sum_t n_t)) * 100%.  (5.3)

(a) Hardware configuration of CoBot. (b) A deployed CoBot service robot.
Figure 5.3: The CoBot experimental platform (images courtesy of the CORAL research group).

Let g_t be the true number of actual targets, a_t the number of estimated targets reported by the algorithm, and m_t the number of ID-switch errors; MOTA is then defined as

MOTA = (1 - sum_t (g_t + a_t - 2 n_t + m_t) / sum_t g_t) * 100%.  (5.4)

MOTP reflects the algorithm's ability to estimate target states precisely, while MOTA reflects its ability to successfully track targets and maintain their trajectories. In addition, this section also reports several metrics proposed in [178], including the number of mostly tracked (MT) targets, the number of track fragmentations (FM), and the number of ID switches. A target is called mostly tracked if it is successfully tracked for at least 80% of its lifetime; the number of track fragmentations counts how many times a target's ground-truth trajectory switches between being tracked and not being tracked. Table 5.3 presents the main experimental results, and Figure 5.2 shows some tracking examples, in which the white bounding boxes are the raw detections, and the estimated target trajectories and current particle states are drawn in different colors to indicate that they belong to different individuals.*

For comparison, [176] tracks each target independently within a particle filtering framework using greedy data association, and can be regarded as a good baseline for PFS; [174] models the multi-

* A complete video of the full experimental results is available at
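Returning to Eqs. (5.3) and (5.4), the two CLEAR MOT scores can be computed from simple per-frame counts; the small helper below is our own illustration of that arithmetic (the frame dictionary layout is an assumption, not a format used by the thesis).

```python
def clear_mot(frames):
    """Compute CLEAR MOT metrics from per-frame statistics.

    Each frame is a dict with:
      g - number of ground-truth targets,
      a - number of hypotheses reported by the tracker,
      n - number of matched target/hypothesis pairs (distance < 1 m),
      m - number of ID switches,
      d - sum of distances over the n matched pairs.
    """
    total_g = sum(f["g"] for f in frames)
    total_n = sum(f["n"] for f in frames)
    total_d = sum(f["d"] for f in frames)
    # Per-frame errors: misses (g - n), false positives (a - n), ID switches m,
    # i.e. exactly g + a - 2n + m as in Eq. (5.4).
    total_err = sum(f["g"] + f["a"] - 2 * f["n"] + f["m"] for f in frames)
    mota = (1.0 - total_err / total_g) * 100.0   # Eq. (5.4)
    motp = (1.0 - total_d / total_n) * 100.0     # Eq. (5.3)
    return mota, motp
```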

118 5.4 CoBot I Switch Linear Dynamical System Outlier [175] Generalized Minimum Clique Graph [171] [172] [173] Energy Function Spline Extended Kalman Filter EKF PFS PFS CoBot PFS CoBot CoBot CoBot Carnegie Mellon University Manuela M. Veloso CORAL Cooperate, Observe, Reason, Act, and Learn 96

119 5.5 CoBot II [149, 150] CoBot-1 CoBot-2 CoBot-3 CoBot-4 CoBot-2 5.3(a) CoBot 4 PTZ Microsoft Kinect Hokuyo CoBot CoBot CoBot-4 Kinect CoBot CoBot-2 CoBot-2 5.3(b) CoBot-2 PFS CoBot-2 Kinect 30Hz Histogram of Oriented 97

120 5.6 PFS CoBot Depth HOD 10Hz [179] 2.7GHz CPU 4GB Linux 3.5 PFS X 1m 1m * 5.5 PFS PFS * 98
