CS 4649/7649 Robot Intelligence: Planning


CS 4649/7649 Robot Intelligence: Planning
Partially Observable MDP
Sungmoon Joo, School of Interactive Computing, College of Computing, Georgia Institute of Technology
Some slides adapted from Dr. Mike Stilman's lecture slides

Administrative
Three lectures left:
- Nov. 25th: POMDP and summary of planning under uncertainty
- Dec. 2nd: Extensions of planning/control: language, hybrid systems
- Dec. 4th: Wrap-up
Due reminders:
- Project report: due Dec. 4th
- Project report review: due Dec. 11th
- Project presentation & presentation evaluation: Dec. 11th

Reality: Two Sources of Error
Sensing & state-estimation uncertainty:
- Sensors have noise.
- You don't know exactly what the state is (e.g. mapping, localization, ...).
Action-execution uncertainty:
- Your actuators do not do what you tell them to.
- The system responds differently than you expect: friction in gears, air resistance, etc.

Reality
[Slide 4: block diagram; the true state is only available as an estimated state, with uncertainty entering on both the sensing and the action side.]

Reality
[Slide 5: block diagram; state estimation turns the real state into an estimated state (with uncertainty), and an MDP planner maps it to a plan / (control) policy (with uncertainty).]

Reality
[Slide 6: block diagram; the POMDP covers both the state-estimation block and the MDP block.]

Markov Decision Process (MDP): Mathematical Frameworks

                            No observation uncertainty        Observation uncertainty
No action uncertainty       Markov Chain (Markov property)    Hidden Markov Model (HMM)
Action uncertainty          Markov Decision Process (MDP)     Partially Observable MDP (POMDP): both

POMDP vs. MDP
An MDP already models uncertainty about action outcomes; a POMDP adds uncertainty about the state due to imperfect observation. You don't get to observe the state itself; instead you get sensory measurements.
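For reference, the POMDP framework is standardly formalized as a tuple (textbook notation; the deck's own symbols are not in the transcription):

$$ \langle S, A, T, R, \Omega, O \rangle $$

with transition model $T(s' \mid s, a)$ capturing action uncertainty, observation model $O(o \mid s', a)$ capturing observation uncertainty, and reward $R(s, a)$ from the underlying MDP.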

State Estimation
A state estimator maintains a belief state (e.g. via a Kalman filter).

Belief States: Example
Kalman filter: a Gaussian (mean & covariance), i.e. a continuous belief state. A minimal filtering sketch follows.
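As an illustration of a continuous belief state, here is a minimal one-dimensional Kalman filter sketch; the motion model, noise variances, and measurement are assumptions for the example, not values from the lecture:

```python
# A minimal 1-D Kalman filter: the belief state is a Gaussian,
# carried as (mean, variance). Motion model, noise levels, and the
# measurement below are illustrative assumptions.

def kf_predict(mean, var, u, process_var):
    """Action update: push the belief through the motion model x' = x + u."""
    return mean + u, var + process_var

def kf_update(mean, var, z, meas_var):
    """Observation update: fuse a noisy measurement z of the state."""
    k = var / (var + meas_var)              # Kalman gain
    return mean + k * (z - mean), (1.0 - k) * var

mean, var = 0.0, 1.0                                        # initial belief
mean, var = kf_predict(mean, var, u=1.0, process_var=0.5)   # after acting
mean, var = kf_update(mean, var, z=1.2, meas_var=0.4)       # after sensing
print(mean, var)                                            # posterior belief
```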

POMDP
[Slide 11: block diagram; the belief state (with uncertainty) takes the place of the true state.]

POMDP vs. MDP
[Slide 12: block diagram relating the POMDP, its observations o, and the underlying MDP; the annotations did not survive the transcription.]

POMDP
A POMDP maintains probability distributions over the states of the underlying MDP (i.e. belief states):
- The agent keeps an internal belief state, b, that summarizes its experience (the observation and control-input history).
- The agent uses a state estimator, SE, to update the belief state from the last action a_{t-1}, the current observation o_t, and the previous belief state b_{t-1}.

Converting POMDP to Belief-States MDP
The belief-state MDP's state is the current belief distribution; its dynamics are driven by the previous action and the current observation. The update equation is written out below.
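In standard notation (assumed here, since the slide's symbols did not survive transcription), the state estimator is Bayes' rule over the hidden state:

$$ b_t(s') = SE(b_{t-1}, a_{t-1}, o_t) = \frac{O(o_t \mid s', a_{t-1}) \sum_{s \in S} T(s' \mid s, a_{t-1})\, b_{t-1}(s)}{P(o_t \mid a_{t-1}, b_{t-1})} $$

where the denominator is the normalizing factor $P(o_t \mid a_{t-1}, b_{t-1}) = \sum_{s'} O(o_t \mid s', a_{t-1}) \sum_{s} T(s' \mid s, a_{t-1})\, b_{t-1}(s)$.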

Converting POMDP to Belief-States MDP
[Slides 15-16: a worked two-state example (s1, s2) showing how the new belief over s1 and s2 is computed, with the probability of the observation appearing as the normalizing factor. The equations did not survive the transcription.]

Total Probability
If {B_n : n = 1, 2, 3, ...} is a finite or countably infinite partition of a sample space, and each event B_n is measurable, then for any event A of the same probability space:

$$ P(A) = \sum_n P(A \mid B_n)\, P(B_n) $$

Total probability can also be stated for conditional probabilities. Taking the B_n as above, and assuming C is an event independent of every B_n:

$$ P(A \mid C) = \sum_n P(A \mid C, B_n)\, P(B_n) $$

Converting POMDP to Belief-States MDP
[Slide 18: the two-state example continued; equations not recoverable from the transcription.]

Converting POMDP to Belief-States MDP
[Slide 19: the two-state example assembled into the full state-estimation update; equations not recoverable.]

POMDP to MDP: Action Update
[Slide 20: the prediction step of the belief update; written out below in standard notation.]
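Assuming standard notation (the slide's own equation is lost), the action update pushes the belief through the transition model by total probability:

$$ b_a(s') = \sum_{s \in S} T(s' \mid s, a)\, b(s) $$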

POMDP to MDP: Observation Update
[Slide 21: the correction step of the belief update; written out below in standard notation.]
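Again in assumed standard notation, the observation update reweights the predicted belief by the observation likelihood and renormalizes:

$$ b'(s') = \frac{O(o \mid s', a)\, b_a(s')}{\sum_{s''} O(o \mid s'', a)\, b_a(s'')} $$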

POMDP Example
[Slides 22-30 work a small example numerically: starting from an initial belief, the belief is updated conditioned on action A1 and observation O2, with the total-probability (T.P.) theorem supplying the normalizer. The numbers are in figures that did not survive the transcription; an illustrative reconstruction follows.]
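Since the slides' numbers are lost, here is an illustrative two-state belief update conditioned on an action A1 and an observation O2; all probabilities below are made up for the example:

```python
import numpy as np

# Illustrative two-state belief update, conditioned on taking action A1
# and then observing O2. All probabilities are invented; the lecture's
# actual numbers are not in the transcription.

b = np.array([0.5, 0.5])            # prior belief over (s1, s2)

T = np.array([[0.9, 0.1],           # T[s, s'] = P(s' | s, A1)
              [0.2, 0.8]])

O = np.array([0.3, 0.7])            # O[s'] = P(O2 | s', A1)

b_pred = b @ T                      # action update (total probability)
b_post = O * b_pred                 # observation update (Bayes numerator)
b_post /= b_post.sum()              # normalizing factor P(O2 | A1, b)

print(b_pred)                       # [0.55 0.45]
print(b_post)                       # [0.34375 0.65625]
```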

How to Solve Belief-State MDP?
[Slide 31: figure only.]

Solving a POMDP
[Slides 32-33: figures only.]

Solving a POMDP: Step 1
[Slides 34-36: figures only.]

Solving a POMDP: Step 2
[Slides 37-41: figures only.]

POMDP in Higher Dimensions: Hyperplanes
[Slide 42: figure only; a sketch of the hyperplane reading follows.]

POMDP Summary
A complex but powerful technique:
- The state space explodes upon conversion to an MDP.
- The state becomes difficult to interpret upon conversion to an MDP.
- It is a uniquely cohesive method that trades off the value of ascertaining the state against the value of pursuing the goal.
More efficient algorithms exist:
- Witness algorithm (Littman '94)
- Policy iteration (Sondik; Hansen '97)
Complexity is typically still prohibitive for large problems.

POMDP Summary
Canonical solution method 1 (covered today):
- Run value iteration, but now the state space is the space of probability distributions.
- Yields a value and an optimal action for every possible probability distribution.
- Automatically trades off information-gathering actions against actions that affect the underlying state.
Canonical solution method 2 (finite-horizon / MPC-style):
- Search over sequences of actions with limited look-ahead.
- Branch over actions and observations.
Canonical solution method 3 (LQG-style; a minimal sketch follows):
- Plan in the underlying MDP.
- Run probabilistic inference (filtering) to track the probability distribution.
- Choose the optimal MDP action for what is currently the most likely state.
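Method 3 is easy to make concrete. A minimal sketch under assumed models; the policy, transition matrix, and observation likelihoods below are illustrative stand-ins, not code from the lecture:

```python
import numpy as np

# Canonical method 3: plan in the underlying MDP offline, filter the
# belief online, and act as if the most likely state were true.

mdp_policy = {0: "a1", 1: "a2"}     # precomputed MDP policy, one action per state

def filter_step(b, T, likelihood):
    """One filtering step: action update, then observation update."""
    b = b @ T                       # prediction through P(s' | s, a)
    b = likelihood * b              # weight by P(o | s')
    return b / b.sum()

def act(b):
    """MDP-optimal action for the currently most likely state."""
    return mdp_policy[int(np.argmax(b))]

b = np.array([0.5, 0.5])
T = np.array([[0.9, 0.1], [0.2, 0.8]])
b = filter_step(b, T, likelihood=np.array([0.3, 0.7]))
print(act(b))                       # "a2": s2 is now more likely
```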

Active Monocular SLAM Example
The robot's trajectory matters! The control objective (reach the goal while avoiding the obstacle) trades off against probing (moving so as to reduce uncertainty): the dual-control trade-off.

Active Monocular SLAM Example
[Slide 46: scenario figure.]

Active Monocular SLAM Example
Sungmoon Joo, "SLAM-based nonlinear optimal control approach to robot navigation with limited resources"

Active Monocular SLAM Example
[Slide 48: results figure.]