In memory of Dr. Kevin P. Granata, my graduate advisor, who was killed protecting others on the morning of April 16, 2007.

Similar documents
OPTIMAL TRAJECTORY GENERATION OF COMPASS-GAIT BIPED BASED ON PASSIVE DYNAMIC WALKING

Biomechanics and Models of Locomotion

Sample Solution for Problem 1.a

Using GPOPS-II to optimize sum of squared torques of a double pendulum as a prosthesis leg. Abstract

Walking with coffee: when and why coffee spills

From Passive to Active Dynamic 3D Bipedal Walking - An Evolutionary Approach -

LOCOMOTION CONTROL CYCLES ADAPTED FOR DISABILITIES IN HEXAPOD ROBOTS

Toward a Human-like Biped Robot with Compliant Legs

Humanoid Robots and biped locomotion. Contact: Egidio Falotico

Motion Control of a Bipedal Walking Robot


Body Stabilization of PDW toward Humanoid Walking

John Sushko. Major Professor: Kyle B. Reed, Ph.D. Rajiv Dubey, Ph.D. Jose Porteiro, Ph.D. Date of Approval: October 21, 2011

ABSTRACT PATTERNS USING 3D-DYNAMIC MODELING. Kaustav Nandy, Master of Science, Department of Electrical And Computer Engineering

An investigation of kinematic and kinetic variables for the description of prosthetic gait using the ENOCH system

Asymmetric Passive Dynamic Walker

Toward a Human-like Biped Robot with Compliant Legs

Dynamically stepping over large obstacle utilizing PSO optimization in the B4LC system

Effects of Ankle Stiffness on Gait Selection of Dynamic Bipedal Walking with Flat Feet

STABILITY AND CHAOS IN PASSIVE-DYNAMIC LOCOMOTION

Biomechanical analysis of the medalists in the 10,000 metres at the 2007 World Championships in Athletics

+ t1 t2 moment-time curves

EXSC 408L Fall '03 Problem Set #2 Linear Motion. Linear Motion

Evolving Gaits for the Lynxmotion Hexapod II Robot

ARTIFICIAL NEURAL NETWORK BASED DESIGN FOR DUAL LATERAL WELL APPLICATIONS

Human Pose Tracking III: Dynamics. David Fleet University of Toronto

Analysis and Research of Mooring System. Jiahui Fan*

Limit Cycle Walking and Running of Biped Robots

by Michael Young Human Performance Consulting

Data Driven Computational Model for Bipedal Walking and Push Recovery

Gait Analysis of Wittenberg s Women s Basketball Team: The Relationship between Shoulder Movement and Injuries

INSTRUMENT INSTRUMENTAL ERROR (of full scale) INSTRUMENTAL RESOLUTION. Tutorial simulation. Tutorial simulation

Waves. harmonic wave wave equation one dimensional wave equation principle of wave fronts plane waves law of reflection

UNIT IV: SOUND AND LIGHT Chapter 25-31

Emergent walking stop using 3-D ZMP modification criteria map for humanoid robot

Walking and Running BACKGROUND REVIEW. Planar Pendulum. BIO-39 October 30, From Oct. 25, Equation of motion (for small θ) Solution is

BUILDING A BETTER PASSIVE WALKER

EVOLVING HEXAPOD GAITS USING A CYCLIC GENETIC ALGORITHM

,WHUDWLYH3URGXFW(QJLQHHULQJ(YROXWLRQDU\5RERW'HVLJQ

ABSTRACT 1 INTRODUCTION

Controlling Walking Behavior of Passive Dynamic Walker utilizing Passive Joint Compliance

Numerical study on the wrist action during the golf downswing

Kungl Tekniska Högskolan

Centre for Autonomous Systems

YAN GU. Assistant Professor, University of Massachusetts Lowell. Frederick N. Andrews Fellowship, Graduate School, Purdue University ( )

Joint Torque Evaluation of Lower Limbs in Bicycle Pedaling

Journal of Chemical and Pharmaceutical Research, 2016, 8(6): Research Article. Walking Robot Stability Based on Inverted Pendulum Model

The Incremental Evolution of Gaits for Hexapod Robots

Current issues regarding induced acceleration analysis of walking using the integration method to decompose the GRF

An Evolved Neural Controller for Bipedal Walking with Dynamic Balance

Compliance Control for Biped Walking on Rough Terrain

intended velocity ( u k arm movements

Decentralized Autonomous Control of a Myriapod Locomotion Robot

Available online at Prediction of energy efficient pedal forces in cycling using musculoskeletal simulation models

Kinematic Differences between Set- and Jump-Shot Motions in Basketball

Episode 320: Superposition

Coaching the Triple Jump Boo Schexnayder

Autonomous blimp control with reinforcement learning

ZMP Trajectory Generation for Reduced Trunk Motions of Biped Robots

Putting Report Details: Key and Diagrams: This section provides a visual diagram of the. information is saved in the client s database

Neural Networks II. Chen Gao. Virginia Tech Spring 2019 ECE-5424G / CS-5824

Discovering Special Triangles Learning Task

Mecánica de Sistemas Multicuerpo:

Investigation of Suction Process of Scroll Compressors

Aerodynamic study of a cyclist s moving legs using an innovative approach

Swing leg retraction helps biped walking stability

Preview. Vibrations and Waves Section 1. Section 1 Simple Harmonic Motion. Section 2 Measuring Simple Harmonic Motion. Section 3 Properties of Waves

Supplementary Figure 1 An insect model based on Drosophila melanogaster. (a)

ROSE-HULMAN INSTITUTE OF TECHNOLOGY Department of Mechanical Engineering. Mini-project 3 Tennis ball launcher

Effects of a Passive Dynamic Walker s Mechanical Parameters on Foot- Ground Clearance

Dynamic Lateral Stability for an Energy Efficient Gait

Energetic Consequences of Walking Like an Inverted Pendulum: Step-to-Step Transitions

ON PASSIVE MOTION OF THE ARMS FOR A WALKING PLANAR BIPED

@ Chee-Meng Chew, MM. All rights reserved.

FORECASTING OF ROLLING MOTION OF SMALL FISHING VESSELS UNDER FISHING OPERATION APPLYING A NON-DETERMINISTIC METHOD

Open Research Online The Open University s repository of research publications and other research outputs

LOCAL STABILITY ANALYSIS OF PASSIVE DYNAMIC BIPEDALROBOT

CHAPTER IV FINITE ELEMENT ANALYSIS OF THE KNEE JOINT WITHOUT A MEDICAL IMPLANT

ITF Coaches Education Programme Biomechanics of the forehand stroke

A NEW GOLF-SWING ROBOT MODEL UTILIZING SHAFT ELASTICITY

Using sensory feedback to improve locomotion performance of the salamander robot in different environments

Artifacts Due to Filtering Mismatch in Drop Landing Moment Data

Reliable Real-Time Recognition of Motion Related Human Activities using MEMS Inertial Sensors

Wave Motion. interference destructive interferecne constructive interference in phase. out of phase standing wave antinodes resonant frequencies

The Starting Point. Prosthetic Alignment in the Transtibial Amputee. Outline. COM Motion in the Coronal Plane

Computer Simulation of Semi-Passive Walking Robot with Four Legs and Verification of It s Validity Using Developed Experimental Robot

Posture influences ground reaction force: implications for crouch gait

PERCEPTIVE ROBOT MOVING IN 3D WORLD. D.E- Okhotsimsky, A.K. Platonov USSR

Controlling Velocity In Bipedal Walking: A Dynamic Programming Approach

Energetics of Actively Powered Locomotion Using the Simplest Walking Model

Impact Points and Their Effect on Trajectory in Soccer

Characteristics of ball impact on curve shot in soccer

Robots With Legs. Helge Wrede

Machine Optimisation of Dynamic Gait Parameters for Bipedal Walking

CAM Final Report John Scheele Advisor: Paul Ohmann I. Introduction

A Bio-inspired Behavior Based Bipedal Locomotion Control B4LC Method for Bipedal Upslope Walking

Robotics and Autonomous Systems

Gait Evolution for a Hexapod Robot

Toward a human-like biped robot with compliant legs

Foot Placement in the Simplest Slope Walker Reveals a Wide Range of Walking Solutions

Transcription:

Acknowledgement In memory of Dr. Kevin P. Granata, my graduate advisor, who was killed protecting others on the morning of April 16, 2007. There are many others without whom I could not have completed my thesis. I wish to thank: My family, for their love and support. Kelli Sparks, for being with me through the hardest of times, giving me a place to stay, and helping me keep my sanity. My graduate committee, especially Dr. Mary Kasarda who took the time to edit this thesis. Dr. Jeffery Holland and Dr. Naira Hovakimyan for externally reviewing my thesis. Shimon Whiteson for his helpful correspondence on the NEAT+Q algorithm. The Kevin P. Granata Musculoskeletal Biomechanics Lab, especially Brad Davidson and Martin Tanaka for their helpful discussions spanning many of the topics covered in this thesis. Dr. Jim Shine for taking the time to help me with neural networks and back-propagation. Dr. Ramu Krishna for providing information on neural networks. My friend Ross Alameddine (who was also lost on April 16), for volunteering to edit my thesis. iii

Table of Contents 1 Introduction... 1 1.1 Motivation & Significance... 1 1.2 Hypotheses & Specific Aims... 2 1.3 Project Scope... 3 1.4 Overview... 4 2 Literature Review & Interpretation, Mathematical basis... 6 2.1 Literature Review & Interpretation... 6 2.1.1 Bipedal Robots... 6 2.1.1.1 Active Static Walking... 6 2.1.1.2 Passive Dynamic Walking... 6 2.1.1.3 Active Dynamic Walking... 7 2.1.2 Machine Learning... 8 2.1.2.1 Markov Decision Process... 9 2.1.2.2 Reinforcement Learning... 9 2.1.2.3 Q-Learning... 11 2.1.2.4 Neural Networks... 13 2.1.2.5 Neuro-Evolution of Augmenting Topologies... 18 2.2 Mixed Differential-Algebraic Equations of Motion... 22 3 Pendulum Swing-up/Balance Task using Q-Learning... 26 3.1 Introduction... 26 3.2 Methods... 27 3.2.1 Pendulum Dynamics... 27 3.2.1.1 Q-Learning... 29 iv

3.3 Results and Discussion... 30 3.4 Conclusion... 34 4 Pendulum Swing-up/Balance Task using NEAT+Q... 36 4.1 Introduction... 36 4.2 Methods... 38 4.2.1 Pendulum Dynamics... 38 4.2.2 Evolutionary Function Approximation for Reinforcement Learning... 39 4.3 Results and Discussion... 40 4.4 Conclusion... 45 5 Active Dynamic Bipedal Walking with NEAT+Q... 46 5.1 Introduction... 46 5.2 Methods... 47 5.2.1 Walker Dynamics... 47 5.2.2 Evolutionary Function Approximation for Reinforcement Learning... 50 5.3 Results and Discussion... 51 5.4 Conclusion... 56 6 Summary, Interpretation, Future work... 57 6.1 Summary... 57 6.2 Interpretation... 58 6.3 Future Work... 58 7 Appendices:... 60 7.1.1 Complete set of data tables... 60 7.1.2 Computer code... 61 7.1.2.1 C++ Table-Based Q-Learning Program Code:... 61 7.1.2.2 MATLAB NEAT+Q Main File:... 67 v

7.1.2.3 MATLAB NEAT+Q Walker Experiment File:... 72 7.1.3 References... 80 vi

List of Figures Figure 2.1. Q-learning: An off-policy TD control algorithm. From Sutton et al.[47].... 11 Figure 2.2. Visual representation of how the SxA Q-value matrix (a) composes the policy (b). The states shown (, ) represent position and velocity respectively. Each matrix of (a) corresponds to one of three control actions: top = +20 Nm, middle = 0 Nm, bottom = -20 Nm. The policy, represented as a combined matrix (b) shows which actions are favored in which states.... 12 Figure 2.3. Diagram of an artificial node y. Inputs are denoted by x i, and the bias unit is represented by input b j.... 14 Figure 2.4. Plot of the sigmoid function, and the derivative of the sigmoid function over the interval [-8, 8].... 15 Figure 2.5. Simple example of how NEAT evolves more complicated structure. A node is added by splitting a connection in two, a link is added by adding a connection between nodes (bold line) that did not previously exist. From Whiteson et al. [59].... 18 Figure 2.6. Genetic cross-over involves matching up the genomes of both parents. Innovation numbers are shown at the top of each gene; connections are represented in the middle. Non-matching genes are either disjoint or excess, depending on whether their innovation number is larger or smaller than the other parent s largest innovation number. If a gene is disabled in either parent, there is a chance it will be disabled in the offspring as well. From Stanley et al. [44].... 19 Figure 2.7. NEAT+Q Algorithm designed to evolve an ANN function approximator for Q-learning. From Whiteson et al. [59].... 21 Figure 3.1. Diagram of single pendulum (mass, m = 10 kg, length, L =2 m, length scalar d = 0.5).... 27 Figure 3.2. Q-learning: An off-policy TD control algorithm. From Sutton et al. [47].... 29 Figure 3.3. (Color Plot) Optimized Value Function: maximum Q-value of each action over state-space.. 31 Figure 3.4. (Color Plot) The policy determined by selecting the best action in each state. The policy appears to be converging on a radially-symmetric solution. The state-space behavior of the pendulum (white) as it swings-up and balances.... 32 Figure 3.5. Periodic behavior of the pendulum as it oscillates near the vertical.... 33 Figure 3.6. (Color Plot) De-parameterized pendulum behavior. Where ω is the angular frequency of the pendulum defined by equation 3.5.... 34 Figure 4.1. Diagram of single pendulum (mass, m = 10 kg, length, L =2 m, length scalar d = 0.5).... 38 vii

Figure 4.2. (color plot) Evolved neural network structure for the pendulum swing-up/balance task. The connection weights ranged from red excitory (positive) connections to blue inhibitory (negative) connections with values ranging from 8 to + 8 respectively. Node 1 represents position, node 2 represents velocity, node 3 is the bias node, and nodes 4-6 are the Q-values corresponding to +20, 0, and -20 Nm.... 41 Figure 4.3. (Color Plot) Evolved Value Function: maximum Q-value of each action over state-space.... 42 Figure 4.4. (Color Plot) The policy determined by selecting the best action in each state. The white line represents the state-space behavior of the pendulum. The policy with NEAT+Q appears to only consider one of the two solutions that the look-up table method found.... 43 Figure 4.5. Periodic behavior of the pendulum as it oscillates about (0, π).... 44 Figure 4.6. (Color Plot) De-parameterized pendulum behavior. Where ω is the angular frequency of the pendulum defined by equation 4.5.... 45 Figure 5.1. Diagram of the three-segment bipedal walker. The square denotes the revolute ankle joint of the stance leg fixed to the coordinate plane. The joint at the hip is also revolute.... 48 Figure 5.2. (color plot) Evolved ANN topology for the active dynamic walking task. The network is composed of input nodes (1-8), a bias node (9), output nodes (10-15), and a hidden node (110). 1-3 represent stance, swing, and torso angle; 4-6 represent stance, swing, and torso angular velocity; 7-8 represent torso and swing leg torque; 10-15 represent a Q-value corresponding to each action respectively (Table 5.1). The connection weights ranged from blue inhibitory (negative) connections to red excitory (positive) connections with values of 8 to + 8 respectively.... 52 Figure 5.3. Alternative walking solutions. Both networks are more complex with worse behavior than the solution presented previously.... 53 Figure 5.4. Filtered control torques, normalized by the maximum available torque. Maximum torque values were 200 Nm and 30 Nm, respectively. (a) Torso torque (blue) spikes during foot-strike and decreases throughout the swing phase. Swing leg torque (green) is largest right after foot-strike and tends to decrease by the end of the swing phase. (b) More detail can be seen by focusing in on a shorter time period.... 53 Figure 5.5. (Color Plot) (a) Stance leg (red), swing leg (green), and torso (blue) angles over time. Discontinuities in leg angles exist at foot-strike when the stance leg becomes the swing leg, and viceversa. (b) Focusing on the first few seconds reveals how chaotic the gait was in the first few steps. (c) Several seconds later the gait appears to become more uniform, and certain patterns emerge.... 54 Figure 5.6. Results of a passive downhill compass-gait walker with a constrained torso. From Wisse [64].... 55 viii

Figure 5.7. Comparison between the mid-point of the stance and swing leg angles (maroon) and the torso angle (blue). Foot-strike events are indicated with orange diamonds.... 56 ix

List of Tables Table 4.1. An approximate number of elements composing a value function with 3 actions. White rows utilize the multi-dimensional matrix representation, while gray* rows utilize the equivalent neural net function approximator (note: table assumes a fully connected network with no hidden nodes).... 37 Table 5.1. Each action represents two binary inputs (torque recruitment) to two first-order torque filters.... 51 Table 7.1. Anthropomorphic data used for three-segment active dynamic bipedal walker. From de Leva [10].... 60 x