A (Biased) Survey Of (Robot) Walking Controllers

Chris Atkeson, CMU
Dynamic Walking 7
www.cs.cmu.edu/~cga/dw7

Andy's Questions For Discussion Over Beer
- What are ways to think about and implement (robot) walking control?
- Pros and cons; computational demands for design and execution.
- Trajectories, use of feedback, policies, state machines
- Manual design, pole placement, model matching, optimization, N-step lookahead
- Calibration, tuning, adaptation, learning
- What has worked and why?
- Which are connected to biology? Clocked state machines and CPGs
- Which are connected to animation and videogames?

Vocabulary
- State: x or s (desired state: x_d)
- Action/command/control: u or a
- Dynamics: x_{k+1} = f(x_k, u_k)
- Policy/control law: u(x) or a(s)
- Regulator: u = -K (x - x_d)
- Trajectory: x(t), u(t)
- Trajectory/tracking controller: u(t) = -K(t) (x - x_d(t)) + u_ff(t)
- One-step cost function: c(x, u)
- Total cost: C() = c_T(x_T) + Σ_k c(x_k, u_k)
- Value function: V(x) = c(x, u(x))

More Vocabulary
- For periodic systems it is helpful to use a periodic phase variable φ instead of time t.
- Trajectory: x(φ), u(φ)
- Trajectory/tracking controller: u(φ) = -K(φ) (x - x_d(φ)) + u_ff(φ)
- It is useful to be able to phase-lock or reset the phase of x_d, u_ff, and K to the actual phase of the system.

Passive Dynamic Walking
- No control

Questions For Beer Discussion
- Any PDW robot with a large basin of attraction? That is easy to build and maintain? Why not?
- Not human-like?
  - Anatomy: big feet, torso
  - Kinematics: step time too long
  - Ground forces
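As a concrete reading of the Vocabulary slide, the sketch below rolls out a regulator on a toy discrete-time plant and accumulates the total cost. Everything specific here is an assumption made for illustration: the double-integrator dynamics, the gain values, the quadratic cost weights, and the sign convention u = -K (x - x_d).

```python
def f(x, u, dt=0.01):
    """Assumed toy dynamics x_{k+1} = f(x_k, u_k): unit-mass double integrator."""
    pos, vel = x
    return (pos + dt * vel, vel + dt * u)


def regulator(x, x_d, K=(100.0, 20.0)):
    """Regulator u = -K (x - x_d) with a hand-picked gain row K."""
    return -sum(k * (xi - xdi) for k, xi, xdi in zip(K, x, x_d))


def c(x, u, x_d):
    """One-step cost c(x, u): quadratic in state error and command (assumed)."""
    return sum((xi - xdi) ** 2 for xi, xdi in zip(x, x_d)) + 1e-4 * u ** 2


def rollout(x0, x_d, steps=1000):
    """Return the final state and the total cost c_T(x_T) + sum_k c(x_k, u_k)."""
    x, total = x0, 0.0
    for _ in range(steps):
        u = regulator(x, x_d)
        total += c(x, u, x_d)
        x = f(x, u)
    total += c(x, 0.0, x_d)  # terminal cost c_T(x_T), here the same quadratic
    return x, total


x_T, C = rollout(x0=(1.0, 0.0), x_d=(0.0, 0.0))
```

Swapping `regulator` for any other function of the state gives a different policy u(x) evaluated under the same cost.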

Active Passive Dynamic Walking
- Feedback-triggered addition of energy
- MIT: Ankle extension.
- Cornell: Ankle extension.
- Delft: Hip spring swings leg forward.
- The amount of energy added is state dependent, but the control action by the computer is a fixed action pattern.

The reflexive neuronal controller of Runbot
www.cn.stir.ac.uk/~tgeng

Reflex Chain
Runbot: www.cn.stir.ac.uk/~tgeng

Open Loop Command Trajectories
- u = u_ff(t) applied using a clock.
- Clock reset by sensory events (touchdown)
- Clock rate affected by feedback?
- Behavioral phase (ankle angle) instead of clock?
- Beer Discussion: Are these CPGs?

Trajectories With Feedback
- u = -K(t) (x - x_d(t)) + u_ff(t)
- K often constant; independent joint control (k_i for each joint) vs. fully coupled control.
- u_ff (feedforward command) often not used.
- Clock t can be reset and its rate adjusted by feedback.
- Behavioral phase can be used instead.

Trajectory Learning in Walking (Jun Morimoto, ATR)
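The clocked trajectories with feedback described on the two slides above can be sketched as a phase clock that is reset at touchdown and then indexes x_d, u_ff, and K. The class name, the sinusoidal trajectories, the scalar gain, and the sign convention u = -K (x - x_d) + u_ff are all hypothetical choices for illustration, not taken from any of the robots mentioned.

```python
import math


class PhaseClock:
    """Phase phi in [0, 1), advanced by a clock and reset by a touchdown event."""

    def __init__(self, period=1.0):
        self.period = period
        self.phi = 0.0

    def step(self, dt, touchdown=False, rate=1.0):
        if touchdown:
            self.phi = 0.0  # phase reset by a sensory event
        else:
            # `rate` lets feedback speed up or slow down the clock
            self.phi = (self.phi + rate * dt / self.period) % 1.0
        return self.phi


def x_d(phi):
    """Hypothetical desired joint angle as a function of phase."""
    return 0.3 * math.sin(2.0 * math.pi * phi)


def u_ff(phi):
    """Hypothetical feedforward command as a function of phase."""
    return 0.5 * math.cos(2.0 * math.pi * phi)


def tracking_command(x, phi, K=20.0):
    """u(phi) = -K (x - x_d(phi)) + u_ff(phi), with a constant gain K."""
    return -K * (x - x_d(phi)) + u_ff(phi)


clock = PhaseClock(period=0.8)
for _ in range(10):
    phi = clock.step(dt=0.01)
phi_before = phi
phi_after = clock.step(dt=0.01, touchdown=True)
u = tracking_command(x=0.0, phi=phi_after)
```

Setting K to zero recovers the open-loop case u = u_ff(t). Replacing the clock with a measured quantity such as ankle angle would give a behavioral phase instead.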

Why trajectories? (Beer)
- Learning Motor Tapes: Learning Control Commands
- Compact representation
- Easy to learn
- BUT need to introduce clock
- Are these a good way to think about CPGs?
[Figure panels: Before, After, Errors]

CMU/MIT Leg Lab, BDI, ...
- State machine: in-air, on-ground (actually more complicated, with twilight states).
- In each state a constant policy u = f(x) was applied.
- Typically these policies were very simple independent PD controllers for each joint with fixed setpoints x_d: u = -k_p (x - x_d) - k_v x_dot
- The policies were motivated by low-dimensional idealized models:
  - Linear foot placement to control speed.
  - Torque hip in stance to balance body.
[Figures: Hodgins 1991; Big Dog]

SIMBICON: Simple Biped Locomotion Control
KangKang Yin, Kevin Loken, and Michiel van de Panne
ACM SIGGRAPH 2007
www.cs.ubc.ca/~van/
Poster here: Stelian Coros
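The state-machine approach described above (a fixed policy per discrete state, built from independent PD controllers with fixed setpoints) can be sketched in a few lines. This is a minimal illustration, not the Leg Lab, BDI, or SIMBICON controller: the joint names, setpoints, gains, and the particular linear foot placement rule are assumptions.

```python
# Hypothetical per-state joint setpoints x_d (radians).
SETPOINTS = {
    "in_air": {"hip": 0.2, "knee": -0.4},
    "on_ground": {"hip": -0.1, "knee": -0.1},
}
KP, KV = 50.0, 5.0  # assumed PD gains, shared by all joints


def next_state(state, foot_contact):
    """Two-state machine driven by a foot contact sensor."""
    if state == "in_air" and foot_contact:
        return "on_ground"
    if state == "on_ground" and not foot_contact:
        return "in_air"
    return state


def pd_commands(state, q, qd):
    """Independent PD control per joint: u = -k_p (x - x_d) - k_v x_dot."""
    targets = SETPOINTS[state]
    return {j: -KP * (q[j] - targets[j]) - KV * qd[j] for j in targets}


def foot_placement(v, v_desired, t_stance=0.2, k=0.05):
    """Assumed linear foot placement: neutral point plus a speed-error term."""
    return 0.5 * v * t_stance + k * (v - v_desired)


state = next_state("in_air", foot_contact=True)
u = pd_commands(state, q={"hip": 0.0, "knee": 0.0}, qd={"hip": 0.0, "knee": 0.0})
x_foot = foot_placement(v=1.0, v_desired=1.0)
```

Placing the foot ahead of the neutral point slows the body and placing it behind speeds it up, which is how a linear rule of this kind regulates speed.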

Questions For Beer Discussion
- Why does this approach succeed?
- Why is everything oversimplified?
  - Independent control of degrees of freedom (*foot placement)
  - Setpoints vs. trajectories x_d(t) or u_ff(t) (*Simbicon)
- Where is the theory? Stability proof?
- How to place feet on targets? (Hodgins thesis)
- State machines = discrete CPGs?
- Why no ZMP? Running, point feet.

Zero Moment Point (ZMP)
- ZMP = center of pressure (COP)
- COP_x = -M_y/F_z, COP_y = M_x/F_z
- ZMP ideology: Don't let the ZMP/COP reach the boundary of the support polygon. No unintentional tipping.
- Other constraints: F_z > 0; friction cone: F_x/F_z, F_y/F_z < μ; spin constraint on M_z.
- Kajita: ZMP-based scheme is good when footholds are specified.
- I am using material from: http://nobunaga.t.u-tokyo.ac.jp/icraws/handout/kajitaicraws6.pdf

ZMP is useful for position controlled robots
- g (COM - COP) = h COM_ddot (Kajita)
- [Figure labels: Ankle Torque; Rigid Joints]
- Torque-controlled robot with compliant joints: COP/ZMP is directly related to ankle torque. The rest of the body has little effect.
- Position-controlled robot with stiff joints: COP is related to the acceleration of body parts, since torques are transmitted across rigid joints.

ZMP Pattern Generation (Kajita)
- Choose ZMP(t), COM(t), foot locations and timing, force allocation in DSP, and what to do with all those redundant DOF.
- Use simple idealized models.
- Use human motion capture.
- Use optimization (what criteria?).
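The ZMP/COP bookkeeping on these slides reduces to a few lines of arithmetic. The sketch below assumes a z-up frame with moments taken about the origin on the ground plane (which gives COP_x = -M_y/F_z, COP_y = M_x/F_z), a rectangular support polygon centered at the origin, and a per-axis friction check; the function names and numbers are hypothetical. The spin constraint on M_z is left out.

```python
def cop_from_wrench(Fz, Mx, My):
    """Center of pressure from the vertical force and horizontal moments."""
    if Fz <= 0.0:
        raise ValueError("no contact: Fz must be positive")
    return (-My / Fz, Mx / Fz)


def contact_ok(Fx, Fy, Fz, cop, half_length, half_width, mu, margin=0.0):
    """Check Fz > 0, COP strictly inside the support rectangle, and friction."""
    if Fz <= 0.0:
        return False
    inside = (abs(cop[0]) < half_length - margin
              and abs(cop[1]) < half_width - margin)
    friction = abs(Fx) / Fz < mu and abs(Fy) / Fz < mu
    return inside and friction


def lipm_com_accel(com, cop, height, g=9.81):
    """Idealized model g (COM - COP) = h COM_ddot, solved for COM_ddot."""
    return g * (com - cop) / height


cop = cop_from_wrench(Fz=100.0, Mx=5.0, My=-10.0)
ok_big_foot = contact_ok(10.0, 0.0, 100.0, cop, half_length=0.12, half_width=0.06, mu=0.5)
ok_small_foot = contact_ok(10.0, 0.0, 100.0, cop, half_length=0.08, half_width=0.06, mu=0.5)
acc = lipm_com_accel(com=0.1, cop=0.0, height=1.0)
```

The `margin` argument is one way to encode "don't let the COP reach the boundary": shrinking the rectangle leaves room for tracking error.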

Handling Errors
- ZMP pattern has stability guarantees.
- Perfect trajectory tracking, perfectly flat and level floors, no impact forces, and no slipping are assumed.
- BUT once feedback control is added to handle errors and impacts, the stability guarantees go away.

What does it take to really make ZMP work?
Honda: world.honda.com/asimo/history/technology.html
- Floor Reaction Control: handle floor unevenness (foot force control).
- Target ZMP Control: accelerate torso to move ZMP.
- Foot Planting Location Control: use foot placement for further balance control.

Realization of Dynamic Walking of Biped Humanoid Robot
www.ri.cmu.edu/people/kim_jung_yup.html
Department of Mechanical Engineering

Basic Structure for Biped Walking
- Observation: Human Walking Motions
- Simple Dynamic Model
- Walking Pattern Planning (slow process)
- Posture Stabilization (fast process)
- Biped Walking
- System ID Based on Experiments
- Online Design Based On Sensory Feedback: Vibration, ZMP Compensator, Landing Position, Landing Timing, ...

[Figure: pelvis trajectory in Y direction and right and left foot trajectories in Z direction; displacement [mm] against time; annotated with A_pelvis, H_foot, T_ssp, T_dsp, T_step, T_delay, T_stride]
Cosine function is used to generate smooth curve
- basically cyclic function
- easy to prevent velocity discontinuity
- easy to differentiate the function

Walking pattern design parameters
Parameter   Description                          Value
A_pelvis    Lateral swing amplitude of pelvis    3 (mm)
H_foot      Maximum elevation of foot            4 (mm)
T_stride    Walking period (stride time)         1.9 (seconds)
T_step      Step time                            .95 (seconds)
T_delay     Delay time                           . (second)
κ_dsp       Double support ratio                 0.05 (5 %)
T_ssp       Single support time                  T_step (1.0 - κ_dsp)
T_dsp       Double support time                  T_step κ_dsp

Online Control Structure
- Walking Parameters Setting: step length, step period, DSP ratio
- Walking Type Selection: go forward/backward, turning around, go aside
- Walking Pattern Generation (Cartesian space), Inverse Kinematics (joint space), Ref θ_i, PD controller, HUBO
- Walking Pattern Control (WPC): Torso Roll & Pitch, Pelvis Swing Amp.
- Predictive Motion Control (PMC): Landing Position, Tilt Over, Landing Timing
- Real-Time Balance Control (RTBC): Vibration controller, ZMP Compensator, Upright Pose, Soft Landing
- Landing Detection
- Feedback signals: θ_body, ω_body, ZMP, θ_i, F_z, M_x, M_y
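The timing relations in the walking pattern parameter table (T_ssp = T_step (1 - κ_dsp), T_dsp = T_step κ_dsp) and the use of a cosine to get a smooth, cyclic, easily differentiated curve can be sketched as follows. The raised-cosine foot profile is only one plausible shape, and the numbers are placeholders rather than HUBO's parameter values.

```python
import math


def support_times(T_step, kappa_dsp):
    """Split one step into single and double support time."""
    T_ssp = T_step * (1.0 - kappa_dsp)
    T_dsp = T_step * kappa_dsp
    return T_ssp, T_dsp


def foot_height(t, T_ssp, H_foot):
    """Assumed raised-cosine swing profile.

    Height is 0 at lift-off and landing and H_foot at mid-swing; the
    vertical velocity is zero at both ends, so there is no velocity jump.
    """
    if t <= 0.0 or t >= T_ssp:
        return 0.0
    return 0.5 * H_foot * (1.0 - math.cos(2.0 * math.pi * t / T_ssp))


def foot_velocity(t, T_ssp, H_foot):
    """Analytic derivative of foot_height inside the swing interval."""
    if t <= 0.0 or t >= T_ssp:
        return 0.0
    w = 2.0 * math.pi / T_ssp
    return 0.5 * H_foot * w * math.sin(w * t)


T_ssp, T_dsp = support_times(T_step=1.0, kappa_dsp=0.1)  # placeholder values
z_mid = foot_height(0.5 * T_ssp, T_ssp, H_foot=0.04)
```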

Walking Pattern Control
- Modifies the walking pattern at every walking cycle.
- Learning strategy: X_{k+1} = X_k + α E_k
  where X_k = [T_stride, A_pelvis, κ_dsp]^T at cycle k: states at the k-th walking cycle; α: learning rate; E_k: error vector

(1) Vibration Control
- Objective: Suppress the vibrations due to the joint compliance.
- (a) Vibration control of torso
  [Figure: torque M_x (Nm) with and without damping control, against time]
- (b) Vibration control of swing foot
  [Figure: Y-component of foot acceleration [m/s^2] with and without control, against time]

(2) ZMP Compensator
- Objective: Maintain dynamic balance all the time.
- [Figure: pelvis center (X_pelvis, Y_pelvis) and ZMP (X_ZMP, Y_ZMP) on the transverse plane; Mode 1 (Double Support Phase), Mode 2 (Double Support Phase), Mode 3 (Single Support Phase)]
- Horizontal pelvis motion on the transverse plane is used for the ZMP compensation.
- Experimental results (Mode 3)
  [Figure: X_ZMP [mm] and Y_ZMP [mm] against time; no control vs. fast + slow SSP ZMP compensation with damping control; steady state values 1.5 and - 8.3]
- Steady-state error is nearly zero!

(3) Landing Position Control
- Objective: Find out best landing position at every step.
- [Figure, Video: angular velocities and moment M just before landing and just after landing]
- Impulse-Momentum Principle on the Coronal Plane: relates the angular momentum Iω just before and just after landing to the impulse of the moment M over the landing interval.
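The cycle-by-cycle learning strategy X_{k+1} = X_k + α E_k from the Walking Pattern Control slide is a simple iterative update. The slides do not say how the error vector E_k is measured, so the sketch below substitutes a hypothetical error (the gap to a fixed target parameter vector) purely to show the update converging; the parameter values and learning rate are placeholders.

```python
def update(X, E, alpha):
    """One walking cycle of the learning rule X_{k+1} = X_k + alpha * E_k."""
    return [x + alpha * e for x, e in zip(X, E)]


def hypothetical_error(X, X_target):
    """Stand-in for the measured error vector E_k (an assumption)."""
    return [t - x for x, t in zip(X, X_target)]


# X = [T_stride, A_pelvis, kappa_dsp]; all numbers are placeholders.
X = [1.0, 0.0, 0.10]
X_target = [1.2, 0.05, 0.08]
alpha = 0.3

history = [X]
for k in range(50):
    E = hypothetical_error(X, X_target)
    X = update(X, E, alpha)
    history.append(X)
```

With this error definition each cycle shrinks the remaining gap by a factor of (1 - α), so a smaller learning rate trades convergence speed for less sensitivity to noisy measurements.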

Experimental results (Video)
[Figure: torso phase plots. (a), (b): pitching angular velocity ω_p^torso [deg/sec] against pitching angle θ_p^torso [deg], without and with landing position control. (c), (d): rolling angular velocity ω_r^torso [deg/sec] against rolling angle θ_r^torso [deg], without and with landing position control.]

(4) Landing Timing Control
- Objective: Find out best landing time at every step.
- [Figure (coronal plane view): angular velocity of body ω [deg/s] and prescribed left and right foot heights [mm] against time [sec], without and with control; stable region; ZCP (zero crossing point) of ω in the normal case and in the inside tilt-over case; compensated right foot altitude]

(5) Stable Landing Control
- Objective: Absorb the landing impact at every step.
- Impedance control (sagittal plane view), using virtual spring and damper pairs (K_1, C_1) and (K_2, C_2):
  z_1(s) = F(s) / (m_1 s^2 + c_1 s + k_1)
  z_2(s) = T(s) / (m_2 s^2 + c_2 s + k_2)
  where T: torque, F: normal force

Footstep Planning (CMU: Chestnutt, Kuffner)
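The impedance law on the Stable Landing Control slide, z(s) = F(s) / (m s^2 + c s + k), describes a virtual mass-spring-damper that gives way under the landing force. A minimal time-domain sketch follows, assuming a constant force input, semi-implicit Euler integration, and placeholder values for m, c, and k; it is not the HUBO implementation.

```python
def simulate_impedance(force, m=1.0, c=20.0, k=100.0, dt=0.001, duration=5.0):
    """Integrate m z'' + c z' + k z = F(t); return displacement samples."""
    z, v = 0.0, 0.0
    trace = []
    steps = int(round(duration / dt))
    for i in range(steps):
        F = force(i * dt)
        a = (F - c * v - k * z) / m  # acceleration of the virtual mass
        v += dt * a                  # semi-implicit Euler: velocity first
        z += dt * v                  # then position with the new velocity
        trace.append(z)
    return trace


def constant_force(t):
    """Assumed landing load: a 50 N step applied at t = 0."""
    return 50.0


trace = simulate_impedance(constant_force)
z_final = trace[-1]
z_peak = max(trace)
```

For a constant force the displacement settles at F/k, here 50/100 = 0.5. With c = 2 sqrt(k m) the virtual system is critically damped, so it yields without oscillating.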

Moving Obstacles (CMU: Chestnutt, Kuffner)

More Discussion Over Beer
- ZMP is dynamic walking (see Honda videos). And we all walk without tipping, so obey ZMP.
- Can ZMP be applied to compliant/backdrivable robots? Smaller feet?
- How to handle redundancy: what body parts move to affect COM?
- How to handle uneven floor, multiple contacts, hands, grasps, ...
- Kajita: More flexible pattern generation
- Kajita: Stabilizers for full-body contact
- Kajita: More humanlike mechanism and motion

Policy u = f(x)
- How to design a full policy u = f(x)? No state machine
  - Pole placement or model matching
  - Optimization
  - Learning (which is really optimization)
  - Dimensionality reduction followed by one of the above
- Standard approach for non-periodic tasks (arms, ...).
- Reintroduce discrete control/state machine when there are multiple tasks or desired parameters (such as velocity).

Spring Flamingo: Model Matching
Rabbit

Trajectories To Approximate Policy

Using Many Trajectories To Approximate A Full Policy u = f(x)
- Each point x_d(t) has a u_d(t) associated with it.
- You can even have a linear controller associated with it: K(t)
- To approximate u = f(x), find the nearest neighbor to x, and then compute the appropriate command: u = u_d(x) or u = u_d(x) - K(x) (x - x_d(x))

Poincaré section
[Figure labels: State, Simulation, Transition, Top]

CPGs (Morimoto, ATR)
[Figure labels: Lateral, Lateral Oscillation, Extension, Sagittal Walker]
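The nearest-neighbor construction on the "Using Many Trajectories" slide can be sketched directly: store trajectory points as (x_d, u_d, K) entries, look up the stored point closest to the current state, and apply its command plus local linear feedback. The Euclidean distance metric, the scalar command, the sign convention u = u_d - K (x - x_d), and the library contents are assumptions for illustration.

```python
def squared_distance(a, b):
    """Plain Euclidean metric; a real system would likely weight the states."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def nearest_neighbor_policy(x, library, use_feedback=True):
    """Approximate u = f(x) from stored trajectory points.

    library: list of (x_d, u_d, K) with x_d a state tuple, u_d a scalar
    command, and K a gain row of the same length as x_d.
    """
    x_d, u_d, K = min(library, key=lambda entry: squared_distance(x, entry[0]))
    if not use_feedback:
        return u_d  # u = u_d(x)
    # u = u_d(x) - K(x) (x - x_d(x))
    return u_d - sum(k * (xi - xdi) for k, xi, xdi in zip(K, x, x_d))


# Hypothetical library with two stored points.
library = [
    ((0.0, 0.0), 0.0, (10.0, 1.0)),
    ((1.0, 0.0), 2.0, (10.0, 1.0)),
]
u_plain = nearest_neighbor_policy((0.9, 0.1), library, use_feedback=False)
u_fb = nearest_neighbor_policy((0.9, 0.1), library)
```

A linear scan is fine for a sketch; with many trajectories a spatial index such as a k-d tree would be the usual replacement.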

Walking (Morimoto, ATR)

CPGs and Beer
- Classical control puts internal state into controllers (lead-lag filters, ...)
- Modern control: the controller is a static map u = f(x). Any internal state is in the state estimator.
- The state estimator replicates the plant dynamics. If the plant is oscillatory, the state estimator is oscillatory.
- Are CPGs state estimators? Is their role to estimate behavioral phase?
- Or is a different view such as coupled oscillators more useful?

2D -> 3D (Beer)
- 2D: Easy to stabilize with goosestep strategy
- Use energy American style: wastefully
- Add a lot of energy initially: push off (ankle, knee)
- Swing (hip)
- Hold leg at desired step length
- Remove energy with impact
- To prevent falling backwards, need to limit forward velocity so swing works.
- This does not work in 3D due to yaw problems.

Biology is
- Soft/compliant
- Well damped
- Redundant
- Versatile
- Handles errors well (slip, trip, ...)
- Has central pattern generators (CPGs)
- BUT also has significant delays.

Summary
- Reflexes
- (Learned) trajectories
- State machines and policies
- ZMP and feedback control
- Full policies
- Poincaré policies
- CPGs