Reliable Real-Time Recognition of Motion Related Human Activities using MEMS Inertial Sensors

K. Frank; M.J. Vera-Nadales, University of Malaga, Spain; P. Robertson, M. Angermann

Motivation

Sensor module: you get information from the subject and its interaction with the external world. Feature computation: you extract features from what you observe.

Activity Recognition for Human Motion Behaviour. Activity recognition is helpful in many fields: indoor positioning, rescue teams, Ambient Assisted Living. We target seven motion-related activities: lying, sitting, standing, walking, running, jumping, falling (+ getting up / down).

State of the art: examples. Different activities, different sensors, different locations on the body, different data sets. No standard.

Our Approach for Activity Recognition. Sensor device: Inertial Measurement Unit. Measurements: 3D acceleration, 3D turn rate, 3D magnetic field. Frames: sensor frame and Earth-centric frame (NED).

Global frame ~ navigation frame. Flat terrain. Constant gravity field. Coriolis and transport rate effects are negligible. No need for gravity correction for activity recognition. The MTx sensor gives the measurements and provides attitude information.

Where do we place the sensor? Foot or ankle: lacks information about the upper part of the body. Hand or watch: its motion could be independent of the body's motion. Belt or pocket: has information about the upper and lower parts of the body. Good for recognition of human motion.

Approach. Place the sensor; collect labeled data; investigate a large set of signals and features; identify a subset of them (~20); machine learning: construct a Bayesian network; extend it to a dynamic Bayesian network.

Examples of Features. Temporal domain: mean, standard deviation; median, mean absolute deviation; amplitude; interquartile range; maximum, minimum; integrated value; mean crossings; correlation coefficients between the axes. Frequency domain: main frequency component; spectral entropy; energy in some frequency bands. Window lengths: 32 samples (0.32 s), 64 samples (0.64 s), 128 samples (1.28 s), 256 samples (2.56 s), 512 samples (5.12 s). The feature vector is updated every 0.25 seconds.
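A minimal sketch of how a few of the listed features could be computed over a sliding window. It is an illustration, not the authors' implementation: the function names are hypothetical, the 100 Hz sample rate is inferred from "32 samples (0.32 s)", and the main frequency is taken from a plain DFT of the mean-removed window.

```python
import cmath
import math
import statistics

SAMPLE_RATE_HZ = 100.0  # assumption: inferred from 32 samples = 0.32 s


def time_features(window):
    """Return some temporal-domain features for one window of samples."""
    mean = statistics.fmean(window)
    q1, _, q3 = statistics.quantiles(window, n=4)
    centered = [x - mean for x in window]
    # A mean crossing is a sign change of the mean-removed signal.
    crossings = sum(1 for a, b in zip(centered, centered[1:]) if a * b < 0)
    return {
        "mean": mean,
        "std": statistics.pstdev(window),
        "median": statistics.median(window),
        "amplitude": max(window) - min(window),
        "iqr": q3 - q1,
        "mean_crossings": crossings,
    }


def main_frequency(window, sample_rate_hz=SAMPLE_RATE_HZ):
    """Frequency in Hz of the strongest non-DC component (naive DFT)."""
    n = len(window)
    mean = statistics.fmean(window)
    best_k, best_power = 0, -1.0
    for k in range(1, n // 2 + 1):
        coeff = sum((x - mean) * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(window))
        power = abs(coeff) ** 2
        if power > best_power:
            best_k, best_power = k, power
    return best_k * sample_rate_hz / n


def sliding_windows(signal, length, step):
    """Yield consecutive windows; step=25 gives one update every 0.25 s at 100 Hz."""
    for start in range(0, len(signal) - length + 1, step):
        yield signal[start:start + length]
```

With a 128-sample window the frequency resolution of this sketch is 100/128 Hz, which is one reason the same feature behaves differently at different window lengths.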

Looking at Possible Features. Examples: walking, running, jumping. [Scatter plot: x-axis feature with window size 128, y-axis feature with window size 128; clusters labelled Walking, Running, Jumping.] To distinguish running from jumping, the main frequency component is relevant.

In running, you move at such a velocity that once your foot hits the floor, the fast change in velocity produces a peak in the acceleration. That peak, much bigger in running than in walking, is why the signal is no longer symmetric and why you observe an offset (a nonzero mean value). In walking there is also an offset, but a smaller one, and it also depends on the shoes you are wearing.

Final features. Examples: walking, running, jumping. [Scatter plot: x-axis feature with window size 32, y-axis feature with window size 512; clusters labelled Walking, Running, Jumping.] The same feature at a different window length could be used for different activities.

Final features. Examples: standing, sitting, lying. [Scatter plot: x-axis feature with window size 128, y-axis feature with window size 128; clusters labelled Standing, Sitting, Lying.] Standing versus sitting is one of the most difficult distinctions.

Final features. Examples: jumping, falling. [Scatter plot: x-axis feature with window size 32, y-axis feature with window size 128; clusters labelled Jumping, Falling.] Falling can also be distinguished from the rest.

Labelled Data Set. 16 people: 6 females and 10 males. Different shoes and body build/constitution. Fixed schedule, but some individual freedom allowed during every activity. Activities: standing ~107 minutes; sitting ~55 minutes; lying ~25 minutes; walking ~70 minutes; running ~5 minutes; jumping ~7 minutes; falling ~2 minutes. Transitions: up ~3 minutes; down ~1 minute; accelerating ~1 minute; decelerating ~0.4 minutes. Total: over 4 h 30 min.

Final features. 19 features were selected: 8 features from the norm of the acceleration; 3 features from the horizontal acceleration (BF); 5 features from the vertical acceleration (BF); 1 feature from the horizontal angular velocity (BF); 1 feature from the attitude; 1 feature from the vertical acceleration (GF). They are computed for different window lengths.

Which gives us robust information? Which gives new information to the system? Which is a good discriminator? Does it make sense? Which window length is the best? Remember real time!

Towards a Bayesian Estimation Solution. [Diagram: the activity node at the top; below it, hidden nodes for complex body motion patterns (body motion of feet, hip, trunk, arms); at the bottom, the observed signals and features (acceleration, turn rate, attitude).]

Recognition algorithm: Naïve Bayes (NB) approach. The activity node is the parent of every feature node. It considers the features to be independent given the activity:

$$p(f_1, f_2, \ldots, f_M \mid act_i) = \prod_{j=1}^{M} p(f_j \mid act_i)$$
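The Naïve Bayes factorization can be sketched as follows for discretized features. This is an illustrative sketch under assumptions: the function name, the dictionary layout, and the example probabilities are hypothetical, and every table entry is assumed to be strictly positive so that the logarithm is defined.

```python
import math


def naive_bayes_posterior(prior, likelihood, observed):
    """Posterior over activities given one observed state per feature.

    prior:      {activity: p(activity)}
    likelihood: {activity: [{state: p(f_j = state | activity)}, ...]}
    observed:   list with the observed discrete state of each feature
    """
    log_post = {}
    for act, p_act in prior.items():
        # Sum of logs implements the product over the independent features.
        log_p = math.log(p_act)
        for j, state in enumerate(observed):
            log_p += math.log(likelihood[act][j][state])
        log_post[act] = log_p
    # Normalize in a numerically stable way.
    top = max(log_post.values())
    unnorm = {a: math.exp(lp - top) for a, lp in log_post.items()}
    total = sum(unnorm.values())
    return {a: v / total for a, v in unnorm.items()}


# Hypothetical two-activity, two-feature example.
prior = {"standing": 0.5, "walking": 0.5}
likelihood = {
    "standing": [{"low": 0.9, "high": 0.1}, {"low": 0.8, "high": 0.2}],
    "walking": [{"low": 0.2, "high": 0.8}, {"low": 0.3, "high": 0.7}],
}
posterior = naive_bayes_posterior(prior, likelihood, ["high", "high"])
```

Working in log space avoids underflow when many features are multiplied together.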

Learning the Full Bayesian Network. Structure and parameter learning: greedy hill climber with random restarts, based on the Cooper and Herskovits log score for fully observed data sets and Dirichlet distributions of the conditional probability tables (conjugate prior). Typically 10^8 to 10^9 network structures are searched (~1-5 days). Added restrictions and modifications: impose causality (activity is always a parent of the features); limit the number of parents of every node.
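As a rough illustration of the scoring step, the sketch below computes the Cooper and Herskovits log score contribution of a single node for a candidate parent set. It assumes a uniform Dirichlet prior (all hyperparameters equal to 1); the slides do not state which hyperparameters were used, and the function name and data layout are hypothetical. A hill climber would sum this quantity over all nodes and keep the edge changes that increase the total.

```python
import math
from collections import Counter, defaultdict


def ch_log_score(data, child, parents, arity):
    """Log score of `child` given `parents` on fully observed data.

    data:   list of dicts mapping variable name -> discrete state
    arity:  number of states of the child variable
    Assumes a uniform Dirichlet prior (hyperparameters of 1).
    """
    counts = defaultdict(Counter)
    for row in data:
        config = tuple(row[p] for p in parents)
        counts[config][row[child]] += 1
    score = 0.0
    # Parent configurations that never occur contribute exactly zero.
    for state_counts in counts.values():
        n_ij = sum(state_counts.values())
        score += math.lgamma(arity) - math.lgamma(n_ij + arity)
        score += sum(math.lgamma(n + 1) for n in state_counts.values())
    return score


# Hypothetical data: feature "f" is fully determined by "act".
data = [{"act": 0, "f": 0}] * 4 + [{"act": 1, "f": 1}] * 4
with_parent = ch_log_score(data, "f", ["act"], arity=2)
without_parent = ch_log_score(data, "f", [], arity=2)
```

In this toy data set the score with `act` as a parent is higher than the score without parents, so a greedy search would keep that edge.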

Recognition algorithm: unrestricted Bayesian network (BN) approach. Observations (evidence, e) are given for all nodes in the Markov blanket of the activity node. Very simple inference: we just compute the joint probability distribution without propagating evidence. We can throw out unnecessary or redundant features.

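[Figure: learnt Bayesian network around the Activity node. Feature groups: short-term vertical and overall acceleration; medium-term attitude; changes in acceleration (long term); vertical acceleration, roll and main acceleration frequency (medium term). Feature nodes include the vertical acceleration and the norm of the acceleration, a correlation coefficient, low-pass filtered signals (< 2.2 Hz), the attitude, the main frequency, the standard deviation, the maximum and the interquartile range, each at window lengths between 32 and 512 samples.]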

Evaluation. Four recognition algorithms: static Naïve Bayes (NB), dynamic Naïve Bayes (DNB), static unrestricted Bayesian network (BN), dynamic unrestricted Bayesian network (DBN). Each of them is defined for several discretizations.

Two-Tiered Estimation: 1) Bayesian network and 2) discrete first-order Markov process. Dynamic state transition model: the activity at t = k-1 leads to the activity at t = k; use a grid-based estimator. At each time step, a Bayesian network relates the activity to the signals and features.

Static vs. Dynamic: example. [Figure: recognition output of the static BN compared with the dynamic BN.]

Recognition delay. The system recognizes the activities nearly 100% of the time, but misclassification often occurs at the onset of an activity. Causes: manual labeling (possible human errors) and sliding-window processing (delay). Remedy: impose a recognition delay of 0.5 s; an activity must be stable for 0.5 s before it is flagged.
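One way to read the stability rule is sketched below: a newly estimated activity is only flagged once it has been estimated for enough consecutive updates. This is an assumption about the mechanism, not the authors' code; the function name is hypothetical, and with one update every 0.25 s a 0.5 s delay is interpreted here as two consecutive identical estimates.

```python
def apply_recognition_delay(estimates, update_period_s=0.25, delay_s=0.5):
    """Return the flagged activity per update, or None before anything is stable."""
    # Assumption: stability is counted in consecutive identical estimates.
    needed = max(1, round(delay_s / update_period_s))
    flagged = []
    current = None            # last activity that was stable long enough
    candidate, run = None, 0  # activity currently being confirmed
    for estimate in estimates:
        if estimate == candidate:
            run += 1
        else:
            candidate, run = estimate, 1
        if run >= needed:
            current = candidate
        flagged.append(current)
    return flagged


raw = ["stand", "stand", "walk", "stand", "walk", "walk"]
smoothed = apply_recognition_delay(raw)
```

Isolated one-update glitches, such as the single "walk" in the middle of the example, never reach the flagged output.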

Evaluation: Recall and precision. We compute precision and recall for every activity. Example for standing: ground truth standing, classified as standing: TP (true positive); ground truth standing, classified as others: FN (false negative); ground truth others, classified as standing: FP (false positive); ground truth others, classified as others: TN (true negative).

$$\text{Recall} = \frac{TP}{TP + FN}$$

Which % of the positive cases did we catch? You recognize it when it happens.

$$\text{Precision} = \frac{TP}{TP + FP}$$

Which % of the positive predictions was correct? You do not classify something else as it.
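The per-activity computation can be sketched as follows, treating one activity as the positive class and everything else as "others". The function name and the label sequences are hypothetical.

```python
def precision_recall(truth, predicted, activity):
    """Precision and recall for one activity, one-vs-rest."""
    pairs = list(zip(truth, predicted))
    tp = sum(1 for t, p in pairs if t == activity and p == activity)
    fn = sum(1 for t, p in pairs if t == activity and p != activity)
    fp = sum(1 for t, p in pairs if t != activity and p == activity)
    # Undefined ratios (no positives at all) are reported as NaN.
    precision = tp / (tp + fp) if tp + fp else float("nan")
    recall = tp / (tp + fn) if tp + fn else float("nan")
    return precision, recall


truth = ["standing", "standing", "standing", "walking"]
predicted = ["standing", "standing", "walking", "walking"]
precision, recall = precision_recall(truth, predicted, "standing")
```

In this example every "standing" prediction is correct (precision 1.0), but one of the three true standing samples is missed (recall 2/3).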

Evaluation: Recall and precision. The dynamic unrestricted Bayesian network approach achieves the best results, greatly improving the recognition of activities such as jumping and falling.

Distinguishing Activities (activities: how to?). Walking from running: easy to distinguish. Standing from sitting: mainly the attitude; a dynamic model is needed. Running from jumping: main frequency component; they could be joined as "doing sports". Falling from jumping: mainly the attitude; both are difficult to characterize and call for a set of features that reflects all the physical phenomena related to these activities.

Outlook. Evaluate the system under naturalistic conditions. Add more activities, such as getting into a vehicle, cycling, or riding in an elevator. Try the system in the pocket. Increase the delay to distinguish falling from jumping or from sitting down rapidly. Infer high-level activities such as being in a meeting, being busy, cooking.

Thanks for your attention! You can try it yourself: demo during the coffee break. Maria Vera-Nadales' Master's thesis, videos, slides, papers, and the data set are available at: www.kn-s.dlr.de/activity

Evaluation: real-time evaluation. Platform: Intel Core 2 Duo E8400 microprocessor at 3.00 GHz, 2 GB of RAM. Mean execution time per operation (ms): feature computation 1.5; NB inference 0.34; DNB inference 0.36; BN 7.2; DBN 7.7. Feature computation is not time consuming. Using the dynamic approach does not have a high cost in terms of real-time computation. What is more expensive in terms of execution time is increasing the complexity of the BNs.

Our approach for Activity Recognition: assumptions. Initial activity: standing. Sensor position: attached to the belt.

Standing still at the beginning for almost 10 seconds: a requirement for the sensor attitude computation and for some features related to standing still. Sensor attached to the belt: the wearer does not take it in the hand or change its location. If they do, the recognition is not reliable. The sensor must be well mounted; this is to be dropped in the future with inertial navigation mechanization.

Final features. Examples: static, walking. [Scatter plot: x-axis feature with window size 32, y-axis feature with window size 256; clusters labelled Static, Walking.] Static is not difficult to distinguish from walking, except for the effect of the sliding window.

Recognition algorithm: feature discretization (2D). Identify the activities for which each feature is meaningful. Look for meaningful areas in the plots to define the states of a pair of features.
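A simple way to turn such areas into discrete states is a threshold grid, sketched below. This is only an illustration: the thresholds are hypothetical, and the areas picked by inspecting the plots need not be rectangular as they are here.

```python
from bisect import bisect_right


def discretize(value, thresholds):
    """Map a continuous feature value to a state index (thresholds sorted ascending)."""
    return bisect_right(thresholds, value)


def discretize_pair(x, y, x_thresholds, y_thresholds):
    """State of a pair of features: the cell of a rectangular grid in the 2D plot."""
    return (discretize(x, x_thresholds), discretize(y, y_thresholds))


# Hypothetical thresholds: three states for the first feature, two for the second.
state = discretize_pair(0.3, 2.0, [0.1, 0.5], [1.0])
```

Coarser grids give smaller conditional probability tables, which is relevant because each recognition algorithm is defined for several discretizations.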

Dynamic Bayesian Filter: Grid Based (prediction step). Shown for two hidden states (activities) $act^1$ and $act^2$. Given the belief at time $t$ conditioned on the observations $O_{1:t}$, the predicted belief for each state $i \in \{1, 2\}$ is

$$p(act_{t+1}^{i} \mid O_{1:t}) = \sum_{j=1}^{2} p(act_{t}^{j} \mid O_{1:t})\, p(act_{t+1}^{i} \mid act_{t}^{j})$$

Dynamic Bayesian Filter: Grid Based (update step). Once the observation $O_{t+1}$ arrives, the belief for each hidden state $i \in \{1, 2\}$ is

$$p(act_{t+1}^{i} \mid O_{1:t+1}) = \frac{p(act_{t+1}^{i} \mid O_{1:t})\, p(O_{t+1} \mid act_{t+1}^{i})}{\sum_{j=1}^{2} p(act_{t+1}^{j} \mid O_{1:t})\, p(O_{t+1} \mid act_{t+1}^{j})}$$
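The prediction and update steps of the grid-based filter can be sketched as follows for any number of activities. The function names and the numbers are hypothetical; in the two-tiered scheme the observation likelihood would come from the Bayesian network, whereas here it is simply passed in as a dictionary.

```python
def predict(belief, transition):
    """Prediction: push the belief through the first-order Markov transition model.

    belief:     {activity: p(act_t | O_1:t)}
    transition: {previous_activity: {next_activity: probability}}
    """
    acts = list(belief)
    return {i: sum(belief[j] * transition[j][i] for j in acts) for i in acts}


def update(predicted, likelihood):
    """Update: weight by p(O_t+1 | act_t+1) and normalize."""
    unnorm = {a: predicted[a] * likelihood[a] for a in predicted}
    total = sum(unnorm.values())
    return {a: v / total for a, v in unnorm.items()}


# Hypothetical two-activity example.
belief = {"standing": 1.0, "walking": 0.0}
transition = {
    "standing": {"standing": 0.9, "walking": 0.1},
    "walking": {"standing": 0.1, "walking": 0.9},
}
predicted = predict(belief, transition)
posterior = update(predicted, {"standing": 0.2, "walking": 0.8})
```

Because the transition model favours staying in the same activity, a single observation that points to walking shifts the belief only partly, which is the smoothing effect of the dynamic approach.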

Features of the learnt network, by group (short-term vertical and overall acceleration; medium-term attitude; changes in acceleration, long term; vertical acceleration, roll and main acceleration frequency, medium term):
Vertical acceleration (a_v, 32 samples): particularly interesting for jumping; it is around 0 in the falling phase and particularly high when hitting the floor.
Norm of the acceleration (|a|, 32 samples): short-time activities (falling or jumping).
Correlation coefficient between the x-axis of the sensor frame (X_S) and the acceleration (128 samples): high values for running and walking, as they relate to the vertical axis of the human body.
Low-pass filter (< 2.2 Hz) of the norm of the jerk for repetitive activities: together with the LPF at 128 samples, good identification of running and walking.
Low-pass filter (< 2.2 Hz) of the norm of the jerk for static activities: gives some information about short-term activities.
Attitude of the sensor with respect to the x- and z-axes (att x,z, 64 samples): variations indicate a change between sitting and standing.
Main frequency of the acceleration (128 samples): 128 samples as a tradeoff between repetitive and short-term activities; differentiates falling / jumping vs. running.
Standard deviation of the acceleration (256 samples): 256 samples to cope with dynamic, repetitive patterns; distinguishes static (low values) from dynamic (high values) activities.
Acceleration over 512 samples: repetitive activities (running, walking); walking vs. running.
Maximal vertical acceleration (a_v,max, 128 samples): together with the IQR, good for separating walking from running/jumping.
Interquartile range of the vertical acceleration (IQR a_v, 128 samples): represents the deviation range from the median; differentiates static from non-static activities, as well as falling from running/jumping.
Difference between the biggest and smallest angle between the global and local vertical axes (X_S, Z_G; 128 samples): high values during falling.
Attitude in 3D (att x,y,z, 64 samples): useful together with att x,z; distinguishes lying / falling from upright activities.

Evaluation. Machine-learnt Bayesian network: it has 9 nodes and 49 edges. Maximal reduction of the machine-learnt Bayesian network if all input values are observed.