An integrated telemedicine platform for the assessment of affective physiological states
© Katsis et al. 2006
Received: 15 February 2006
Accepted: 01 August 2006
Published: 01 August 2006
AUBADE is an integrated platform built for the affective assessment of individuals. The system evaluates the emotional state by classifying vectors of features extracted from facial Electromyogram, Respiration, Electrodermal Activity and Electrocardiogram signals. The AUBADE system consists of: (a) a multisensorial wearable, (b) a data acquisition and wireless communication module, (c) a feature extraction module, (d) a 3D facial animation module, which projects the obtained data onto a generic 3D face model so that the end-user can view the facial expression of the subject in real time, (e) an intelligent emotion recognition module, and (f) the AUBADE databases, where the acquired signals are saved along with the subject's animation videos. The system is designed to be applied to human subjects operating under extreme stress conditions, in particular car racing drivers, and also to patients suffering from neurological and psychological disorders. AUBADE's classification accuracy into five predefined emotional classes (high stress, low stress, disappointment, euphoria and neutral face) is 86.0%. The pilot system applications and components are being tested and evaluated on Maserati's car racing drivers.
Emotional understanding by computers is a field of increasing importance. In many ways, emotions are one of the last and least explored frontiers of intuitive human-computer interaction. This may be explained by the fact that computers are traditionally viewed as logical and rational tools, something that is incompatible with the often irrational and seemingly illogical nature of emotions. It is also apparent that we as humans, while extremely good at feeling and expressing emotions, still cannot agree on how they should best be defined.
After a century of research, there is little agreement about a definition of emotions, and many theories have been proposed. A number of these could not be verified until recently, when improved measurement of specific physiological signals became available. In general, emotions are short-term, whereas moods are long-term, and temperaments or personalities are very long-term. Furthermore, the physiological muscle movements that make up what looks to an outsider like a facial expression may not always correspond to a real underlying emotional state.
Emotion consists of more than its outward physical expression; it also consists of internal feelings and thoughts, as well as other internal processes of which the person experiencing the emotion may not be aware. As machines and people begin to co-exist and cooperatively share a variety of tasks, the need for machines to constantly evaluate the affective condition of humans becomes more than apparent [4, 5]. This has prompted researchers in the engineering and computer science communities to develop automatic ways for computers to recognise emotions. The labelling of emotions into different states led most researchers to use pattern recognition approaches for recognising emotions, using different modalities as inputs to the emotion recognition models. The work in automatic understanding of affective condition has focused on classification of the universal expressions (FACS) defined by Ekman. These expressions are sadness, anger, fear, disgust, surprise, happiness, neutral and contempt. Thus, the implemented algorithms were tailored towards developing models to recognise the universal expressions from static images or video sequences [7–11]. These facial actions are essentially facial phonemes, which can be assembled to form facial expressions. There are also recent methods that employ a combination of audio and video signals for emotion recognition [12–18].
One of the central questions in emotion theory is whether distinct physiological patterns accompany each emotion. Ekman et al. and Winton et al. provided some of the first findings showing significant differences in autonomic nervous system signals according to a small number of emotional categories or dimensions, but there was no exploration of automated classification. Fridlund and Izard appear to be the first to have applied pattern recognition (linear discriminants) to the classification of four different emotions (happiness, sadness, anger, fear) from physiological signals, attaining accuracy rates of 38–51%. Similar efforts aimed at finding physiological correlates, focusing on t-tests or analysis of variance comparisons and combining data over many subjects, where each was measured for a relatively small amount of time [23, 24]. Finally, Picard et al. classified physiological patterns for a set of eight emotions (including neutral) by applying pattern recognition techniques and focusing on felt emotions of a single subject over sessions spanning many weeks.
Although they deal with emotion recognition, the aforementioned techniques present the following limitations: (i) they were all carried out in laboratory environments, so their effectiveness in real conditions is unknown, (ii) they are not real time and (iii) the data acquisition systems used for them are not wearable. The work in this paper is novel, since it presents a system that automatically monitors and classifies the psychological condition of human subjects from a set of emotions. The system is designed to be applicable to persons operating under extreme stress conditions, such as car-racing drivers. Medical applications are mainly based on the ability to support clinical diagnosis related to all the pathologies in which the patient's capability to feel and express emotions is limited or totally absent.
When we are frightened, our heart races; our breathing becomes rapid; our mouth becomes dry; our muscles tense; and our palms may become sweaty. These bodily changes are mediated by the autonomic nervous system, which controls heart muscle, smooth muscle, and exocrine glands. The autonomic nervous system itself can be divided into sympathetic and parasympathetic branches. Both operate in conjunction with each other and with the somatic motor system to regulate most types of behavior, whether in normal or emergency situations. Certain emotions may result in a wide variety of bodily reactions comparable to the ones described above. These bodily reactions can be monitored and measured. Our goal is to use these reactions, by means of special bio-sensors, to deduce the emotional state of the user.
AUBADE estimates the emotional state of human subjects by classifying vectors of features extracted from: Facial Electromyogram (EMG), Respiration, Electrodermal Activity (EDA) and Electrocardiogram (ECG).
Electrodermal activity (EDA) is also referred to as skin conductance activity because of the underlying principle of measurement. EDA describes alterations in the skin's ability to conduct electricity that occur due to interactions between environmental events and an individual's psycho-physiological state. More specifically, it is related to sympathetic nervous system activity, which innervates the eccrine sweat glands, and has been associated with measures of emotion, arousal, and attention. The EDA reading is typically characterized by two components: a tonic baseline level and short-term phasic responses superimposed on the tonic baseline level. Phasic responses (momentary increases in skin conductance) determine the event-related responses that occur in an individual due to environmental stimuli. A stimulus may be anything from a thought burst to a deep sigh. EDA is one of the fastest, most robust and well-studied physiological measures. It has been previously employed in assessing the difficulty of driving tasks, in determining stress in anticipatory anxiety studies and as part of lie detectors.
The electromyogram (EMG) refers to the muscle activity or frequency of muscle tension of a certain muscle. Muscle activity has been shown to increase during stress. People may unconsciously clench their muscles in a state of mental stress or fatigue even when no physical activity is required. Firing from this muscle could indicate either unconscious clenching due to stress or firing due to motion.
The ECG signal is the manifestation of contractile activity of the heart. Heart activity is a valuable indicator of the individual's overall activity level. For example, heart rate accelerations occur in response to exercise, emotional states, loud noises, sexual arousal and mental effort. Lower heart rate is generally associated with a relaxed state or a state of experiencing pleasant stimuli.
Respiration is an indicator of how deep and fast a person is breathing. Emotional excitement and physical activity are reported to lead to faster and deeper respiration. Peaceful rest and relaxation lead to slower and shallower respiration. A state of stress would therefore be indicated by frequent respiration; however, sudden stressors such as startle tend to cause momentary cessation of respiration.
The multisensorial wearable is non-invasive, ergonomic, comfortable and easy to use, and includes a number of sophisticated bio-sensors gathering raw physiological data (facial EMG, respiration, EDA and ECG). The wearable is composed of three pieces: (i) the mask containing sixteen fireproof textile EMG sensors, (ii) the three-lead ECG and respiration sensors on the thorax of the driver and (iii) the fireproof textile EDA sensor placed inside the driver's glove.
Facial EMG (16 channels): low pass (500 Hz)
ECG (3 channels): low pass (100 Hz)
AUBADE's wireless communication module is activated by the system end-user and is responsible for the secure transfer of the vital signs collected and processed by the Data Acquisition Unit. The user measurements are transferred through either the existing LAN or a wireless LAN (Bluetooth or IEEE 802.11b) to the Centralised System for further analysis. Bluetooth is considered superior for medical applications because of the following properties:
(i) range: The range can vary from 1 m (Class 3) to 100 m (Class 1). No direct optical connection is necessary.
(ii) bandwidth: The bandwidth is up to 721 kbit/s in one direction. These values are theoretically sufficient for about 100 ECG channels and can be robustly attained even in a "noisy" environment by means of frequency hopping.
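The order of magnitude of the "about 100 ECG channels" estimate can be checked with a short calculation. The sketch below is illustrative only: the sampling rate and sample resolution are assumptions, not values given in the text, and protocol overhead is ignored.

```python
# Hypothetical per-channel parameters; only the 721 kbit/s figure comes from the text.
BLUETOOTH_KBIT_PER_S = 721   # one-direction bandwidth quoted above
SAMPLE_RATE_HZ = 500         # assumed ECG sampling rate
BITS_PER_SAMPLE = 12         # assumed ADC resolution


def max_channels(link_kbit_per_s, sample_rate_hz, bits_per_sample):
    """Return how many raw channels fit in the link, ignoring protocol overhead."""
    channel_bit_per_s = sample_rate_hz * bits_per_sample
    return (link_kbit_per_s * 1000) // channel_bit_per_s


channels = max_channels(BLUETOOTH_KBIT_PER_S, SAMPLE_RATE_HZ, BITS_PER_SAMPLE)
print(channels)
```

Under these assumed values each channel needs 6 kbit/s, so the link carries on the order of one hundred channels, in line with the estimate quoted above.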
The pre-processed biosignals are converted into vectors of extracted features that can be used by the Intelligent Emotion Recognition module to determine the subject's basic emotions. The selected features provide a combination of simple statistics and more complex characteristics that are related to the nature of the physiological signals and the underlying classification problem. Furthermore, sensor behaviour is also checked in this module: unsuitable signals are excluded from processing and no features are extracted from them.
They compute vectors of mean and median frequencies over time for a specific input signal.
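As an illustration of how mean and median frequencies might be obtained for one signal window, the following sketch uses a naive discrete Fourier transform. The function names and the power-weighted definitions are assumptions chosen for clarity; the text does not specify the exact estimator used in AUBADE.

```python
import cmath
import math


def power_spectrum(samples, sample_rate_hz):
    """Naive DFT power spectrum for the positive frequencies (DC excluded)."""
    n = len(samples)
    freqs, powers = [], []
    for k in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(samples))
        freqs.append(k * sample_rate_hz / n)
        powers.append(abs(coeff) ** 2)
    return freqs, powers


def mean_frequency(freqs, powers):
    """Power-weighted average frequency."""
    return sum(f * p for f, p in zip(freqs, powers)) / sum(powers)


def median_frequency(freqs, powers):
    """Lowest frequency at which the cumulative power reaches half the total."""
    half = sum(powers) / 2.0
    running = 0.0
    for f, p in zip(freqs, powers):
        running += p
        if running >= half:
            return f
    return freqs[-1]


# Toy input: one second of a 10 Hz sine sampled at 100 Hz.
signal = [math.sin(2 * math.pi * 10 * i / 100) for i in range(100)]
freqs, powers = power_spectrum(signal, 100)
```

Applying these functions to successive windows of a signal would give the vectors of values over time mentioned above.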
Means of the absolute values of the first and second differences (mean_abs_fd and mean_abs_sd): For an acquired biosignal $X_N = (x_1, x_2, \ldots, x_N)$, mean_abs_fd and mean_abs_sd are defined as:

$$\mathrm{mean\_abs\_fd} = \frac{1}{N-1} \sum_{i=1}^{N-1} \left| x_{i+1} - x_i \right|$$

$$\mathrm{mean\_abs\_sd} = \frac{1}{N-2} \sum_{i=1}^{N-2} \left| x_{i+2} - x_i \right|$$

where $x_i$ denotes a signal sample and $N$ is the number of samples. These features are approximations of the first and second derivatives respectively and therefore indicate fast changes in the recorded biosignals.
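A minimal sketch of the two difference features follows. It assumes that the first difference is taken between consecutive samples and the second difference between samples two positions apart, a convention common in the affective computing literature; this is an assumption rather than a detail confirmed by the text.

```python
def mean_abs_fd(x):
    """Mean absolute first difference: average of |x[i+1] - x[i]|."""
    n = len(x)
    return sum(abs(x[i + 1] - x[i]) for i in range(n - 1)) / (n - 1)


def mean_abs_sd(x):
    """Mean absolute second difference (assumed lag-2 form): average of |x[i+2] - x[i]|."""
    n = len(x)
    return sum(abs(x[i + 2] - x[i]) for i in range(n - 2)) / (n - 2)


samples = [1, 3, 2, 5, 4]          # toy signal, not real data
print(mean_abs_fd(samples))        # (2 + 1 + 3 + 1) / 4
print(mean_abs_sd(samples))        # (1 + 2 + 2) / 3
```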
They compute vectors of the mean rise duration and the standard deviation over time.
It calculates vectors of the heart, respiration and EDA rate over time.
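One simple way to turn detected events (heart beats, breaths or phasic EDA responses) into a rate is to invert the mean interval between them. The sketch below assumes the event times have already been detected by some earlier step; both the function name and this estimator are illustrative and are not taken from the text.

```python
def rate_per_minute(event_times_s):
    """Events per minute, from the mean interval between consecutive event times (seconds)."""
    intervals = [later - earlier
                 for earlier, later in zip(event_times_s, event_times_s[1:])]
    mean_interval = sum(intervals) / len(intervals)
    return 60.0 / mean_interval


# Toy example: beats detected every 0.8 s.
beat_times = [0.0, 0.8, 1.6, 2.4]
heart_rate = rate_per_minute(beat_times)
```

Evaluating the function over successive time windows would yield a vector of rates over time.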
The facial animation module models the deformation of skin tissue according to a 3-layer model, consisting of skull, muscle and skin layers. Each layer consists of a number of nodes, which are connected with neighbouring nodes of the same layer and nodes in the layers above/below. Each node represents a mass and each link between nodes is modelled as a spring.
The features of the EMG signals, as extracted by the Feature Extraction Module, are used to estimate the contraction of the subject's monitored muscles. The outcome of this procedure is the quantification of muscle contraction for the sixteen muscles being monitored.
The contraction level drives the muscle model, to calculate the new position of muscle-nodes. The muscle model is simulating linear and sphincter muscles, which are the kinds of muscles involved in AUBADE.
Numerical methods, through the attachment of muscle nodes in the face's geometry, solve the mathematical model of the mass-spring network, given the new position of the muscle nodes.
The displacement of each node of the skin mesh is then applied to the face's geometry, as calculated by the mathematical model in the previous step.
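The procedure above can be illustrated with a very small mass-spring network. The sketch below is a simplified, quasi-static relaxation in two dimensions: node masses are ignored and free nodes are simply moved along the net spring force until they come to rest. The node names, stiffness, step size and iteration count are all hypothetical, and this is not the numerical method used in AUBADE.

```python
import math


def relax(positions, fixed, springs, stiffness=1.0, step=0.1, iterations=2000):
    """Move free nodes along the net spring force until approximately at rest.

    positions: dict node -> (x, y); fixed: set of nodes held in place;
    springs: list of (node_a, node_b, rest_length).
    """
    pos = {name: list(p) for name, p in positions.items()}
    for _ in range(iterations):
        force = {name: [0.0, 0.0] for name in pos}
        for a, b, rest in springs:
            dx = pos[b][0] - pos[a][0]
            dy = pos[b][1] - pos[a][1]
            length = math.hypot(dx, dy)
            if length == 0.0:
                continue
            magnitude = stiffness * (length - rest)  # Hooke's law
            fx = magnitude * dx / length
            fy = magnitude * dy / length
            force[a][0] += fx
            force[a][1] += fy
            force[b][0] -= fx
            force[b][1] -= fy
        for name in pos:
            if name not in fixed:
                pos[name][0] += step * force[name][0]
                pos[name][1] += step * force[name][1]
    return {name: tuple(p) for name, p in pos.items()}


# The muscle node has been displaced from (0, 1) to (0.3, 1) by the muscle model;
# the skin node is free and follows through its spring.
nodes = {"skull": (0.0, 0.0), "muscle": (0.3, 1.0), "skin": (0.0, 2.0)}
springs = [("skull", "muscle", 1.0), ("muscle", "skin", 1.0)]
result = relax(nodes, fixed={"skull", "muscle"}, springs=springs)
```

The resulting displacement of the skin node is what would then be applied to the face geometry in the final step.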
The generation of muscle force is computed by using integrated EMG as a measure of muscle activity, as follows:

$$F = k_f \, S \, E$$

where $S$ is the muscle cross-sectional area, $E$ is the integrated EMG level normalised to a range between 0 (mean of baseline muscle activity) and 1 (maximum activity recorded, including a series of "maximal" facial gestures), and $k_f = 2500$ dyne/cm² is a scaling coefficient.
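A minimal sketch of this computation is given below. It assumes that the force is the product of the scaling coefficient, the cross-sectional area and the normalised EMG level, which is the form suggested by the units of the quantities described; the normalisation helper and the example values are hypothetical.

```python
K_F_DYNE_PER_CM2 = 2500.0  # scaling coefficient quoted in the text


def normalise_emg(integrated_emg, baseline_mean, maximum):
    """Map integrated EMG to [0, 1]: 0 at the baseline mean, 1 at the recorded maximum."""
    level = (integrated_emg - baseline_mean) / (maximum - baseline_mean)
    return min(1.0, max(0.0, level))


def muscle_force_dyne(cross_section_cm2, emg_level):
    """Assumed product form: force = k_f * S * E."""
    return K_F_DYNE_PER_CM2 * cross_section_cm2 * emg_level


# Toy values: baseline 10, maximum 50, current integrated EMG 30, area 0.5 cm^2.
level = normalise_emg(30.0, baseline_mean=10.0, maximum=50.0)
force = muscle_force_dyne(0.5, level)
```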
Although initially developed for binary classification problems, SVMs can be adapted to deal with multi-class problems using the one-against-one method. This method constructs k(k-1)/2 classifiers (where k is the number of classes), each of which is trained using data from two classes. Although other methods for multi-class SVMs exist, the above-mentioned approach has been chosen because of its low training time and comparable performance.
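The one-against-one scheme can be sketched as follows for the five emotional classes. The pairwise classifiers here are stand-in functions rather than trained SVMs, and combining them by majority vote is an assumption, since the text does not state how the pairwise decisions are merged.

```python
from collections import Counter
from itertools import combinations

CLASSES = ["high stress", "low stress", "disappointment", "euphoria", "neutral"]


def class_pairs(classes):
    """All unordered class pairs: k(k-1)/2 of them."""
    return list(combinations(classes, 2))


def one_against_one_predict(sample, pairwise_classifiers):
    """Each pairwise classifier votes for one of its two classes; the most voted class wins."""
    votes = Counter(clf(sample) for clf in pairwise_classifiers.values())
    return votes.most_common(1)[0][0]


# Stand-in classifiers (hypothetical): any pair containing "euphoria" votes for it,
# every other pair votes for its first class.
toy_classifiers = {
    pair: (lambda sample, p=pair: "euphoria" if "euphoria" in p else p[0])
    for pair in class_pairs(CLASSES)
}
prediction = one_against_one_predict([0.1, 0.2], toy_classifiers)
```

With k = 5 classes this gives 10 pairwise classifiers.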
The system's databases store the acquired raw signals which are ranked per user, per date, per event etc. They can be recalled any time from this database and can be analysed by specialists and researchers who are able to draw statistical and other information. The databases also store the medical history of the subjects as well as their facial animation videos.
After the extraction process, in which the above-mentioned feature extraction algorithms are applied, vectors of the desired features are formed for each type of signal. A dataset is created containing the vector of extracted features along with the expert's annotation for every period of 10 s. This time window is a significant factor for the output of the AUBADE intelligence module, since it determines how often it will provide updates about the emotional state of a user. The objective of a real-time or near-real-time emotion classifier is first to recognize the emotional state of the user as correctly as possible (high classification rate), and second to recognize it as soon as possible (high sensitivity). The former suggests a large window size, to minimize variance in the features within a class. The latter, by contrast, suggests a small window size. The 10-second window has been identified as a suitable compromise between these two requirements, based on the acknowledgment that there is a time delay between the instant at which the subject experiences an emotion and the corresponding changes in the selected biosignals.
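The segmentation into 10 s periods can be sketched as below. The use of non-overlapping windows, the dropping of a trailing partial window and the sampling rate in the example are assumptions; the text only fixes the window length.

```python
def split_into_windows(samples, sample_rate_hz, window_s=10):
    """Split a signal into consecutive, non-overlapping windows; a trailing partial window is dropped."""
    size = int(sample_rate_hz * window_s)
    return [samples[start:start + size]
            for start in range(0, len(samples) - size + 1, size)]


# Toy example: 35 s of a signal sampled at 4 Hz gives three full 10 s windows.
signal = list(range(140))
windows = split_into_windows(signal, sample_rate_hz=4)
```

Each window would then be passed through the feature extraction step and paired with the expert's annotation for that period.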
The system has been validated using data obtained from four drivers in simulated race conditions. An experienced psychologist supervised the whole procedure and annotated each driver's emotional state every 10 s. The emotional classes identified were high stress, low stress, disappointment, euphoria and neutral face. The extracted vector of features along with the expert's annotation for every period of 10 s constituted the dataset for the classifier. The classification into predefined emotional classes was achieved using SVM with RBF kernel.
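For reference, the RBF (Gaussian) kernel used by such a classifier measures similarity between two feature vectors as exp(-gamma * ||u - v||^2). The sketch below shows only the kernel function; the value of gamma is a hypothetical hyperparameter, as the text does not report the one used.

```python
import math


def rbf_kernel(u, v, gamma):
    """RBF kernel: exp(-gamma * squared Euclidean distance between u and v)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * squared_distance)


same = rbf_kernel([1.0, 2.0], [1.0, 2.0], gamma=0.5)      # identical vectors
apart = rbf_kernel([0.0, 0.0], [1.0, 1.0], gamma=0.5)     # squared distance 2
```

Identical vectors give a kernel value of 1, and the value decays towards 0 as the vectors move apart.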
The AUBADE system recognizes and estimates basic emotions in real-time, in the form of a "diagnosis". AUBADE is a multifunctional system that can be utilized in many different ways and in multiple application fields.
The system's clinical application is based on the ability to support clinical diagnosis related to all the pathologies in which the patient's capability to feel and express emotions is limited or totally absent. In those cases, doctors need to know the physiological condition of their patients. This is achieved by recording the expressions of the patient's face. Thus, muscle spasms as well as skin conductivity measurements are of key importance. In the medical domain, the system is applied in the following cases:
In general, patients affected by Parkinson's disease lose their capability to express emotions and become inexpressive. AUBADE will be used on patients affected by Parkinson's disease at different stages of disease development (classified using the Unified Parkinson's Disease Rating Scale), in order to assess the capability to express emotions.
Stroke deeply impacts emotional behaviour, and stroke survivors often show inappropriate emotions and extreme mood fluctuations. In particular, they may laugh when something isn't funny or cry for no apparent reason. The AUBADE system will then be used to correlate emotions with the stage of disease.
Patients with Huntington's disease show deficits in the recognition of anger and fear, and an especially severe problem with disgust, which is recognized only at chance level. Consequently, some neurologists are investigating whether the same patients are capable of feeling and expressing disgust themselves.
lesions of the supplementary motor area (medial part of the frontal lobe), which lead to contralateral facial paresis, with spontaneous emotional expression more affected than voluntary.
lesions of the motor cortex (also with contralateral facial hemiparesis), which affect voluntary movements but leave intact spontaneous smiling.
frontal lobe lesions, which lead to fewer spontaneous expressions of brow raising, smiling, lip tightening, tongue protrusion, etc. during neuropsychological testing of brain injured subjects.
The effects of biofeedback, used therapeutically for this condition, could be tracked using AUBADE for facial expression analysis.
AUBADE will be used mainly to detect the absence of a specific emotion (here, a basic emotion according to the FACS methodology), the absence of every type of emotion (that is, a neutral face regardless of the stimulus provided), or the presence of a "wrong" (unexpected) emotion in response to a particular stimulus.
In the car racing domain, AUBADE will be a useful tool for racing mechanics, because they will be able to monitor the emotional state of the drivers. Moreover, the car's setup and development will be based not only on subjective questionnaires filled in by the driver, but also on the driver's emotional state (fear, stress level), which correlates directly with the car's performance. Finally, it may reduce accidents in car racing. Emotions and our psychological condition generally affect our behavior and reactions. Thus, if an emotion is detected that may affect the behavior of the user, the observer will be able to provide him with additional advice and guidance, preventing a reaction that could have serious consequences.
AUBADE's classification accuracy into five predefined emotional classes is 86.0%. It must be noted that the above results, although promising, are only indicative. The system will be extensively tested and evaluated on car racing drivers of Maserati, following all relevant Fédération Internationale de l'Automobile (FIA) regulations and other European ethical directives in relation to privacy of personal data and secure transfer of medical information.
A novel system that automatically monitors and classifies the psychological condition of human subjects from a set of emotions by applying pattern recognition techniques is presented. AUBADE estimates the emotional state of human subjects by classifying vectors of features extracted from: facial Electromyogram, Respiration, Electrodermal Activity and Electrocardiogram. It is designed to be applicable to persons operating under extreme stress conditions, such as car racing drivers. In the medical field, AUBADE may be effectively utilized for patients suffering from neurological and psychological disorders.
The usual way to assess human emotion is by employing advanced image-processing techniques in order to extract the facial characteristics. In our case, it is very difficult to apply image-processing techniques, since for safety reasons the users wear a mask and, over it, a helmet. The proposed system implements an alternative method for recording the facial expressions of the subject. Instead of using image-processing techniques, AUBADE processes the signals of surface EMG sensors placed on the fireproof mask that the users already wear.
A computational method for emotion recognition utilizing an SVM classifier is introduced. The method appears to have high performance both in terms of accuracy and computational efficiency. Because emotions vary from person to person, the system must be trained using a variety of subjects and then tested for its performance.
The work is partly funded by the European Commission; project AUBADE (IST – 2002 – 507605).
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.