How our Emotion AI works – Responsibility, Transparency and Clarity

Last updated: August 30, 2023

Dear professional customer,

We want to provide you with transparency and clarity regarding our emotional artificial intelligence system. This document details how it works, the methods for extracting data, information about the sources of data, methods used for evaluating the system, as well as any known limitations or biases. We also provide information on potential risks and benefits of using the emotional AI system, as well as guidelines for safe and effective use.

The technology developed and distributed by Cynny S.p.A. (hereinafter “Cynny”), called “MorphCast® EMOTION AI HTML5 SDK”, and/or the other available products, services and features that integrate it (e.g. MorphCast Web Apps or MorphCast Emotion AI Interactive Video Platform), is artificial intelligence software that processes images from a video stream. It estimates, with varying degrees of probability and accuracy (see Annex A to this document), the emotional state of the framed subject and some other apparent characteristics, such as the cut or color of the hair, the apparent age and gender, and facial expressions of emotion, in every case without identifying the subject (hereinafter, the “Software”). For a complete list of the features extracted by the Software, please refer to Annex B to this document.

This technology, which Cynny makes available to its users (qualified as “Business Users” pursuant to the applicable “Terms of Use”, and hereinafter referred to as “Customers”), has been specifically designed and developed so as not to identify the framed subjects. These subjects are the end users of the Software chosen by the Customer (hereinafter, the “End Users”), with whom Cynny has no relationship whatsoever.

The Software can be used autonomously by the Customer or be activated, by the Customer’s choice, in the context of use of other services provided by Cynny, whose terms of use and privacy policies are available at this link: mission page. As regards the registration data of the Customers, the relative Privacy Policy will also apply, available at this link: Emotion AI – Privacy Policy.


How it works

The Software works as a JavaScript library embedded in a web page, usually belonging to the Customer’s domain, and runs in the browser of the End User’s device. As indicated above, the End User has a relationship exclusively with the Customer (and not with Cynny).

The Software does not process biometric data that allow the unique identification of a subject. It performs only an instantaneous (volatile) processing of personal data while processing images (frames) read from the video stream provided by the End User’s browser (for example from a camera, which is used only as a sensor and not as an image and/or video recorder). Cynny does not carry out any processing of, or control over, such personal data, as they are processed through the Software automatically and directly on the End User’s device.

Accordingly, the use of the Software is subject to the following terms of use, both of which are beyond Cynny’s control: (1) the terms of use of the End User’s browser in which the Software is incorporated and (2) those of any third-party software or API required to allow the web page to function.

Notwithstanding the foregoing, as the supplier of the Software, Cynny (a) warrants that it has adopted all appropriate security measures in the design of the Software, also with regard to the processing of personal data, following a privacy-by-design approach and ensuring adequate security of the Software itself; and (b) makes itself available to collaborate with the Customer (who, if it uses the extracted data autonomously, assumes the role of Data Controller pursuant to EU Regulation 2016/679, hereinafter “GDPR”) for any processing of personal data relating to the End User, showing and explaining all the functions of the Software, in order to enable the Customer to carry out a precise risk analysis with respect to the fundamental rights and freedoms of the persons concerned (including any impact assessments, where necessary and/or appropriate, in the opinion of the Data Controller). It is the Data Controller who must establish the purposes, methods and legal basis of the processing, as well as the time and place (physical or virtual) for the storage of personal data relating to End Users, and who must also provide them with specific online information in accordance with the applicable legislation for the protection of personal data (including the GDPR).

According to the provisions of the Terms of Use of the Software (available here: Terms of Use), the Software is intended solely and exclusively for professional users, with the express exclusion of consumers and, therefore, of any use of the Software for domestic/personal purposes. In any case, whatever the type of processing carried out by the Customer as Data Controller, the consequences deriving from the applicable legislation, including that for the protection of personal data, are exclusively the responsibility of the Data Controller.

It is also specified that the Software operates integrated in a private corporate hardware / software context of the Customer’s End User (on end-point resources, including web pages belonging to the Customer’s domain and therefore under its control) equipped with security necessary to avoid the compromise of the data affected by the processing by the Software. In line with the foregoing, the consequent responsibilities relating to the processing of personal data of End Users by the Customers fall exclusively on the Customer, which therefore undertakes, in accordance with the provisions of the “Indemnification” section of the terms of use of the Software, to fully indemnify Cynny from any prejudicial consequence that may arise in this regard.

As for the emotional data made available to Customers by the Software through the “Data Storage” module and its dashboard, and likewise where the Customer uses the MorphCast Emotion AI Interactive Video Platform service with the “Statistic Dashboard” function activated, such statistical data cannot be considered personal data: they are not associated with natural persons, and they are anonymous and aggregated with daily granularity. Furthermore, they are made available only if the aggregation exceeds a minimum threshold, to ensure that nobody can connect them to a natural person. Therefore, these data are to be considered not subject to privacy laws, including the GDPR.
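
The aggregation rule described above (daily granularity, published only above a minimum threshold) can be illustrated with a minimal sketch. The threshold value, the record shape and the function name are assumptions made for illustration only; this document does not state the actual threshold or how the dashboard is implemented.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical threshold: the real minimum is not stated in this document.
MIN_SESSIONS = 5


def daily_aggregates(records, min_sessions=MIN_SESSIONS):
    """Average a metric per day and suppress days with too few sessions.

    `records` is an iterable of (day, value) pairs that carry no user identifier.
    """
    by_day = defaultdict(list)
    for day, value in records:
        by_day[day].append(value)
    # Publish a day only when enough sessions contribute to its average.
    return {
        day: round(mean(values), 3)
        for day, values in by_day.items()
        if len(values) >= min_sessions
    }
```

In this sketch, a day with fewer contributing sessions than the threshold simply does not appear in the output, so a single session can never be read off the dashboard.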

For these reasons Cynny, to the extent necessary and to the maximum extent permitted by applicable law in accordance with the “Exclusive Remedy and Limitation of Liability” section of the terms of use of the Software, declines all responsibility for the content of the images and/or videos generated by the Customer, including through the use of the Software, as well as collected, stored and processed exclusively by the Data Controller and therefore cannot in any case be responsible for the correctness and/or legitimacy of such contents pursuant to the applicable legislation (including, by way of example, the regulations on privacy, copyright, morality, public order, etc.).


Risk Assessment

Cynny, while providing the Software as a tool that allows the processing of personal data relating to End Users, is not in a position to know (i) how the Software will be integrated and used by its Customers and End Users; nor, even less, (ii) which categories of personal data will be processed. In this regard, Cynny therefore qualifies as a “third party” pursuant to the GDPR, which has not received from the Customer any authorization to process any personal data of the End Users, as this processing is reserved exclusively to the Customer. In any case, as indicated above, Cynny, in compliance with the general principles for the protection of privacy, has addressed the security aspects of the Software, making available to the Customer a Software that integrates the security measures of the host environment against the violation or compromise of the personal data represented by the images read from the camera stream (with the express exclusion of biometric data, which, as indicated above, are in no way processed by the Software).

In fact, the video stream, for example from the camera of the End User’s host device, is made accessible to the Software, at the request of the End User, directly in the browser of that device; such access is allowed only with the authorization and under the control of the End User. The Software then reads the images of the stream in the End User’s browser only if the End User grants access, only while the application is active and only for the purposes declared in the specific privacy policy that the Customer provides to the End User.

Each image of the video stream remains in the volatile memory of the End User’s device (e.g. smartphone, tablet, PC etc.), accessible to the Software within the browser sandbox, only for the time strictly necessary to compute the result, estimated at about 100 ms, after which the image is destroyed and overwritten by the next image. Even the last frame of the stream is destroyed as soon as it is processed; therefore, no image is stored either in the browser of the End User’s device or, even less, within tools under the control of Cynny.

Cynny therefore has no control over these images and can never transmit, archive or share them with others.


Constant commitment to conscious and ethical use of Emotion AI

Cynny is constantly committed to providing all the information needed for better understanding and use of Emotion AI technologies. Check our guidelines and policies for a responsible use of Emotion AI. Cynny has also adopted a code of ethics which you can find here.


Contacts

For any clarifications and/or problems related to the operation of the Software with respect to privacy, please contact [email protected].

Cynny has also appointed a Data Protection Officer, who may be reached at the email address [email protected].




Annex A: transparency on the degree of probability and accuracy

Emotion AI facial recognition is a form of AI designed to analyze facial expressions and identify emotions such as happiness, sadness, anger, and surprise. The technology used by MorphCast combines computer vision with deep convolutional neural network algorithms to analyze images or videos of people’s faces and identify patterns that correspond to specific emotions.

The accuracy of emotion AI facial recognition technology can be affected by a number of factors, including:

– Image or video quality: The technology is more likely to produce accurate results when the images or videos being analyzed are of high quality, with clear and well-lit faces. Factors such as very low resolution, poor lighting, or camera angle can make it more difficult for the algorithm to accurately identify emotions.

– Algorithm: Different algorithms have different levels of accuracy. MorphCast algorithms use deep learning techniques such as convolutional neural networks (CNNs) and are trained on large datasets; such algorithms are considered to be more accurate than those that use traditional machine learning techniques.

– Training data: The accuracy of emotion AI facial recognition technology can also be affected by the quality and diversity of the training data that the algorithm has been exposed to. MorphCast algorithms are trained on a diverse dataset of images and videos from people of different ages, genders, and ethnicities. Algorithms trained in this way are likely to be more accurate than those trained on a more limited dataset.

It is also important to note that emotion recognition technology is still in its early stages, and MorphCast conducts ongoing research to improve its accuracy. Furthermore, there is a degree of bias in these technologies, particularly with regard to race and gender. As a result, it is important to be aware of these biases and take steps to mitigate them. MorphCast has been committed to this since its inception.

On the subject of bias, we cite a recent study:

Intersectionality in emotion signaling and recognition: The influence of gender, ethnicity, and social class.

Abstract: Emotional expressions are a language of social interaction. Guided by recent advances in the study of expression and intersectionality, the present investigation examined how gender, ethnicity, and social class influence the signaling and recognition of 34 states in dynamic full-body expressive behavior. One hundred fifty-five Asian, Latinx, and European Americans expressed 34 emotional states with their full bodies. We then gathered 22,174 individual ratings of these expressions. In keeping with recent studies, people can recognize up to 29 full-body multimodal expressions of emotion. Neither gender nor ethnicity influenced the signaling or recognition of emotion, contrary to hypothesis. Social class, however, did have an influence: in keeping with past studies, lower class individuals proved to be more reliable signalers of emotion, and more reliable judges of full body expressions of emotion. Discussion focused on intersectionality and emotion. (PsycInfo Database Record (c) 2022 APA, all rights reserved).

Emotional expressions

Additionally, many studies have shown that emotion recognition technology is more accurate for certain emotions (e.g. happiness, anger) than others (e.g. sadness, surprise), and there is a degree of variability across different algorithms, which makes it challenging to give a general number for the accuracy of these technologies. Despite this, the current emotion prediction accuracy can be considered between 70% and 80% for almost all algorithms on the market.

In this regard, the accuracy of MorphCast has been evaluated by a university study that compares the accuracy of different software, measured on the six basic emotions of Paul Ekman, with human perception. These are the results of the accuracy study:

By Damien Dupré (Dublin City University), Eva G. Krumhuber (University College London), Dennis Küster (University of Bremen) and Gary McKeown (Queen’s University Belfast), published here: Emotion recognition in humans and machine using posed and spontaneous facial expression. The study states: “Emotional Tracking (SDK v1.0) was developed by the company MorphCast founded in 2013. Emotional Tracking SDK has the distinction of being a javascript engine less than 1MB and works directly on mobile browsers (i.e, without remote server and API processing).”

Comparison of emotion recognition in humans and machine using posed and spontaneous facial expression (general)

Comparison of emotion recognition in humans and machine using posed and spontaneous facial expression (for each emotion)

 



Annex B: type of data extracted

MorphCast SDK has a modular architecture that allows you to load only what you need. Here is a quick description of the available modules:

  • FACE DETECTOR
  • POSE
  • AGE
  • GENDER
  • EMOTIONS
  • AROUSAL VALENCE
  • ATTENTION
  • WISH
  • POSITIVITY
  • ALARMS
  • OTHER FEATURES


FACE DETECTOR

It detects the presence of a face in the field of view of the webcam, or in the input image.

The tracked face is passed to the other modules and then analyzed. When multiple faces are present, the total number of faces is available, but only the main face is analyzed.


POSE

It estimates the head-pose rotation angles expressed in radians as pitch, roll and yaw.

The ZERO point is when a face looks straight at the camera.
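
Because the angles are expressed in radians with zero meaning a face looking straight at the camera, a consumer of this output will often convert them to degrees and test how far the head is turned. The following is a minimal sketch; the dictionary keys, the function names and the 10-degree tolerance are illustrative assumptions, not the SDK's actual output format.

```python
import math


def pose_in_degrees(pose):
    """Convert a {'pitch', 'roll', 'yaw'} mapping from radians to degrees."""
    return {axis: math.degrees(angle) for axis, angle in pose.items()}


def is_facing_camera(pose, tolerance_deg=10.0):
    """Return True when every rotation angle is within the tolerance of zero.

    The tolerance is a hypothetical value chosen for this example.
    """
    return all(abs(angle) <= tolerance_deg for angle in pose_in_degrees(pose).values())
```

For instance, a yaw of 0.5 radians is roughly 28.6 degrees, well outside the example tolerance.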

AGE

It estimates the likely age of the main face with a granularity of years, or within an age group for better numerical stability. 


GENDER

It estimates the most likely gender of the main face, Male or Female. 


EMOTIONS

It estimates the presence and the respective intensities of facial expressions in the form of six core emotions (anger, disgust, fear, happiness, sadness, and surprise) plus the neutral expression, for seven outputs in total, according to the Ekman discrete model.
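
A typical way to consume per-emotion intensities is to pick the strongest one and ignore weak signals. The sketch below assumes a plain mapping from emotion name to intensity in the range 0 to 1; the key names, the function name and the 0.3 cut-off are hypothetical and are not taken from the SDK.

```python
# Names follow the list in this annex; the SDK's own labels may differ.
EMOTIONS = ("anger", "disgust", "fear", "happiness", "sadness", "surprise", "neutral")


def dominant_emotion(intensities, min_intensity=0.3):
    """Return the label with the highest intensity, or None if it is too weak."""
    label, value = max(intensities.items(), key=lambda item: item[1])
    return label if value >= min_intensity else None
```

Returning None below the cut-off avoids reporting an emotion when every intensity is close to zero.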

AROUSAL VALENCE

It estimates the emotional arousal and valence intensity according to the dimensional model of Russell. Arousal is the degree of engagement (positive arousal) or disengagement (negative arousal); valence is the degree of pleasantness (positive valence) or unpleasantness (negative valence).

It outputs an object containing the smoothed probabilities, in the range 0.00 to 1.00, of the 98 emotional affects, computed from the points in the 2D (valence, arousal) emotional space according to the mappings of Scherer and Ahn:

 

Adventurous

Afraid

Alarmed

Ambitious

Amorous

Amused

Angry

Annoyed

Anxious

Apathetic

Aroused

Ashamed

Astonished

At Ease

Attentive

Bellicose

Bitter

Bored

Calm

Compassionate

Conceited

Confident

Conscientious

Contemplative

Contemptuous

Content

Convinced

Courageous

Defiant

Dejected

Delighted

Depressed

Desperate

Despondent

Determined

Disappointed

Discontented

Disgusted

Dissatisfied

Distressed

Distrustful

Doubtful

Droopy

Embarrassed

Enraged

Enthusiastic

Envious

Excited

Expectant

Feel Guilt

Feel Well

Feeling Superior

Friendly

Frustrated

Glad

Gloomy

Happy

Hateful

Hesitant

Hopeful

Hostile

Impatient

Impressed

Indignant

Insulted

Interested

Jealous

Joyous

Languid

Light Hearted

Loathing

Longing

Lusting

Melancholic

Miserable

Passionate

Peaceful

Pensive

Pleased

Polite

Relaxed

Reverent

Sad

Satisfied

Selfconfident

Serene

Serious

Sleepy

Solemn

Startled

Suspicious

Taken Aback

Tense

Tired

Triumphant

Uncomfortable

Wavering

Worried

    

 

Circumplex model of affect (Representation in the 2D emotional space with 98 words)
Representation in the 2D (valence, arousal) emotional space

Circumplex model of affect (Representation in the 2D emotional space with 38 words)
The simplified 38 emotional affects output can be useful in several cases.

Circumplex model of affect (Representation in the 2D emotional space with quadrant only)

An output with the quadrant of the 2D emotional space is also provided. This gives information about the appraisal dimensions: goal conduciveness/obstructiveness and coping potential (control/power).
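
The idea of a quadrant output can be sketched by looking at the signs of valence and arousal. This example assumes both values lie between -1 and 1; the labels, the function name and the small neutral dead zone are illustrative assumptions and do not reproduce the names used by the Software.

```python
def quadrant(valence, arousal, dead_zone=0.05):
    """Classify a (valence, arousal) point into one of four quadrants.

    Points very close to the origin are reported as 'neutral'; the dead-zone
    width is a hypothetical value chosen for this example.
    """
    if abs(valence) < dead_zone and abs(arousal) < dead_zone:
        return "neutral"
    vertical = "high arousal" if arousal >= 0 else "low arousal"
    horizontal = "positive valence" if valence >= 0 else "negative valence"
    return f"{vertical}, {horizontal}"
```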

 

ATTENTION

It estimates the attention level of the user to the screen, considering whether the user’s face is in or out of the field of view of the webcam, head position and other emotional and mood behavior. The speed of attention fall and rise can be properly tuned in an independent manner according to the use case.
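
One common way to obtain independently tunable rise and fall speeds is asymmetric exponential smoothing, where the smoothed level moves toward each new raw sample at one rate when increasing and at another when decreasing. This document does not say how the Software implements the tuning, so the sketch below is only an illustration of the concept; the function name and the default rates are assumptions.

```python
def smooth_attention(raw_samples, rise=0.5, fall=0.1, start=0.0):
    """Smooth raw attention samples (0 to 1) with separate rise and fall rates.

    A higher rate means the smoothed level follows the raw value more quickly.
    """
    level = start
    smoothed = []
    for raw in raw_samples:
        # Use the rise rate when attention increases, the fall rate otherwise.
        rate = rise if raw > level else fall
        level += rate * (raw - level)
        smoothed.append(level)
    return smoothed
```

With a fast rise and a slow fall, a user who glances away briefly keeps a high attention level, while a slow rise and fast fall would penalize even short distractions.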


WISH 

It estimates the value of the MorphCast® Face Wish index. This is a proprietary metric that, considering the interest and sentiment of a customer, summarizes in a holistic manner his or her experience of a particular content or product presented on the screen. This output is very sensitive to variations in expression, which makes it useful for estimating a change of sentiment while the user looks at a product or a scene in a video.

POSITIVITY

It gauges the intensity of arousal and valence based on the 17-degree angle of the circumplex model of affect (Russell). This exclusive metric provides a comprehensive overview of an individual’s positivity, capturing facial expressions.

α = 17° (customizable)

This output can be used for multiple types of detection by modifying the alpha angle on the whole model.
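
The exact formula behind this metric is proprietary and is not given in this document. One plausible reading of an angle-based metric, shown here purely as an assumption, is the projection of the (valence, arousal) point onto an axis rotated by α from the valence axis. The function name and the formula are hypothetical.

```python
import math


def positivity(valence, arousal, alpha_deg=17.0):
    """Project a (valence, arousal) point onto an axis rotated by alpha_deg.

    Assumed formula for illustration only: with alpha = 0 the result equals
    valence, and with alpha = 90 it equals arousal.
    """
    alpha = math.radians(alpha_deg)
    return valence * math.cos(alpha) + arousal * math.sin(alpha)
```

Under this reading, changing the angle rotates the axis of interest across the whole model, which is how a single formula could serve several types of detection.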

ALARMS

Several alarms are output by the AI engine to help developers trigger reactions to possible cheating situations (NO FACE, MORE FACES, LOW ATTENTION…).
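
How such alarms might be derived from the other outputs can be sketched as follows. The alarm names come from the list above; the function name, the inputs and the attention threshold are assumptions for illustration and do not describe the engine's actual logic.

```python
def active_alarms(face_count, attention, low_attention_threshold=0.3):
    """Return the alarms raised for one analysis result.

    `face_count` is the number of detected faces and `attention` a level
    between 0 and 1. The threshold is a hypothetical value.
    """
    alarms = []
    if face_count == 0:
        alarms.append("NO FACE")
    elif face_count > 1:
        alarms.append("MORE FACES")
    # Attention is only meaningful when at least one face is present.
    if face_count > 0 and attention < low_attention_threshold:
        alarms.append("LOW ATTENTION")
    return alarms
```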

OTHER FEATURES

It estimates the presence of the following face features:

Arched Eyebrows

Double Chin

Narrow Eyes

Attractive

Earrings

Necklace

Bags Under Eyes

Eyebrows Bushy

Necktie

Bald

Eyeglasses

No Beard

Bangs

Goatee

Oval Face

Beard 5 O’Clock Shadow

Gray Hair

Pale Skin

Big Lips

Hat

Pointy Nose

Big Nose

Heavy Makeup

Receding Hairline

Black Hair

High Cheekbones

Rosy Cheeks

Blond Hair

Lipstick

Sideburns

Brown Hair

Mouth Slightly Open

Straight Hair

Chubby

Mustache

Wavy Hair


Bibliography