Natural Language Processing
One of the principal applications of Artificial Intelligence (AI) in psychology is Natural Language Processing (NLP), an AI technology that can understand, interpret and manipulate human language. According to Rong et al. (2020), once data has been collected for behavioral analysis, NLP techniques apply AI to it through embedded machine learning.
Studies such as Hirschberg and Manning (2015) point out that AI helps us to better understand language through high-performance tools and methods. Applied to the collected data, these new methods make it possible to analyze speech and identify its syntax, semantic information and context.
When language is understood as a method of communication, and communication as a qualitative source, it is hard to imagine that language or communication can be measured with quantitative techniques. Until now, communication has been treated as part of the social sciences, where it is usually measured with qualitative techniques, that is, with words and descriptions. Numbers, by contrast, can be measured in a completely quantitative way (Taylor and Bogdan, 1987).
As Steckler et al. (1992) explain, this distinction has led researchers to split into qualitative and quantitative camps when selecting techniques for different types of data, choosing one or the other according to the field they work in. Despite this, a major debate has broken out in the social sciences applied to the health field, as researchers in these areas consider that their work cannot be framed solely as qualitative research, mainly because of its lack of validity.
Qualitative research methods have been used for decades in diverse disciplines of the social sciences. In psychology, for example, they have been used to analyze the words spoken by patients who attend therapy. Nevertheless, these methods require a great deal of work and effort, as well as a large amount of personnel and material resources.
At Erudit AI we analyze text through qualitative methods: our researchers read the data, assign a label to each item and train the model, with the aim of achieving a deeper text analysis. Thanks to NLP, this arduous task is automated.
Studies such as Gutterman et al. (2018) establish that the combined techniques that make up NLP are necessary for faster qualitative coding, and that they also serve as a basis for validating a qualitative methodology.
Escrivá, Peyró, Vayá, Montell and Fabra (2020) carried out a study of AI and NLP and established these as a novel area both in mental health and in the use of training analysis.
Erudit AI's Methodologies
In line with these previous studies and following the same methodology, Erudit AI has specialized in NLP, and in particular in semantic analysis, to understand natural language by applying AI. In agreement with these authors, at Erudit AI we have confirmed that, when semantic analysis is applied to text previously analyzed through qualitative methods, it reaches the same conclusions as the qualitative method but also detects issues in the same text that the qualitative method would have missed. In addition, semantic analysis inspects a larger amount of text in a shorter period of time.
The results of different studies confirm that there are similarities and differences between the traditional qualitative method and semantic analysis with regard to mental health and communication. These authors therefore maintain that applying AI to detect aspects of mental health is a novel area with great potential that is growing all around the world.
Erudit AI's applications are based on studies which confirm that AI can be used to great benefit in the mental health field through NLP. In our specific case, we use semantic analysis to interpret, understand and manipulate human language, with the objective of automating this work and making the Neural Networks (AI) responsible for detecting employees' mental health problems.
In summary, this changes the paradigm: what has until now been done through qualitative methods, at a great cost in time, resources and materials, can now be done in an automated way through quantitative methods (semantic analysis).
Several authors, such as Maslach and Jackson (1981) and Beatrice (2020), have concluded that burnout is a psychological syndrome associated with a prolonged response to chronic work-related interpersonal stressors. It is mainly related to depression and anxiety and to symptoms such as physical and mental exhaustion.
Engagement is an attitude towards work and an emotional state in which employees identify with a particular organization and its goals, wish to remain in it and are proud of it. This denotes personal commitment and a psychological state of well-being with respect to the company. Romero-Espínola and Palacini (2020) define engagement as a “positive, satisfying and work-related state of mind, characterized by dedication and absorption”.
Authors such as Cachón-Zagalaz, Lara-Sánchez, Zagalaz-Sánchez, López-Manrique and González de Mesa (2018) believe that engagement is the opposite of burnout. They state that engaged employees have an energetic and affective connection with their work and have the necessary tools and skills to face its demands. Moreover, Maslach and Leiter (1997) consider that engagement is composed of energy, involvement and efficacy, while burnout is composed of three factors: exhaustion, cynicism and low efficacy.
The friction metric is based on the company’s need to prevent burnout and increase engagement. Conflicts of different kinds commonly arise between employees or departments. If you can identify those conflicts, the differences of ideas that led to them and the people involved, you will be able to mediate in tense moments (Osborne and Hammoud, 2017).
With the AI-based semantic analysis always in mind, Erudit AI's research psychologists identify many different words and texts to be subjected to this analysis, and laboriously label them by hand in order to create probabilistic criteria and parameterize the metrics.
On the one hand, the division of the labeling numbers is based on Cantor’s set theory, which states that numbers not only extend to infinity but that some infinities are larger than others. Based on this theory, we assign each word a finite probability of occurrence, which in our case ranges from 0 to 100. We divide this range into 6 levels (0 to 5) to catalog the words that form the text. On the other hand, the value given to each of these words is based on the judgment of the research psychologists, which draws on different theories.
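The division of the 0 to 100 range into six levels can be sketched as follows. This is only an illustration under one assumption: that the levels are equal-width bins. The text does not say where the boundaries between levels fall, and the function name `score_to_level` is hypothetical.

```python
def score_to_level(score: float, n_levels: int = 6, max_score: float = 100.0) -> int:
    """Map a score in [0, max_score] to an integer level in [0, n_levels - 1].

    Assumption: the levels are equal-width bins. The real boundaries used
    by the research psychologists may differ.
    """
    if not 0 <= score <= max_score:
        raise ValueError(f"score must be between 0 and {max_score}")
    # Multiply before dividing so that exact boundaries (e.g. 50) do not
    # suffer from floating-point rounding of the bin width.
    level = int(score * n_levels // max_score)
    # The maximum score sits on the upper edge, so clamp it into the top level.
    return min(level, n_levels - 1)


if __name__ == "__main__":
    for example_score in (0, 17, 50, 100):
        print(example_score, "->", score_to_level(example_score))
```

With equal-width bins each level covers roughly 16.7 points; a different boundary scheme would only change the body of the function, not the way it is used.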
For the burnout metric, they have taken into account the three variables on which the Maslach Burnout Inventory questionnaire is based: emotional exhaustion, depersonalization and low personal fulfillment. The Maslach Burnout Inventory, published in 1981 by Maslach and Jackson, who established criteria for measuring and evaluating burnout, is the classic and most widely used measure of burnout. It is therefore a standardized instrument. The questionnaire uses a Likert-style scale ranging from 0 (never) to 6 (always). According to Gil-Monte, Salanova, Aragón and Schaufeli, this instrument is useful for making a first measurement of burnout among employees, and it allows the results to be compared with a normative sample of the working population in the country being analyzed.
For the engagement metric, we have considered the Utrecht Work Engagement Scale (Schaufeli and Bakker, 2003). This instrument also uses a Likert-style scale, ranging from 0 to 6. It has both an English and a Spanish version.
For the friction metric, we have taken into account the Job Satisfaction Survey described in “Measurement of human service staff satisfaction: Development of the Job Satisfaction Survey” (Spector, 1985).
Accordingly, when Erudit AI’s research psychologists label the data, they manually assign each text a value from 0 to 5 depending on the level of burnout, engagement and friction that it represents. Once the labeling is complete, they train the Neural Networks to agree as closely as possible with human judgment.
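One simple way to check how closely a trained model agrees with the human labels is sketched below. The text does not say which reliability measure is used, so the three measures here (exact agreement, agreement within one level, and mean absolute error) are assumptions, and the function name `agreement` is hypothetical.

```python
from typing import Dict, Sequence


def agreement(human: Sequence[int], model: Sequence[int]) -> Dict[str, float]:
    """Compare human labels and model predictions on the 0-5 scale.

    Assumption: both sequences hold integer levels for the same texts,
    in the same order.
    """
    if len(human) != len(model) or len(human) == 0:
        raise ValueError("both sequences must be non-empty and of equal length")
    n = len(human)
    pairs = list(zip(human, model))
    return {
        # Share of texts where the model gives exactly the human label.
        "exact": sum(h == m for h, m in pairs) / n,
        # Share of texts where the model is at most one level away.
        "within_one": sum(abs(h - m) <= 1 for h, m in pairs) / n,
        # Average distance, in levels, between model and human labels.
        "mean_abs_error": sum(abs(h - m) for h, m in pairs) / n,
    }


if __name__ == "__main__":
    human_labels = [0, 3, 5, 2]  # made-up labels, for illustration only
    model_labels = [0, 4, 5, 0]  # made-up predictions
    print(agreement(human_labels, model_labels))
```

The same comparison can be run separately for the burnout, engagement and friction labels.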
Artificial Intelligence learns, automates and semantically links neighboring words in different ways in order to capture the context of the communication. For this reason, if we only applied a qualitative analysis method without algorithmic learning, we might lose nuances of the text. This combined AI method is known as Data-Driven Analysis (Rodriguez et al., 2011).
Neural Networks do not only analyze words individually; they also take the context into account to interpret the meaning of each word in a sentence. Here is an example to help us understand this:
Neural Networks analyze the words and give each one a value that depends on the words that accompany it. For example, the word “depression”, which is a very meaningful word, will not be evaluated in the same way when it follows “I am suffering from” as when it follows “Help me describe what I have” or “Help me describe what”. Neural Networks can identify in which of these sentences the word gives information about the person and in which it does not.
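The idea that the same word receives a different value depending on the words before it can be illustrated with a toy example. This is a hand-written lookup of context phrases, not a neural network and not Erudit AI's model; the base value, the context weights and the function name `contextual_value` are all hypothetical.

```python
# Hypothetical base value on the 0-5 scale; not taken from any real model.
BASE_VALUE = {"depression": 5.0}

# Hypothetical weights: how strongly the preceding words suggest that the
# speaker is describing his or her own state.
CONTEXT_WEIGHT = {
    "i am suffering from": 1.0,
    "help me describe what": 0.2,
}

NEUTRAL_WEIGHT = 0.5  # assumption: used when the context is not recognized


def contextual_value(sentence: str, word: str) -> float:
    """Return a value for `word` that depends on the text preceding it."""
    text = sentence.lower()
    position = text.find(word)
    if word not in BASE_VALUE or position == -1:
        return 0.0
    preceding = text[:position].strip()
    for context, weight in CONTEXT_WEIGHT.items():
        if preceding.endswith(context):
            return BASE_VALUE[word] * weight
    return BASE_VALUE[word] * NEUTRAL_WEIGHT


if __name__ == "__main__":
    print(contextual_value("I am suffering from depression", "depression"))
    print(contextual_value("Help me describe what depression is", "depression"))
```

A trained network learns this kind of dependence from labeled data instead of relying on a fixed list of phrases, which is why it can handle contexts that nobody wrote down in advance.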
The application of NLP has an important advantage: the researcher or psychologist does not have to interpret the texts; instead, the algorithms generate the results by learning from data.