Modelo com base na lógica fuzzy para o reconhecimento de diferentes estados emocionais a partir da voz
| Year of defense: | 2025 |
|---|---|
| Lead author: | Aguiar, Alexandra Christine Leite |
| Advisor: | |
| Defense committee: | |
| Document type: | Thesis |
| Access type: | Open access |
| Language: | por |
| Defending institution: | Universidade Federal da Paraíba; Brasil; Ciências Exatas e da Saúde; Programa de Pós-Graduação em Modelos de Decisão e Saúde; UFPB |
| Graduate program: | Not informed by the institution |
| Department: | Not informed by the institution |
| Country: | Not informed by the institution |
| Keywords in Portuguese: | Inteligência artificial - Voz; Reconhecimento de voz; Emoções expressas; Processamento de voz; Lógica fuzzy |
| Access link: | https://repositorio.ufpb.br/jspui/handle/123456789/37244 |
| Abstract: | INTRODUCTION: Voice is produced by a complex neurophysiological process that involves several structures and systems of the body, and it can be influenced by the individual's emotional state and personality traits. A model built on the voice signal under different emotions can serve as a basis for designing and developing robust systems for automated emotion recognition. Defining the acoustic characteristics specific to each emotional state may allow emotions to be separated effectively through voice pattern recognition techniques. OBJECTIVE: To develop a fuzzy logic-based model capable of recognizing emotions from the voices of Brazilian Portuguese speakers. METHOD: This is a technological, descriptive, observational, cross-sectional study that used secondary data from the Brazilian Voice Bank in Variations of Emotions (EmoVox-BR), a data set of 182 sound signals associated with the basic emotions (joy, fear, sadness, anger, surprise, disgust) and the neutral state, produced by 26 professional actors and actors in training. The following were extracted from the EmoVox-BR data: acoustic parameters such as fundamental frequency (fo), jitter, and shimmer; the glottal noise measures Glottal to Noise Excitation Ratio (GNE) and Harmonics-to-Noise Ratio (HNR); the cepstral measures Cepstral Peak Prominence Smoothed (CPPS) and Mel-Frequency Cepstral Coefficients (MFCC); and the acoustic-prosodic parameters of fo, duration, and intensity. The emotion recognition model was developed from these parameters. RESULTS: The study presented an innovative fuzzy logic-based model for recognizing emotional states from voice that integrates carefully selected acoustic and prosodic parameters. The model uses 18 input and 7 output parameters, and 23 fuzzy rules were implemented to discriminate the seven emotional categories. It achieved an accuracy of 89.01% and a Kappa coefficient of 87.18%, with sensitivity ranging from 76.92% for joy to 100% for sadness and the neutral state. Specificity was higher than 93% in all categories, indicating a high capacity for emotional differentiation. Overlapping variables, such as some acoustic and mel-cepstral parameters, were removed to simplify the model and refine its performance. Besides improving simplicity and efficiency in emotion recognition, the model was more robust than conventional machine learning algorithms such as Random Forest and Kernel SVM. CONCLUSION: The emotion recognition model, developed with fuzzy logic from the human voice, achieved high accuracy and precision in differentiating emotions. The results point to promising applications in affective computing and interactive technologies. The approach proved effective at capturing emotional nuances and can be adapted to different contexts to make human-machine interactions more empathetic and personalized. |
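
The accuracy, Kappa, and lowest sensitivity reported in the abstract can be cross-checked against each other with a few lines of arithmetic. The sketch below is only a consistency check, not the thesis's evaluation code. It assumes that the 182 signals are split evenly across the 7 emotions (26 each, which matches 26 actors) and that 162 signals were classified correctly; neither figure is stated in this record, and both are inferred from the reported percentages. The function and variable names are illustrative.

```python
# Consistency check for the metrics reported in the abstract.
# ASSUMPTIONS (not stated in the record): the 182 signals are split evenly
# across the 7 emotions, and 162 of them were classified correctly.

N_SIGNALS = 182   # from the abstract
N_CLASSES = 7     # from the abstract
N_CORRECT = 162   # assumption, inferred from the reported accuracy


def accuracy(n_correct: int, n_total: int) -> float:
    """Fraction of signals assigned to the correct emotion."""
    return n_correct / n_total


def kappa_balanced(p_observed: float, n_classes: int) -> float:
    """Cohen's kappa when every true class has the same number of samples.

    With equal true-class proportions (1 / n_classes each), the chance
    agreement is p_e = sum_k (1 / n_classes) * q_k = 1 / n_classes,
    whatever the predicted proportions q_k are, because the q_k sum to 1.
    """
    p_expected = 1.0 / n_classes
    return (p_observed - p_expected) / (1.0 - p_expected)


per_class = N_SIGNALS // N_CLASSES            # samples per emotion
acc = accuracy(N_CORRECT, N_SIGNALS)
kappa = kappa_balanced(acc, N_CLASSES)
lowest_sensitivity = 20 / per_class           # assumption: 20 joy samples correct

print(f"samples per class:  {per_class}")
print(f"accuracy:           {acc:.2%}")
print(f"kappa:              {kappa:.2%}")
print(f"lowest sensitivity: {lowest_sensitivity:.2%}")
```

Under these assumptions the script reproduces the reported 89.01% accuracy, 87.18% Kappa, and 76.92% sensitivity for joy, which suggests the published figures are mutually consistent with a balanced design of 26 samples per emotion.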
| id |
UFPB_eb81f07ce7d98cb44850ae9a21023ff3 |
|---|---|
| oai_identifier_str |
oai:repositorio.ufpb.br:123456789/37244 |
| network_acronym_str |
UFPB |
| network_name_str |
Biblioteca Digital de Teses e Dissertações da UFPB |
| repository_id_str |
|
| dc.title.none.fl_str_mv |
Modelo com base na lógica fuzzy para o reconhecimento de diferentes estados emocionais a partir da voz |
| author |
Aguiar, Alexandra Christine Leite |
| author_facet |
Aguiar, Alexandra Christine Leite |
| author_role |
author |
| dc.contributor.none.fl_str_mv |
Almeida, Anna Alice Figueirêdo de (http://lattes.cnpq.br/8539341671152883); Moraes, Ronei Marcos de (http://lattes.cnpq.br/7925449690046513); Constantini, Ana Carolina (http://lattes.cnpq.br/2219719644820133); Santiago, Regivan Hugo Nunes (http://lattes.cnpq.br/7536988783793885); Machado, Liliane dos Santos (http://lattes.cnpq.br/0240551533292579); Lopes, Leonardo Wanderley (http://lattes.cnpq.br/0982550255078545) |
| dc.contributor.author.fl_str_mv |
Aguiar, Alexandra Christine Leite |
| dc.subject.por.fl_str_mv |
Inteligência artificial - Voz; Reconhecimento de voz; Emoções expressas; Processamento de voz; Lógica fuzzy; Voice recognition; Expressed emotions; Voice processing; Artificial intelligence; Fuzzy logic; CNPQ::CIENCIAS DA SAUDE::SAUDE COLETIVA |
| publishDate |
2025 |
| dc.date.none.fl_str_mv |
2025-06-18 2025-02-20 2026-01-04T19:05:49Z 2026-01-04T19:05:49Z |
| dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
| dc.type.driver.fl_str_mv |
info:eu-repo/semantics/doctoralThesis |
| format |
doctoralThesis |
| status_str |
publishedVersion |
| dc.identifier.uri.fl_str_mv |
https://repositorio.ufpb.br/jspui/handle/123456789/37244 |
| dc.language.iso.fl_str_mv |
por |
| language |
por |
| dc.rights.driver.fl_str_mv |
Attribution-NoDerivs 3.0 Brazil http://creativecommons.org/licenses/by-nd/3.0/br/ info:eu-repo/semantics/openAccess |
| rights_invalid_str_mv |
Attribution-NoDerivs 3.0 Brazil http://creativecommons.org/licenses/by-nd/3.0/br/ |
| eu_rights_str_mv |
openAccess |
| dc.publisher.none.fl_str_mv |
Universidade Federal da Paraíba Brasil Ciências Exatas e da Saúde Programa de Pós-Graduação em Modelos de Decisão e Saúde UFPB |
| dc.source.none.fl_str_mv |
reponame:Biblioteca Digital de Teses e Dissertações da UFPB instname:Universidade Federal da Paraíba (UFPB) instacron:UFPB |
| instname_str |
Universidade Federal da Paraíba (UFPB) |
| instacron_str |
UFPB |
| institution |
UFPB |
| reponame_str |
Biblioteca Digital de Teses e Dissertações da UFPB |
| collection |
Biblioteca Digital de Teses e Dissertações da UFPB |
| repository.name.fl_str_mv |
Biblioteca Digital de Teses e Dissertações da UFPB - Universidade Federal da Paraíba (UFPB) |
| repository.mail.fl_str_mv |
diretoria@ufpb.br|| bdtd@biblioteca.ufpb.br |
| _version_ |
1854304289790361600 |