Opinion Mining for App Reviews: Identifying and Prioritizing Emerging Issues for Software Maintenance and Evolution

Bibliographic details
Year of defense: 2023
Main author: Vitor Mesaque Alves de Lima
Advisor: Ricardo Marcondes Marcacini
Defense committee: Not provided by the institution
Document type: Thesis
Access type: Open access
Language: por
Defending institution: Fundação Universidade Federal de Mato Grosso do Sul
Graduate program: Not provided by the institution
Department: Not provided by the institution
Country: Brazil
Keywords in Portuguese:
Access link: https://repositorio.ufms.br/handle/123456789/6373
Abstract: Opinion mining for app reviews aims to analyze user comments on app stores to support software engineering activities, primarily software maintenance and evolution. One of the main challenges in maintaining software quality is promptly identifying emerging issues, such as bugs. However, manually analyzing these comments is challenging due to the large amount of textual data. Methods based on machine learning have been employed to automate opinion mining and address this issue. While recent methods have achieved promising results in extracting and categorizing issues from users' opinions, existing studies mainly focus on assisting software engineers in exploring users' historical behavior regarding app functionalities and do not explore mechanisms for trend detection and risk classification of emerging issues. Furthermore, these studies do not cover the entire issue analysis process through an unsupervised approach. This doctoral project advances the state of the art in opinion mining for app reviews by proposing a fully automated issue analysis approach to identify, prioritize, and monitor the risk of emerging issues. Our proposal introduces a two-fold approach that (i) identifies possibly defective software requirements and trains predictive models to anticipate requirements with a higher probability of negative evaluation and (ii) detects issues in reviews, classifies them in a risk matrix with prioritization levels, and monitors their evolution over time. Additionally, we present an approach for constructing the risk matrix from app reviews using recent Large Language Models (LLMs). We introduce an analytical data exploration tool that allows engineers to browse the risk matrix, time series, heat map, issue tree, alerts, and notifications. Our goal is to minimize the time between the occurrence of an issue and its correction by enabling the quick identification of problems.
To evaluate our proposal, we processed over 6.6 million reviews across 20 domains, identifying and ranking the risks associated with nearly 270,000 issues. The results demonstrate the competitiveness of our unsupervised approach compared to existing supervised models. We show that opinions extracted from user reviews provide crucial insights into app issues and risks, which can be identified early to mitigate their impact. Our opinion mining process implements a fully automated issue analysis with risk-based prioritization and temporal monitoring.
id UFMS_ab2248d565569cc886ab9c9b89f559ba
oai_identifier_str oai:repositorio.ufms.br:123456789/6373
network_acronym_str UFMS
network_name_str Repositório Institucional da UFMS
repository_id_str
dc.title.pt_BR.fl_str_mv Opinion Mining for App Reviews: Identifying and Prioritizing Emerging Issues for Software Maintenance and Evolution
author Vitor Mesaque Alves de Lima
author_facet Vitor Mesaque Alves de Lima
author_role author
dc.contributor.advisor1.fl_str_mv Ricardo Marcondes Marcacini
dc.contributor.author.fl_str_mv Vitor Mesaque Alves de Lima
contributor_str_mv Ricardo Marcondes Marcacini
dc.subject.por.fl_str_mv opinion mining, text mining, app reviews, software engineering, software evolution
topic opinion mining, text mining, app reviews, software engineering, software evolution
publishDate 2023
dc.date.accessioned.fl_str_mv 2023-07-20T21:08:24Z
dc.date.available.fl_str_mv 2023-07-20T21:08:24Z
dc.date.issued.fl_str_mv 2023
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/doctoralThesis
format doctoralThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv https://repositorio.ufms.br/handle/123456789/6373
url https://repositorio.ufms.br/handle/123456789/6373
dc.language.iso.fl_str_mv por
language por
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.publisher.none.fl_str_mv Fundação Universidade Federal de Mato Grosso do Sul
dc.publisher.initials.fl_str_mv UFMS
dc.publisher.country.fl_str_mv Brasil
publisher.none.fl_str_mv Fundação Universidade Federal de Mato Grosso do Sul
dc.source.none.fl_str_mv reponame:Repositório Institucional da UFMS
instname:Universidade Federal de Mato Grosso do Sul (UFMS)
instacron:UFMS
instname_str Universidade Federal de Mato Grosso do Sul (UFMS)
instacron_str UFMS
institution UFMS
reponame_str Repositório Institucional da UFMS
collection Repositório Institucional da UFMS
bitstream.url.fl_str_mv https://repositorio.ufms.br/bitstream/123456789/6373/-1/Thesis___Opinion_Mining_for_App_Reviews__Identifying_and_Prioritizing_Emerging_Issues_for_Software_Maintenance_and_Evolution-74.pdf
bitstream.checksum.fl_str_mv 223133bb86b35a4066352f056468d314
bitstream.checksumAlgorithm.fl_str_mv MD5
repository.name.fl_str_mv Repositório Institucional da UFMS - Universidade Federal de Mato Grosso do Sul (UFMS)
repository.mail.fl_str_mv ri.prograd@ufms.br
_version_ 1793867392237961216