Atoms of confusion in the Android Open Source Project: a prevalence study

Bibliographic details
Year of defense: 2024
Main author: Tabosa, Davi Batista
Advisor: Carvalho, Windson Viana de
Defense committee: Not provided by the institution
Document type: Master's thesis
Access type: Open access
Language: eng
Defending institution: Not provided by the institution
Graduate program: Not provided by the institution
Department: Not provided by the institution
Country: Not provided by the institution
CNPq knowledge area: CNPQ::CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO
Access link: http://repositorio.ufc.br/handle/riufc/80257
Abstract: The Android Open Source Project (AOSP) is the open-source project responsible for the Android operating system. In the AOSP, developers collaborate to add features, fix bugs, and improve the performance of a code base of more than 14 million lines written in several programming languages (e.g., C, Java, and Kotlin). This code must meet minimum quality requirements to facilitate its maintenance and evolution. However, latent software quality problems can persist in such complex projects, leading to maintainability and evolution issues. One example of a latent problem is Atoms of Confusion (ACs), small indivisible code fragments that cause comprehension difficulties. To shed light on this matter, we conducted an empirical study of the AOSP with two goals: (i) to analyze the prevalence and frequency of ACs in this project, and (ii) to relate the presence of ACs to the Chidamber & Kemerer (CK) metrics suite. We found that 331 of the 370 AOSP repositories analyzed contain at least one atom. Logic as Control Flow is the most frequent and prevalent atom, with more than 110,000 occurrences, and it is present in 96% of the repositories that contain atoms. We also observed that the presence of ACs correlates positively with some CK metrics, such as WMC (Weighted Methods per Class) and RFC (Response for a Class), as well as with the number of LOC (Lines of Code).
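As a rough illustration of the Logic as Control Flow atom named in the abstract, the sketch below uses Python rather than the C, Java, or Kotlin code the study analyzes. The function names and the logged message are hypothetical and are not taken from the thesis. In this atom, a short-circuit logical operator decides whether a call runs, where an explicit if statement would say the same thing more plainly.

```python
def with_atom(value, log):
    # Logic as Control Flow: 'or' short-circuits, so log.append runs
    # only when 'value > 0' is false.
    value > 0 or log.append("non-positive")
    return log


def clarified(value, log):
    # Same behaviour, with the branch written out explicitly.
    if not value > 0:
        log.append("non-positive")
    return log
```

Both functions leave the list untouched for a positive value and append one entry otherwise; only the second makes that branching visible at a glance.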
id UFC-7_b8671666f76520ca4070dd3614899807
oai_identifier_str oai:repositorio.ufc.br:riufc/80257
network_acronym_str UFC-7
network_name_str Repositório Institucional da Universidade Federal do Ceará (UFC)
repository_id_str
spelling (concatenated search field that repeats the contributors, dates, citation, abstract in English and Portuguese, subjects, and bitstream data given elsewhere in this record; only the items that appear nowhere else are listed here)
Contributor identifiers (not attributed to individual contributors in the record): https://orcid.org/0009-0002-9244-2449, http://lattes.cnpq.br/0975174093759383, https://orcid.org/0000-0002-8627-0823, http://lattes.cnpq.br/1744732999336375, https://orcid.org/0000-0001-5402-8744, http://lattes.cnpq.br/0656977742590515
Files: 2024_dis_dbtabosa.pdf (application/pdf, 2381841 bytes); license.txt (text/plain; charset=utf-8, 1748 bytes)
OAI request URL: http://www.repositorio.ufc.br/ri-oai/request
Record last updated: 2025-04-02T16:47:48
dc.title.pt_BR.fl_str_mv Atoms of confusion in the Android Open Source Project: a prevalence study
author Tabosa, Davi Batista
author_facet Tabosa, Davi Batista
author_role author
dc.contributor.co-advisor.none.fl_str_mv Rocha, Lincoln Souza
dc.contributor.author.fl_str_mv Tabosa, Davi Batista
dc.contributor.advisor1.fl_str_mv Carvalho, Windson Viana de
contributor_str_mv Carvalho, Windson Viana de
dc.subject.cnpq.fl_str_mv CNPQ::CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO
dc.subject.ptbr.pt_BR.fl_str_mv Compreensão de programa (Program comprehension)
Átomos de confusão (Atoms of confusion)
Estudo empírico (Empirical study)
Mineração de dados (Data mining)
dc.subject.en.pt_BR.fl_str_mv Program comprehension
Atoms of confusion
Empirical study
Mining software repositories
publishDate 2024
dc.date.issued.fl_str_mv 2024
dc.date.accessioned.fl_str_mv 2025-04-02T16:47:41Z
dc.date.available.fl_str_mv 2025-04-02T16:47:41Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/masterThesis
format masterThesis
status_str publishedVersion
dc.identifier.citation.fl_str_mv TABOSA, Davi Batista. Atoms of confusion in the Android Open Source Project: a prevalence study. 2025. 60 leaves. Dissertation (Master's in Computer Science) - Universidade Federal do Ceará, Fortaleza, 2024.
dc.identifier.uri.fl_str_mv http://repositorio.ufc.br/handle/riufc/80257
dc.language.iso.fl_str_mv eng
language eng
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.source.none.fl_str_mv reponame:Repositório Institucional da Universidade Federal do Ceará (UFC)
instname:Universidade Federal do Ceará (UFC)
instacron:UFC
instname_str Universidade Federal do Ceará (UFC)
instacron_str UFC
institution UFC
reponame_str Repositório Institucional da Universidade Federal do Ceará (UFC)
collection Repositório Institucional da Universidade Federal do Ceará (UFC)
bitstream.url.fl_str_mv http://repositorio.ufc.br/bitstream/riufc/80257/1/2024_dis_dbtabosa.pdf
http://repositorio.ufc.br/bitstream/riufc/80257/2/license.txt
bitstream.checksum.fl_str_mv 00034d42bd3b088fb5df48b797469afe
8a4605be74aa9ea9d79846c1fba20a33
bitstream.checksumAlgorithm.fl_str_mv MD5
MD5
repository.name.fl_str_mv Repositório Institucional da Universidade Federal do Ceará (UFC) - Universidade Federal do Ceará (UFC)
repository.mail.fl_str_mv bu@ufc.br || repositorio@ufc.br
_version_ 1847793265402707968