Enhancing the SZZ Algorithm to Deal with Refactoring Changes
| Year of defense: | 2018 |
|---|---|
| Main author: | |
| Advisor: | |
| Examination committee: | |
| Document type: | Thesis |
| Access type: | Open access |
| Language: | por |
| Degree-granting institution: |
Brasil
UFRN PROGRAMA DE PÓS-GRADUAÇÃO EM SISTEMAS E COMPUTAÇÃO |
| Graduate program: |
Not provided by the institution
|
| Department: |
Not provided by the institution
|
| Country: |
Not provided by the institution
|
| Keywords in Portuguese: | |
| Access link: | https://repositorio.ufrn.br/jspui/handle/123456789/26204 |
Abstract: | SZZ was proposed by Śliwerski, Zimmermann, and Zeller (hence the SZZ abbreviation) to identify fix-inducing changes, i.e., changes that are likely to induce bugs. Despite its wide adoption, SZZ still has limitations, which have been reported recently. No existing work broadly surveys how SZZ has been used, extended, and evaluated by the software engineering community, and only a few studies have investigated improvements to SZZ. Therefore, this thesis explores the limitations documented in the SZZ literature and advances the state of the art of SZZ by proposing solutions to some of them. First, we perform a systematic mapping study to determine the current state of the art of the SZZ algorithm. Our results show that the majority of the analyzed studies use SZZ as a foundation for their empirical studies (79%), while only a few propose direct improvements to SZZ (6%) or evaluate it (4%). We further observe that SZZ has several unaddressed limitations, such as the bias related to refactoring changes. Second, we conduct an empirical study to investigate the relationship between refactoring changes and the SZZ results. We use RefDiff, a tool that detects code refactoring with high precision, to analyze an extensive dataset comprising 31,518 issues from ten systems, with 64,855 bug-fix changes and 20,298 fix-inducing changes. We run RefDiff on both the bug-fix and the fix-inducing changes generated by a recent SZZ implementation, meta-change aware SZZ (MA-SZZ). The results indicate a refactoring ratio of 6.5% for fix-inducing changes and 19.9% for bug-fix changes. We incorporate RefDiff into MA-SZZ and propose a refactoring-aware SZZ implementation (RA-SZZ). RA-SZZ reduces the number of lines flagged as fix-inducing changes by MA-SZZ by 20.8%. These results indicate that refactoring can impact the SZZ results.
Using an evaluation framework, we observe that RA-SZZ reduces the disagreement ratio compared to previous implementations; however, our results suggest that the accuracy of SZZ must still be improved. Finally, we evaluate the accuracy of RA-SZZ using a well-accepted dataset, which we validated for evaluating SZZ implementations. We also revisit the known limitations of RA-SZZ and improve the accuracy of the algorithm by integrating a novel refactoring-detection tool, RMiner. After refining RA-SZZ, a median of 44% of the lines flagged as fix-inducing are accurate, compared with only 29% for the MA-SZZ-generated results. We also manually analyzed the RA-SZZ results and observed that there are still refactoring changes (31.17%) and equivalent changes (13.64%) that SZZ does not yet recognize. This result reiterates that detecting refactoring indeed increases the accuracy of SZZ. The results of this thesis contribute to the maturation of SZZ and indicate that the impact of refactoring on SZZ may be even higher if further improvements are made in future studies. |
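
The abstract reports that discounting refactoring reduces the number of lines SZZ flags as fix-inducing. The Python sketch below illustrates one simple way to act on that observation; it is not the thesis's RA-SZZ implementation. It assumes each line touched by a bug-fix has already been traced back to the commit that last modified it (as a blame step would report) and labeled by a refactoring detector. The names `ChangedLine`, `flag_commits`, the file names, and the commit ids are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChangedLine:
    """A line touched by a bug-fix change (hypothetical toy record)."""
    path: str
    line_no: int            # line number in the pre-fix version of the file
    blamed_commit: str      # commit that last modified the line before the fix
    from_refactoring: bool  # True if a refactoring detector attributes the line to a refactoring


def flag_commits(changed_lines, refactoring_aware=False):
    """Group lines by the commit blamed for them.

    With refactoring_aware=True, lines attributed to a refactoring are
    skipped, so they no longer count as evidence against any commit.
    This is a simplification made for this sketch.
    """
    flagged = {}
    for line in changed_lines:
        if refactoring_aware and line.from_refactoring:
            continue
        flagged.setdefault(line.blamed_commit, []).append((line.path, line.line_no))
    return flagged


# Toy input: five lines changed by one bug-fix, two of them refactoring-related.
BUG_FIX_LINES = [
    ChangedLine("Parser.java", 10, "c1", False),
    ChangedLine("Parser.java", 11, "c1", False),
    ChangedLine("Parser.java", 42, "c2", True),
    ChangedLine("Lexer.java", 7, "c3", False),
    ChangedLine("Lexer.java", 8, "c2", True),
]

plain = flag_commits(BUG_FIX_LINES)
aware = flag_commits(BUG_FIX_LINES, refactoring_aware=True)

print("flagged without refactoring filter:", sorted(plain))
print("flagged with refactoring filter:   ", sorted(aware))
```

In this toy input, commit `c2` is blamed only through refactoring-related lines, so the filtered run no longer flags it. Real pipelines obtain the blame and refactoring labels from version-control history and a detection tool rather than from hand-written records.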
| id |
UFRN_1780fa89bb0c1ea4482342d66d2ab2df |
|---|---|
| oai_identifier_str |
oai:repositorio.ufrn.br:123456789/26204 |
| network_acronym_str |
UFRN |
| network_name_str |
Repositório Institucional da UFRN |
| repository_id_str |
|
| spelling |
Enhancing the SZZ Algorithm to Deal with Refactoring Changes; SZZ algorithm; Fix-inducing changes; Bug-fix changes; Refactoring changes; CNPQ::CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO::SISTEMAS DE COMPUTACAO; Brasil; UFRN; PROGRAMA DE PÓS-GRADUAÇÃO EM SISTEMAS E COMPUTAÇÃO; Kulesza, Uira; Costa, Daniel Alencar da; Aranha, Eduardo Henrique da Silva; Nunes, Ingrid Oliveira de; Maia, Marcelo de Almeida; Coelho, Roberta de Souza; Campos Neto, Edmilson Barbalho; 2018-11-27T20:42:18Z; 2018-11-27T20:42:18Z; 2018-07-20; info:eu-repo/semantics/publishedVersion; info:eu-repo/semantics/doctoralThesis; application/pdf; CAMPOS NETO, Edmilson Barbalho. Enhancing the SZZ Algorithm to Deal with Refactoring Changes. 2018. 133f. Tese (Doutorado em Ciência da Computação) - Centro de Ciências Exatas e da Terra, Universidade Federal do Rio Grande do Norte, Natal, 2018.; https://repositorio.ufrn.br/jspui/handle/123456789/26204; por; info:eu-repo/semantics/openAccess; reponame:Repositório Institucional da UFRN; instname:Universidade Federal do Rio Grande do Norte (UFRN); instacron:UFRN; 2019-01-29T19:54:39Z; oai:repositorio.ufrn.br:123456789/26204; Repositório Institucional; PUB; http://repositorio.ufrn.br/oai/; repositorio@bczm.ufrn.br; opendoar:2019-01-29T19:54:39; Repositório Institucional da UFRN - Universidade Federal do Rio Grande do Norte (UFRN); false |
| dc.title.none.fl_str_mv |
Enhancing the SZZ Algorithm to Deal with Refactoring Changes |
| title |
Enhancing the SZZ Algorithm to Deal with Refactoring Changes |
| spellingShingle |
Enhancing the SZZ Algorithm to Deal with Refactoring Changes; Campos Neto, Edmilson Barbalho; SZZ algorithm; Fix-inducing changes; Bug-fix changes; Refactoring changes; CNPQ::CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO::SISTEMAS DE COMPUTACAO |
| title_short |
Enhancing the SZZ Algorithm to Deal with Refactoring Changes |
| title_full |
Enhancing the SZZ Algorithm to Deal with Refactoring Changes |
| title_fullStr |
Enhancing the SZZ Algorithm to Deal with Refactoring Changes |
| title_full_unstemmed |
Enhancing the SZZ Algorithm to Deal with Refactoring Changes |
| title_sort |
Enhancing the SZZ Algorithm to Deal with Refactoring Changes |
| author |
Campos Neto, Edmilson Barbalho |
| author_facet |
Campos Neto, Edmilson Barbalho |
| author_role |
author |
| dc.contributor.none.fl_str_mv |
Kulesza, Uira; Costa, Daniel Alencar da; Aranha, Eduardo Henrique da Silva; Nunes, Ingrid Oliveira de; Maia, Marcelo de Almeida; Coelho, Roberta de Souza |
| dc.contributor.author.fl_str_mv |
Campos Neto, Edmilson Barbalho |
| dc.subject.por.fl_str_mv |
SZZ algorithm; Fix-inducing changes; Bug-fix changes; Refactoring changes; CNPQ::CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO::SISTEMAS DE COMPUTACAO |
| topic |
SZZ algorithm; Fix-inducing changes; Bug-fix changes; Refactoring changes; CNPQ::CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO::SISTEMAS DE COMPUTACAO |
| publishDate |
2018 |
| dc.date.none.fl_str_mv |
2018-11-27T20:42:18Z 2018-11-27T20:42:18Z 2018-07-20 |
| dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
| dc.type.driver.fl_str_mv |
info:eu-repo/semantics/doctoralThesis |
| format |
doctoralThesis |
| status_str |
publishedVersion |
| dc.identifier.uri.fl_str_mv |
CAMPOS NETO, Edmilson Barbalho. Enhancing the SZZ Algorithm to Deal with Refactoring Changes. 2018. 133f. Tese (Doutorado em Ciência da Computação) - Centro de Ciências Exatas e da Terra, Universidade Federal do Rio Grande do Norte, Natal, 2018. https://repositorio.ufrn.br/jspui/handle/123456789/26204 |
| identifier_str_mv |
CAMPOS NETO, Edmilson Barbalho. Enhancing the SZZ Algorithm to Deal with Refactoring Changes. 2018. 133f. Tese (Doutorado em Ciência da Computação) - Centro de Ciências Exatas e da Terra, Universidade Federal do Rio Grande do Norte, Natal, 2018. |
| url |
https://repositorio.ufrn.br/jspui/handle/123456789/26204 |
| dc.language.iso.fl_str_mv |
por |
| language |
por |
| dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
| eu_rights_str_mv |
openAccess |
| dc.format.none.fl_str_mv |
application/pdf |
| dc.publisher.none.fl_str_mv |
Brasil UFRN PROGRAMA DE PÓS-GRADUAÇÃO EM SISTEMAS E COMPUTAÇÃO |
| publisher.none.fl_str_mv |
Brasil UFRN PROGRAMA DE PÓS-GRADUAÇÃO EM SISTEMAS E COMPUTAÇÃO |
| dc.source.none.fl_str_mv |
reponame:Repositório Institucional da UFRN instname:Universidade Federal do Rio Grande do Norte (UFRN) instacron:UFRN |
| instname_str |
Universidade Federal do Rio Grande do Norte (UFRN) |
| instacron_str |
UFRN |
| institution |
UFRN |
| reponame_str |
Repositório Institucional da UFRN |
| collection |
Repositório Institucional da UFRN |
| repository.name.fl_str_mv |
Repositório Institucional da UFRN - Universidade Federal do Rio Grande do Norte (UFRN) |
| repository.mail.fl_str_mv |
repositorio@bczm.ufrn.br |
| _version_ |
1855758908115648512 |