An analysis of git’s private life and its merge conflicts

Bibliographic details
Year of defense: 2023
Main author: CUNHA, Marcela Bandeira
Advisor: Not provided by the institution
Examination committee: Not provided by the institution
Document type: Master's thesis
Access type: Open access
Language: eng
Defending institution: Universidade Federal de Pernambuco
UFPE
Brasil
Programa de Pos Graduacao em Ciencia da Computacao
Graduate program: Not provided by the institution
Department: Not provided by the institution
Country: Not provided by the institution
Keywords in Portuguese:
Access link: https://repositorio.ufpe.br/handle/123456789/54306
Abstract: Collaborative development is an essential practice for the success of most nontrivial software projects. However, merge conflicts might occur when a developer integrates, through a remote shared repository, their changes with the changes from other developers. Such conflicts may impair developers’ productivity and introduce unexpected defects. Previous empirical studies have analyzed such conflict characteristics and proposed different approaches to avoid or resolve them. However, these studies are limited to the analysis of code shared in public repositories. This way they ignore local (developer private) repository actions and, consequently, code integration scenarios that are often omitted from the history of remote shared repositories due to the use of commands such as git rebase, which rewrite Git commit history. These studies might then be examining only part of the actual code integration scenarios and conflicts. To assess that, we aim to shed light on this issue by bringing evidence from an empirical study that analyzes Git command history data extracted from the local repositories of a number of developers. This way we can access hidden integration scenarios that cannot be accessed by analyzing public repository data as in GitHub-based studies. After identifying the visible and hidden integration scenarios, we investigate the relationship between the frequency of developers integrating code and how frequently these scenarios result in conflicts. This way, we can understand if these data are correlated. We analyze 95 Git reflog files from 61 different developers. Our results indicate that hidden code integration scenarios are more frequent than visible ones. We also find higher conflict rates than in previous studies. Our evidence suggests that studies that consider only remote shared repositories might miss integration conflict data by not considering the developer’s local repository actions. Regarding the correlation study, our results indicate a statistically significant relationship between the frequency of developers’ code integration and the frequency of integration scenarios resulting in conflicts. This relationship is represented by a negative correlation (the higher values of one event are associated with the lower values of the other). From our study sample result, we suggest that if a developer integrates code often, the failed code integration frequency will tend to decrease.
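The abstract describes mining Git reflog files for integration commands that never reach the shared history. The sketch below is one possible way to count such commands in Python; it is not the thesis's actual tooling. The sample lines, the names `SAMPLE_REFLOG`, `classify`, and `count_integrations`, and the counting rules are all assumptions made for illustration. The line layout follows the plain-text logs Git keeps under `.git/logs/`, but reflog message wording varies across Git versions.

```python
import re
from collections import Counter

# Hypothetical sample in the layout of .git/logs/HEAD:
# <old-sha> <new-sha> <name> <email> <timestamp> <tz> TAB <message>
SAMPLE_REFLOG = "\n".join([
    "1111111 2222222 Dev <dev@example.com> 1690000000 -0300\tcommit: add parser",
    "2222222 3333333 Dev <dev@example.com> 1690000100 -0300\tpull origin main: Fast-forward",
    "3333333 4444444 Dev <dev@example.com> 1690000200 -0300\trebase (start): checkout origin/main",
    "4444444 5555555 Dev <dev@example.com> 1690000300 -0300\trebase (pick): add parser",
    "5555555 5555555 Dev <dev@example.com> 1690000400 -0300\trebase (finish): returning to refs/heads/feature",
    "5555555 6666666 Dev <dev@example.com> 1690000500 -0300\tmerge feature: Merge made by the 'ort' strategy.",
    "6666666 7777777 Dev <dev@example.com> 1690000600 -0300\tcommit (merge): Merge branch 'hotfix'",
    "7777777 8888888 Dev <dev@example.com> 1690000700 -0300\tcherry-pick: fix typo",
    "8888888 9999999 Dev <dev@example.com> 1690000800 -0300\tcheckout: moving from main to feature",
])

# Commands treated as code integration in this sketch (an assumption).
INTEGRATION_COMMANDS = ("merge", "pull", "rebase", "cherry-pick")

# Multi-step commands log one entry per step, e.g. "rebase (pick)".
STEP_MARKER = re.compile(r"\((\w+)\)$")


def reflog_messages(text):
    """Yield the message part (after the tab) of each reflog line."""
    for line in text.splitlines():
        _, sep, message = line.partition("\t")
        if sep:
            yield message


def classify(message):
    """Return the integration command an entry belongs to, or None."""
    action = message.split(":", 1)[0].strip()
    if action == "commit (merge)":
        # A merge concluded by a manual commit; this often, but not
        # always, follows a conflict resolution.
        return "merge"
    if action == "rebase finished":
        # Older Git wording for the last step of a rebase.
        return None
    step = STEP_MARKER.search(action)
    if step and step.group(1) != "start":
        # Count a multi-step command once, at its "(start)" entry.
        return None
    command = action.split()[0] if action else ""
    return command if command in INTEGRATION_COMMANDS else None


def count_integrations(text):
    """Count integration scenarios per command in a reflog dump."""
    kinds = (classify(m) for m in reflog_messages(text))
    return Counter(k for k in kinds if k is not None)


if __name__ == "__main__":
    for command, total in sorted(count_integrations(SAMPLE_REFLOG).items()):
        print(command, total)
```

Of the commands counted here, a rebase leaves no merge commit in the shared history, which is why the reflog is the only place such a scenario can be observed. Deciding which entries ended in a conflict would need further rules that this sketch does not attempt.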
id UFPE_d8ff6cabb05031add0f6d7a96feb8d70
oai_identifier_str oai:repositorio.ufpe.br:123456789/54306
network_acronym_str UFPE
network_name_str Repositório Institucional da UFPE
repository_id_str
dc.title.none.fl_str_mv An analysis of git’s private life and its merge conflicts
author CUNHA, Marcela Bandeira
author_facet CUNHA, Marcela Bandeira
author_role author
dc.contributor.none.fl_str_mv BORBA, Paulo Henrique Monteiro
ACCIOLY, Paola Rodrigues Godoy
http://lattes.cnpq.br/7823204701989431
http://lattes.cnpq.br/9395715443254344
http://lattes.cnpq.br/6629813636801870
dc.contributor.author.fl_str_mv CUNHA, Marcela Bandeira
dc.subject.por.fl_str_mv Collaborative software development
Merge conflicts
Empirical software engineering
Repository mining
publishDate 2023
dc.date.none.fl_str_mv 2023-12-21T17:16:12Z
2023-12-21T17:16:12Z
2023-08-31
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/masterThesis
format masterThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv CUNHA, Marcela Bandeira. An analysis of git’s private life and its merge conflicts. 2023. Dissertação (Mestrado em Ciência da Computação) – Universidade Federal de Pernambuco, Recife, 2023.
https://repositorio.ufpe.br/handle/123456789/54306
dc.language.iso.fl_str_mv eng
language eng
dc.rights.driver.fl_str_mv http://creativecommons.org/licenses/by-nc-nd/3.0/br/
info:eu-repo/semantics/openAccess
rights_invalid_str_mv http://creativecommons.org/licenses/by-nc-nd/3.0/br/
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv Universidade Federal de Pernambuco
UFPE
Brasil
Programa de Pos Graduacao em Ciencia da Computacao
dc.source.none.fl_str_mv reponame:Repositório Institucional da UFPE
instname:Universidade Federal de Pernambuco (UFPE)
instacron:UFPE
instname_str Universidade Federal de Pernambuco (UFPE)
instacron_str UFPE
institution UFPE
reponame_str Repositório Institucional da UFPE
collection Repositório Institucional da UFPE
repository.name.fl_str_mv Repositório Institucional da UFPE - Universidade Federal de Pernambuco (UFPE)
repository.mail.fl_str_mv attena@ufpe.br
_version_ 1856041889594081280