Intelligent classification of SPI practices and evidences based on NLP and semantic similarity

Bibliographic details
Year of defense: 2021
Main author: Ecar, Miguel da Silva
Advisor: Silva, João Pablo Silva da
Examination committee: Not provided by the institution
Document type: Master's thesis
Access type: Open access
Language: eng
Defending institution: Universidade Federal do Pampa
Graduate program: Mestrado Profissional em Engenharia de Software
Department: Campus Alegrete
Country: Brazil
Keywords in Portuguese:
CNPq knowledge area:
Access link: https://repositorio.unipampa.edu.br/jspui/handle/riu/6753
Abstract: Software Process Improvement (SPI) consists of a set of changes in software development companies that introduce new and improved methods, techniques, and tools. SPI initiatives are generally carried out based on a reference model, such as CMMI, ISO 9001, or ISO 15504, among others. One of the first steps when investing in an SPI initiative is the SPI Diagnostic. The SPI Diagnostic is generally performed manually, which demands considerable effort from consultants. Moreover, it generates a large volume of data that must be analyzed, which makes the analysis prone to subjectivity. Because automation tools to support this process are lacking, the SPI Diagnostic is a challenging process. This work proposes an intelligent tool called Coptic for Practice-Evidence classification, using Natural Language Processing and Semantic Similarity. We also propose the Base of Knowledge about Software Engineering Practices (Badge), a domain ontology that generalizes SPI resources that state “what should be done”, and PStory, a template for writing pieces of evidence. We evaluated Badge through a focus group session. We evaluated PStory through an exercise and a questionnaire with industry professionals. We evaluated Coptic through a quasi-experiment with PStories assessed by industry professionals. As an outcome, Coptic presented satisfactory results using the initial corpus. We conclude that Coptic provides valuable support to professionals performing an SPI Diagnostic. Badge introduces a domain ontology that differs from the related proposals in the literature and has value for SPI initiatives. We also conclude that PStory introduces a simple way to write pieces of evidence, and that Coptic supports the SPI Practice-Evidence matching process. Keywords: Coptic, Badge, PStory, Software Process Improvement, SPI.
id UNIP_c9d9ec237d249d74566188d1434fc93f
oai_identifier_str oai:repositorio.unipampa.edu.br:riu/6753
network_acronym_str UNIP
network_name_str Repositório Institucional da UNIPAMPA
repository_id_str
dc.title.pt_BR.fl_str_mv Intelligent classification of SPI practices and evidences based on NLP and semantic similarity
title Intelligent classification of SPI practices and evidences based on NLP and semantic similarity
author Ecar, Miguel da Silva
author_facet Ecar, Miguel da Silva
author_role author
dc.contributor.advisor1.fl_str_mv Silva, João Pablo Silva da
dc.contributor.author.fl_str_mv Ecar, Miguel da Silva
contributor_str_mv Silva, João Pablo Silva da
dc.subject.cnpq.fl_str_mv CNPQ::CIENCIAS EXATAS E DA TERRA
topic CNPQ::CIENCIAS EXATAS E DA TERRA
Engenharia de software
Software - desempenho
Software Engineering
Software - performance
dc.subject.por.fl_str_mv Engenharia de software
Software - desempenho
Software Engineering
Software - performance
publishDate 2021
dc.date.issued.fl_str_mv 2021-09-30
dc.date.accessioned.fl_str_mv 2022-03-03T19:32:12Z
dc.date.available.fl_str_mv 2022-02-18
2022-03-03T19:32:12Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/masterThesis
format masterThesis
status_str publishedVersion
dc.identifier.citation.fl_str_mv ECAR, Miguel da Silva. Intelligent classification of SPI practices and evidences based on NLP and semantic similarity. Orientador: João Pablo Silva da Silva. 2021. 117p. Dissertação (Mestrado Profissional em Engenharia de Software) - Universidade Federal do Pampa, Campus Alegrete, Alegrete, 2021.
dc.identifier.uri.fl_str_mv https://repositorio.unipampa.edu.br/jspui/handle/riu/6753
identifier_str_mv ECAR, Miguel da Silva. Intelligent classification of SPI practices and evidences based on NLP and semantic similarity. Orientador: João Pablo Silva da Silva. 2021. 117p. Dissertação (Mestrado Profissional em Engenharia de Software) - Universidade Federal do Pampa, Campus Alegrete, Alegrete, 2021.
url https://repositorio.unipampa.edu.br/jspui/handle/riu/6753
dc.language.iso.fl_str_mv eng
language eng
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.publisher.none.fl_str_mv Universidade Federal do Pampa
dc.publisher.program.fl_str_mv Mestrado Profissional em Engenharia de Software
dc.publisher.initials.fl_str_mv UNIPAMPA
dc.publisher.country.fl_str_mv Brasil
dc.publisher.department.fl_str_mv Campus Alegrete
publisher.none.fl_str_mv Universidade Federal do Pampa
dc.source.none.fl_str_mv reponame:Repositório Institucional da UNIPAMPA
instname:Universidade Federal do Pampa (UNIPAMPA)
instacron:UNIPAMPA
instname_str Universidade Federal do Pampa (UNIPAMPA)
instacron_str UNIPAMPA
institution UNIPAMPA
reponame_str Repositório Institucional da UNIPAMPA
collection Repositório Institucional da UNIPAMPA
bitstream.url.fl_str_mv https://repositorio.unipampa.edu.br/bitstreams/be359277-64cd-4803-8652-ede9c484d9fe/download
https://repositorio.unipampa.edu.br/bitstreams/9c24ab68-1399-4e50-b1cf-9ffeccaeae07/download
bitstream.checksum.fl_str_mv 128d353d93ff20b5b1bf0a7052896b8b
c9ad5aff503ef7873c4004c5b07c0b27
bitstream.checksumAlgorithm.fl_str_mv MD5
MD5
repository.name.fl_str_mv Repositório Institucional da UNIPAMPA - Universidade Federal do Pampa (UNIPAMPA)
repository.mail.fl_str_mv sisbi@unipampa.edu.br
_version_ 1854750408796274688