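Uso de aprendizado de máquina em análise preditiva na interrupção do tratamento da tuberculose em pessoas que vivem com HIV (Use of machine learning in predictive analysis of tuberculosis treatment interruption in people living with HIV)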
| Year of defense: | 2025 |
|---|---|
| Main author: | Soares, Karllian Kerlen Simonelli |
| Advisor: | |
| Defense committee: | |
| Document type: | Thesis |
| Access type: | Embargoed access |
| Language: | por |
| Defending institution: | Universidade Federal do Espírito Santo; BR; Doutorado em Saúde Coletiva; Centro de Ciências da Saúde; UFES; Programa de Pós-Graduação em Saúde Coletiva |
| Graduate program: | Not provided by the institution |
| Department: | Not provided by the institution |
| Country: | Not provided by the institution |
| Keywords in Portuguese: | Epidemiologia; Tuberculose; Coinfecção; Saúde Coletiva |
| Access link: | http://repositorio.ufes.br/handle/10/19589 |
| Abstract: | Objective: To build a prediction model for interruption of tuberculosis (TB) treatment in people living with HIV. Methods: This cross-sectional study was carried out in three stages. The first stage assessed the quality of SINAN data from 2016 to 2018 using the Centers for Disease Control and Prevention (CDC) guide, in five methodological steps: quality analysis, standardization of records, duplicate analysis, assessment of data completeness through linkage with the SINAN-HIV database, and data anonymization. The second stage comprised database preparation and descriptive analysis in STATA, version 16 (StataCorp LP, College Station, TX, USA), reporting absolute and relative frequencies in tables. The third stage built the predictive model for TB-HIV coinfection cases in Brazil from 2016 to 2021 using machine learning: two artificial neural network algorithms, Multilayer Perceptron (MLP) and Restricted Boltzmann Machines (RBM), and two decision-tree-based algorithms, Random Forest and CatBoost, implemented in Python 3.10.3 and validated by accuracy, sensitivity, specificity, true positive values, and true negative values. The study received ethical approval under opinion no. 4022892 on 12 May 2020. Results: In the first stage, 89% of the mandatory variables and 91% of the essential variables showed satisfactory completeness. For TB-HIV coinfection, 73% of the variables were filled in, but essential variables related to TB treatment follow-up showed unsatisfactory completeness. In the second stage, 325 of 4,428 cases were TB-HIV coinfection: 322 were identified in the SINAN-TB database, and three, recorded in SINAN-TB with a negative HIV test result, were identified only after linkage with the SINAN-HIV database. The vulnerability profile of coinfection was characterized by men (71%), young adults aged 20 to 39 years (52%), people of mixed race (pardo, 59%), up to 8 years of schooling (25%), alcohol use (29%), smoking (37%), and drug use (26%). Adherence to antiretroviral therapy was 65%; only 44% of cases ended in cure and 20% interrupted treatment. About 61% did not receive directly observed treatment, and only 6.9% of cases reported receiving support from the government's income transfer program. In the third stage, 12,556 cases of TB-HIV coinfection in Brazil were analyzed. The Multilayer Perceptron neural network was sensitive in identifying potential cases of treatment interruption, with accuracy of 0.73, sensitivity of 0.75, specificity of 0.62, positive predictive value (PPV) of 0.91, and negative predictive value (NPV) of 0.31. Conclusion: Training and capacity building to improve data collection, integration, and analysis are essential to promote data quality. So is social support that enables access to health services and timely treatment for the most vulnerable. Finally, implementing new technologies that help break the chain of TB transmission among people living with HIV supports actions aimed at screening, treating, and monitoring cases, strengthening care networks and promoting equity in access to health services. |
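
The validation figures in the abstract are all derived from a confusion matrix. The sketch below shows the standard formulas and then checks whether the reported, rounded values are mutually consistent. The class `ConfusionMatrix`, the helper functions, and the example counts are hypothetical and introduced only for illustration; the record does not give the thesis's actual counts, its code, or which outcome was treated as the positive class.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    """Counts from comparing predicted labels against observed outcomes."""
    tp: int  # positive cases predicted positive
    fp: int  # negative cases predicted positive
    tn: int  # negative cases predicted negative
    fn: int  # positive cases predicted negative

    def accuracy(self) -> float:
        return (self.tp + self.tn) / (self.tp + self.fp + self.tn + self.fn)

    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    def ppv(self) -> float:
        return self.tp / (self.tp + self.fp)

    def npv(self) -> float:
        return self.tn / (self.tn + self.fn)


def implied_positive_share(sensitivity: float, specificity: float, ppv: float) -> float:
    """Solve PPV = se*p / (se*p + (1 - sp)*(1 - p)) for the positive share p."""
    a = ppv * (1.0 - specificity)
    b = sensitivity * (1.0 - ppv)
    return a / (a + b)


def accuracy_from_rates(sensitivity: float, specificity: float, share: float) -> float:
    """Accuracy as the class-weighted mean of sensitivity and specificity."""
    return sensitivity * share + specificity * (1.0 - share)


def npv_from_rates(sensitivity: float, specificity: float, share: float) -> float:
    """NPV rebuilt from the two rates and the positive-class share."""
    true_neg = specificity * (1.0 - share)
    false_neg = (1.0 - sensitivity) * share
    return true_neg / (true_neg + false_neg)


# Hypothetical counts, chosen only to exercise the formulas.
example = ConfusionMatrix(tp=75, fp=38, tn=62, fn=25)

# Rounded figures reported in the abstract for the MLP model.
share = implied_positive_share(sensitivity=0.75, specificity=0.62, ppv=0.91)
implied_accuracy = accuracy_from_rates(0.75, 0.62, share)
implied_npv = npv_from_rates(0.75, 0.62, share)
```

Under these assumptions, the reported sensitivity, specificity, and PPV imply a positive-class share of roughly 0.84 and an accuracy that rounds to the reported 0.73. The implied NPV comes out close to, though not exactly at, the reported 0.31, which is plausible given that every input is rounded to two decimals.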
| id |
UFES_1c28615cb110f5779645f8f95db208cb |
|---|---|
| oai_identifier_str |
oai:repositorio.ufes.br:10/19589 |
| network_acronym_str |
UFES |
| network_name_str |
Repositório Institucional da Universidade Federal do Espírito Santo (riUfes) |
| repository_id_str |
|
| spelling |
Fundação de Amparo à Pesquisa e Inovação do Espírito Santo (FAPES) |
| dc.title.none.fl_str_mv |
Uso de aprendizado de máquina em análise preditiva na interrupção do tratamento da tuberculose em pessoas que vivem com HIV |
| author |
Soares, Karllian Kerlen Simonelli |
| author_facet |
Soares, Karllian Kerlen Simonelli |
| author_role |
author |
| dc.contributor.none.fl_str_mv |
Hisatugu, Wilian Hiroshi; https://orcid.org/0000-0001-8333-0539; http://lattes.cnpq.br/6597878238749014; Prado, Thiago Nascimento do; https://orcid.org/0000-0001-8132-6288; http://lattes.cnpq.br/6388559394015871; https://orcid.org/0000-0002-2296-1190; Sanabria, Gladys Mercedes Estigarribia; Negri, Leticya dos Santos Almeida; Rissino, Silvia das Dores; Possuelo, Lia Gonçalves |
| dc.contributor.author.fl_str_mv |
Soares, Karllian Kerlen Simonelli |
| dc.subject.por.fl_str_mv |
Epidemiologia; Tuberculose; Coinfecção; Saúde Coletiva |
| publishDate |
2025 |
| dc.date.none.fl_str_mv |
2025-05-27T20:00:55Z 2025-05-27T20:00:55Z 2025-01-30 |
| dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
| dc.type.driver.fl_str_mv |
info:eu-repo/semantics/doctoralThesis |
| format |
doctoralThesis |
| status_str |
publishedVersion |
| dc.identifier.uri.fl_str_mv |
http://repositorio.ufes.br/handle/10/19589 |
| url |
http://repositorio.ufes.br/handle/10/19589 |
| dc.language.iso.fl_str_mv |
por |
| language |
por |
| dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/embargoedAccess |
| eu_rights_str_mv |
embargoedAccess |
| dc.format.none.fl_str_mv |
Text application/pdf |
| dc.publisher.none.fl_str_mv |
Universidade Federal do Espírito Santo; BR; Doutorado em Saúde Coletiva; Centro de Ciências da Saúde; UFES; Programa de Pós-Graduação em Saúde Coletiva |
| dc.source.none.fl_str_mv |
reponame:Repositório Institucional da Universidade Federal do Espírito Santo (riUfes) instname:Universidade Federal do Espírito Santo (UFES) instacron:UFES |
| instname_str |
Universidade Federal do Espírito Santo (UFES) |
| instacron_str |
UFES |
| institution |
UFES |
| reponame_str |
Repositório Institucional da Universidade Federal do Espírito Santo (riUfes) |
| collection |
Repositório Institucional da Universidade Federal do Espírito Santo (riUfes) |
| repository.name.fl_str_mv |
Repositório Institucional da Universidade Federal do Espírito Santo (riUfes) - Universidade Federal do Espírito Santo (UFES) |
| repository.mail.fl_str_mv |
riufes@ufes.br |
| _version_ |
1834479097121603584 |
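
The first stage described in the abstract reports the share of variables with satisfactory completeness. A minimal sketch of how such a figure can be computed is given below. The field names, the sample records, the set of values counted as missing, and the 0.7 cut-off are all hypothetical placeholders; the record does not state the criteria the thesis actually applied.

```python
from typing import Mapping, Optional, Sequence

# Values counted as "not filled in". This set is an assumption for illustration.
MISSING_VALUES = {None, ""}


def completeness(
    records: Sequence[Mapping[str, Optional[str]]],
    fields: Sequence[str],
) -> dict[str, float]:
    """Return, for each field, the fraction of records with a filled-in value."""
    total = len(records)
    if total == 0:
        raise ValueError("at least one record is required")
    return {
        field: sum(1 for rec in records if rec.get(field) not in MISSING_VALUES) / total
        for field in fields
    }


def share_satisfactory(scores: Mapping[str, float], threshold: float) -> float:
    """Fraction of fields whose completeness reaches the given threshold."""
    if not scores:
        raise ValueError("at least one field is required")
    return sum(1 for value in scores.values() if value >= threshold) / len(scores)


# Hypothetical notification records with three illustrative fields.
records = [
    {"sex": "M", "hiv_test": "positive", "dot": ""},
    {"sex": "F", "hiv_test": "positive", "dot": "yes"},
    {"sex": "M", "hiv_test": None, "dot": ""},
    {"sex": "M", "hiv_test": "positive", "dot": None},
]

scores = completeness(records, fields=["sex", "hiv_test", "dot"])
# The 0.7 threshold is a placeholder, not the criterion used in the thesis.
satisfactory = share_satisfactory(scores, threshold=0.7)
```

Keeping the missing-value set and the threshold as explicit parameters makes it straightforward to swap in whatever definition of "satisfactory" a given surveillance guide prescribes.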