Métodos de Krylov aplicados na análise de regressão linear de Big Data

Bibliographic details
Year of defense: 2025
Main author: Arthur Mota Silva Dantés Macedo
Advisor: Not provided by the institution
Examination committee: Not provided by the institution
Document type: Master's thesis
Access type: Open access
Language: por
Defense institution: Universidade Federal de Minas Gerais
Graduate program: Not provided by the institution
Department: Not provided by the institution
Country: Not provided by the institution
Keywords in Portuguese:
Access link: https://hdl.handle.net/1843/81230
Abstract: The increasing use of massive and complex databases, known as Big Data, makes it necessary to optimize traditional data analysis methods so that they can be applied to this type of information. Even statistical methods considered simple, such as linear regression, are inefficient when applied in their traditional form to Big Data because of their high computational cost, and therefore require adaptation. This dissertation addresses the application of Krylov methods, a class of iterative algorithms, to Big Data in order to estimate linear regression parameters efficiently. These methods return an approximation of the least squares solution at each iteration, and they reach a satisfactory estimate at lower computational cost than traditional direct methods such as QR decomposition applied to the least squares problem. Two Krylov methods are presented and studied: the Generalized Minimal Residual method (GMRES) and LSMR, with a strong focus on the latter. To evaluate the performance of LSMR, several simulation studies are presented on datasets of different dimensions, along with applications to real datasets, all considered Big Data. Performance was measured by comparing metrics from the LSMR runs, such as algorithm execution time, with those from the direct QR decomposition method applied to the least squares problem. The same approach was used to compare LSMR with two other Krylov methods, LSQR and the Conjugate Gradient method. Overall, LSMR had better execution times while providing solution estimates similar to or even better than those of the other methods evaluated.
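The iterative idea described in the abstract can be illustrated with a minimal sketch. This is not the dissertation's implementation and it is not LSMR: it uses CGLS (the conjugate gradient method applied to the normal equations), a simpler Krylov method for least squares, written in plain Python. The function names, the synthetic data, and the tolerance are all illustrative assumptions.

```python
import random

def matvec(A, x):
    """Compute A @ x for a dense matrix stored as a list of rows."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def rmatvec(A, y):
    """Compute A.T @ y without forming the transpose."""
    out = [0.0] * len(A[0])
    for row, yi in zip(A, y):
        for j, a in enumerate(row):
            out[j] += a * yi
    return out

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cgls(A, b, tol=1e-10, max_iter=100):
    """Approximate argmin_x ||A x - b||_2 with CGLS.

    Only products with A and A.T are needed, so A.T @ A is never formed.
    Returns the estimate and the history of ||A.T (b - A x)||.
    """
    x = [0.0] * len(A[0])
    r = list(b)              # residual b - A x, with x = 0 initially
    s = rmatvec(A, r)        # normal-equations residual A.T r
    p = list(s)              # first search direction
    gamma = dot(s, s)
    gamma0 = gamma
    history = [gamma ** 0.5]
    for _ in range(max_iter):
        if gamma <= (tol ** 2) * gamma0:
            break            # relative stopping test on ||A.T r||
        q = matvec(A, p)
        alpha = gamma / dot(q, q)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * qi for ri, qi in zip(r, q)]
        s = rmatvec(A, r)
        gamma_new = dot(s, s)
        p = [si + (gamma_new / gamma) * pi for si, pi in zip(s, p)]
        gamma = gamma_new
        history.append(gamma ** 0.5)
    return x, history

# Synthetic regression problem (assumed data, for illustration only).
rng = random.Random(0)
beta_true = [2.0, -1.0, 0.5]
A = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(200)]
b = [dot(row, beta_true) + rng.gauss(0.0, 0.01) for row in A]

beta_hat, history = cgls(A, b)
print("estimated coefficients:", beta_hat)
print("iterations used:", len(history) - 1)
```

Each pass through the loop yields a usable estimate of the coefficients, which is the property the abstract attributes to Krylov methods; a direct QR factorization, by contrast, produces its answer only once the full factorization is complete. For real workloads, a tested library implementation of LSMR or LSQR would be the more appropriate choice than this sketch.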
id UFMG_192490b8c2dad1ca2b60a6af4c3032e8
oai_identifier_str oai:repositorio.ufmg.br:1843/81230
network_acronym_str UFMG
network_name_str Repositório Institucional da UFMG
repository_id_str
spelling Métodos de Krylov aplicados na análise de regressão linear de Big Data
Krylov Methods applied to linear regression analysis of Big Data
FAPEMIG - Fundação de Amparo à Pesquisa do Estado de Minas Gerais
Universidade Federal de Minas Gerais
https://repositorio.ufmg.br/oai
repositorio@ufmg.br
dc.title.none.fl_str_mv Métodos de Krylov aplicados na análise de regressão linear de Big Data
Krylov Methods applied to linear regression analysis of Big Data
dc.contributor.author.fl_str_mv Arthur Mota Silva Dantés Macedo
dc.subject.por.fl_str_mv Statistics – Theses
Regression analysis – Theses
Linear algebra - Theses
Big data – Theses
Least squares – Data processing – Theses
Linear regression
Krylov methods
Big data
Numerical linear algebra.
publishDate 2025
dc.date.none.fl_str_mv 2025-04-02T15:48:40Z
2025-09-09T00:12:03Z
2025-04-02T15:48:40Z
2025-03-07
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/masterThesis
format masterThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv https://hdl.handle.net/1843/81230
url https://hdl.handle.net/1843/81230
dc.language.iso.fl_str_mv por
language por
dc.rights.driver.fl_str_mv http://creativecommons.org/licenses/by-nc/3.0/pt/
info:eu-repo/semantics/openAccess
rights_invalid_str_mv http://creativecommons.org/licenses/by-nc/3.0/pt/
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv Universidade Federal de Minas Gerais
publisher.none.fl_str_mv Universidade Federal de Minas Gerais
dc.source.none.fl_str_mv reponame:Repositório Institucional da UFMG
instname:Universidade Federal de Minas Gerais (UFMG)
instacron:UFMG
instname_str Universidade Federal de Minas Gerais (UFMG)
instacron_str UFMG
institution UFMG
reponame_str Repositório Institucional da UFMG
collection Repositório Institucional da UFMG
repository.name.fl_str_mv Repositório Institucional da UFMG - Universidade Federal de Minas Gerais (UFMG)
repository.mail.fl_str_mv repositorio@ufmg.br
_version_ 1856414029645348864