Métodos de Krylov aplicados na análise de regressão linear de Big Data
| Year of defense: | 2025 |
|---|---|
| Main author: | Arthur Mota Silva Dantés Macedo |
| Advisor: | |
| Defense committee: | |
| Document type: | Master's thesis |
| Access type: | Open access |
| Language: | Portuguese (por) |
| Defending institution: | Universidade Federal de Minas Gerais |
| Graduate program: | Not provided by the institution |
| Department: | Not provided by the institution |
| Country: | Not provided by the institution |
| Keywords in Portuguese: | |
| Access link: | https://hdl.handle.net/1843/81230 |
Abstract: | The growing use of massive and complex databases, known as Big Data, makes it necessary to optimize traditional data analysis methods so that they can be applied to this type of information. Even statistical methods considered simple, such as linear regression, are inefficient when applied in their traditional form to Big Data because of their high computational cost, and therefore require adaptation. This thesis addresses the application of Krylov methods, a class of iterative algorithms, to Big Data in order to estimate linear regression parameters efficiently. These methods return an approximation of the least squares solution at each iteration, so a satisfactory estimate can be obtained at lower computational cost than with traditional direct methods such as QR decomposition applied to the least squares problem. Two Krylov methods are presented and studied in the text: the Generalized Minimal Residual method (GMRES) and LSMR, with a strong focus on the latter. To evaluate the performance of LSMR, several simulation studies on datasets of different dimensions are presented, along with applications to real datasets, all considered Big Data. Performance was measured by comparing metrics from the LSMR studies, such as the execution time of the algorithm, with those from the direct QR decomposition method applied to the least squares problem. The same approach was used to compare LSMR with two other Krylov methods, LSQR and the Conjugate Gradient method. Overall, LSMR performed better in execution time while providing solution estimates similar to or even better than those of the other methods evaluated. |
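To make concrete the idea that a Krylov method yields an improved estimate of the regression coefficients at every iteration, here is a minimal sketch in plain Python. It is not the thesis's implementation and it does not implement LSMR: it uses CGLS, that is, the conjugate gradient method applied to the normal equations, on the assumption that this is close to the Conjugate Gradient variant the abstract mentions as a comparison method. The simulated data, the names `cgls`, `beta_true`, `beta_hat` and `history`, and the tolerance are all illustrative assumptions.

```python
import random


def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def matvec(A, x):
    """Compute A x for a matrix stored as a list of rows."""
    return [dot(row, x) for row in A]


def rmatvec(A, y):
    """Compute A^T y without forming the transpose."""
    out = [0.0] * len(A[0])
    for row, yi in zip(A, y):
        for j, a in enumerate(row):
            out[j] += a * yi
    return out


def cgls(A, b, tol=1e-10, max_iter=50):
    """Conjugate gradient on the normal equations A^T A x = A^T b.

    Only products with A and A^T are used; A^T A is never formed.
    Returns the estimate and the norm of A^T (b - A x) after each iteration.
    """
    x = [0.0] * len(A[0])
    r = list(b)                # residual b - A x for the start x = 0
    s = rmatvec(A, r)          # gradient of the least squares objective
    p = list(s)                # first search direction
    gamma = dot(s, s)
    stop = (tol ** 2) * gamma  # relative stopping threshold (assumed)
    history = []
    for _ in range(max_iter):
        if gamma <= stop:
            break
        q = matvec(A, p)
        alpha = gamma / dot(q, q)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * qi for ri, qi in zip(r, q)]
        s = rmatvec(A, r)
        gamma_new = dot(s, s)
        p = [si + (gamma_new / gamma) * pi for si, pi in zip(s, p)]
        gamma = gamma_new
        history.append(gamma ** 0.5)
    return x, history


# Simulated regression data (hypothetical): intercept plus three predictors.
random.seed(0)
beta_true = [1.0, 2.0, -0.5, 0.3]
A = [[1.0] + [random.gauss(0.0, 1.0) for _ in range(3)] for _ in range(500)]
b = [dot(row, beta_true) + random.gauss(0.0, 0.1) for row in A]

beta_hat, history = cgls(A, b)
print("estimated coefficients:", [round(v, 3) for v in beta_hat])
print("iterations used:", len(history))
```

Because the loop only needs products with `A` and its transpose, the same structure carries over to large sparse design matrices, where a direct QR factorization becomes expensive. LSQR is mathematically equivalent to this iteration in exact arithmetic but is numerically more reliable, and LSMR applies MINRES rather than conjugate gradient to the same normal equations.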
| id |
UFMG_192490b8c2dad1ca2b60a6af4c3032e8 |
|---|---|
| oai_identifier_str |
oai:repositorio.ufmg.br:1843/81230 |
| network_acronym_str |
UFMG |
| network_name_str |
Repositório Institucional da UFMG |
| repository_id_str |
|
| spelling |
Métodos de Krylov aplicados na análise de regressão linear de Big Data; Krylov Methods applied to linear regression analysis of Big Data; FAPEMIG - Fundação de Amparo à Pesquisa do Estado de Minas Gerais; Universidade Federal de Minas Gerais; Arthur Mota Silva Dantés Macedo |
| dc.title.none.fl_str_mv |
Métodos de Krylov aplicados na análise de regressão linear de Big Data; Krylov Methods applied to linear regression analysis of Big Data |
| author |
Arthur Mota Silva Dantés Macedo |
| dc.subject.por.fl_str_mv |
Estatística – Teses; Análise de regressão – Teses; Álgebra linear – Teses; Big data – Teses; Mínimos quadrados – Processamento de dados – Teses; Regressão linear; Métodos de Krylov; Big data; Álgebra linear numérica |
| publishDate |
2025 |
| dc.date.none.fl_str_mv |
2025-04-02T15:48:40Z 2025-09-09T00:12:03Z 2025-04-02T15:48:40Z 2025-03-07 |
| dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
| dc.type.driver.fl_str_mv |
info:eu-repo/semantics/masterThesis |
| dc.identifier.uri.fl_str_mv |
https://hdl.handle.net/1843/81230 |
| dc.language.iso.fl_str_mv |
por |
| dc.rights.driver.fl_str_mv |
http://creativecommons.org/licenses/by-nc/3.0/pt/ info:eu-repo/semantics/openAccess |
| dc.format.none.fl_str_mv |
application/pdf |
| dc.publisher.none.fl_str_mv |
Universidade Federal de Minas Gerais |
| dc.source.none.fl_str_mv |
reponame:Repositório Institucional da UFMG instname:Universidade Federal de Minas Gerais (UFMG) instacron:UFMG |
| repository.name.fl_str_mv |
Repositório Institucional da UFMG - Universidade Federal de Minas Gerais (UFMG) |
| repository.mail.fl_str_mv |
repositorio@ufmg.br |