Imbalanced Regression Pipeline Recommendation
| Year of defense: | 2024 |
|---|---|
| Main author: | |
| Advisor: | |
| Examination committee: | |
| Document type: | Thesis |
| Access type: | Embargoed access |
| Language: | eng |
| Defending institution: |
Universidade Federal de Pernambuco
UFPE Brasil Programa de Pos Graduacao em Ciencia da Computacao |
| Graduate program: |
Not provided by the institution
|
| Department: |
Not provided by the institution
|
| Country: |
Not provided by the institution
|
| Keywords in Portuguese: | |
| Access link: | https://repositorio.ufpe.br/handle/123456789/58487 |
Abstract: | Imbalanced problems are common in various real-world scenarios and present significant challenges, especially for regression tasks, due to the rarity of certain continuous target values. While these issues have been extensively explored in classification tasks, they also affect regression, complicating model performance. This work presents an extensive experimental study involving various balancing strategies and learning models, introduces a taxonomy for imbalanced regression approaches based on regression models, learning process modification, and evaluation metrics, and highlights new insights into the advantages of different strategies. From this study, it became evident that the choice of resampling method depends on the problem, learning models, and metrics, making it difficult to select an appropriate resampling strategy and learning model. As a result, it is necessary to test most of the existing combinations. Based on these findings, this work proposes the Meta-learning for Imbalanced Regression (Meta-IR) framework to address these challenges. Meta-IR recommends optimal pipelines consisting of resampling strategies and learning models for imbalanced regression tasks. Two formulations are proposed: Independent, which separately recommends learning algorithms and resampling strategies, and Chained, which models their interdependencies sequentially. The Chained approach demonstrated superior performance, suggesting a significant relationship between learning algorithms and resampling strategies. Compared with AutoML models and baseline configurations, Meta-IR outperformed all of them, offering a more effective solution for imbalanced regression and indicating directions for future research. |
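As an illustration only, the difference between the Independent and Chained formulations named in the abstract can be sketched in a few lines of Python. This record does not describe the thesis's actual meta-learner, meta-features, or chain order, so everything below is an assumption: a toy 1-nearest-neighbour meta-learner, made-up meta-features, placeholder labels such as `model_a` and `resample_x`, and a chain that recommends the learning model first and the resampling strategy second.

```python
# Hypothetical sketch: NOT the Meta-IR implementation from the thesis.
# Each row of META_X holds made-up meta-features of one past dataset;
# BEST_MODEL / BEST_RESAMPLER hold the (placeholder) labels that worked best there.

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_label(train_x, train_y, query):
    """Toy 1-nearest-neighbour meta-learner: label of the closest training row."""
    best = min(range(len(train_x)), key=lambda i: sq_dist(train_x[i], query))
    return train_y[best]


def recommend_independent(meta_x, best_model, best_resampler, query):
    """Independent: two separate recommendations that ignore each other."""
    model = nearest_label(meta_x, best_model, query)
    resampler = nearest_label(meta_x, best_resampler, query)
    return model, resampler


def recommend_chained(meta_x, best_model, best_resampler, query, model_ids):
    """Chained (assumed order): recommend the model, then condition the
    resampler recommendation on that model via extra one-hot features."""
    model = nearest_label(meta_x, best_model, query)

    def augment(row, label):
        # Append a one-hot encoding of the model label to the meta-features.
        return list(row) + [1.0 if label == m else 0.0 for m in model_ids]

    augmented_x = [augment(row, m) for row, m in zip(meta_x, best_model)]
    resampler = nearest_label(augmented_x, best_resampler, augment(query, model))
    return model, resampler


# Toy meta-dataset (entirely invented for the example).
META_X = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
BEST_MODEL = ["model_a", "model_a", "model_b", "model_b"]
BEST_RESAMPLER = ["resample_x", "no_resampling", "no_resampling", "resample_x"]
MODEL_IDS = ["model_a", "model_b"]

if __name__ == "__main__":
    new_task = [0.1, 0.1]  # meta-features of a new, unseen dataset
    print(recommend_independent(META_X, BEST_MODEL, BEST_RESAMPLER, new_task))
    print(recommend_chained(META_X, BEST_MODEL, BEST_RESAMPLER, new_task, MODEL_IDS))
```

In the chained variant the second recommendation sees the first one as an input, which is one simple way to let a recommender exploit a dependency between learning algorithm and resampling strategy; the independent variant cannot use that information.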
| id |
UFPE_18ab553dd5c16850bec68607a362d724 |
|---|---|
| oai_identifier_str |
oai:repositorio.ufpe.br:123456789/58487 |
| network_acronym_str |
UFPE |
| network_name_str |
Repositório Institucional da UFPE |
| repository_id_str |
|
| dc.title.none.fl_str_mv |
Imbalanced Regression Pipeline Recommendation |
| author |
AVELINO, Juscimara Gomes |
| author_facet |
AVELINO, Juscimara Gomes |
| author_role |
author |
| dc.contributor.none.fl_str_mv |
CAVALCANTI, George Darmiton da Cunha CRUZ, Rafael Menelau Oliveira e http://lattes.cnpq.br/5854014635627691 http://lattes.cnpq.br/8577312109146354 |
| dc.contributor.author.fl_str_mv |
AVELINO, Juscimara Gomes |
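| dc.subject.por.fl_str_mv |
Regressão desbalanceada (imbalanced regression); Estratégias de reamostragem (resampling strategies); Meta-aprendizado (meta-learning) |
| topic |
Regressão desbalanceada (imbalanced regression); Estratégias de reamostragem (resampling strategies); Meta-aprendizado (meta-learning) |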
| publishDate |
2024 |
| dc.date.none.fl_str_mv |
2024-11-05T15:27:15Z 2024-11-05T15:27:15Z 2024-08-20 |
| dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
| dc.type.driver.fl_str_mv |
info:eu-repo/semantics/doctoralThesis |
| format |
doctoralThesis |
| status_str |
publishedVersion |
| dc.identifier.uri.fl_str_mv |
AVELINO, Juscimara Gomes. Imbalanced Regression Pipeline Recommendation. 2024. Tese (Doutorado em Ciência da Computação) – Universidade Federal de Pernambuco, Recife, 2024. https://repositorio.ufpe.br/handle/123456789/58487 |
| identifier_str_mv |
AVELINO, Juscimara Gomes. Imbalanced Regression Pipeline Recommendation. 2024. Tese (Doutorado em Ciência da Computação) – Universidade Federal de Pernambuco, Recife, 2024. |
| url |
https://repositorio.ufpe.br/handle/123456789/58487 |
| dc.language.iso.fl_str_mv |
eng |
| language |
eng |
| dc.rights.driver.fl_str_mv |
Attribution-NonCommercial-NoDerivs 3.0 Brazil http://creativecommons.org/licenses/by-nc-nd/3.0/br/ info:eu-repo/semantics/embargoedAccess |
| rights_invalid_str_mv |
Attribution-NonCommercial-NoDerivs 3.0 Brazil http://creativecommons.org/licenses/by-nc-nd/3.0/br/ |
| eu_rights_str_mv |
embargoedAccess |
| dc.format.none.fl_str_mv |
application/pdf |
| dc.publisher.none.fl_str_mv |
Universidade Federal de Pernambuco UFPE Brasil Programa de Pos Graduacao em Ciencia da Computacao |
| publisher.none.fl_str_mv |
Universidade Federal de Pernambuco UFPE Brasil Programa de Pos Graduacao em Ciencia da Computacao |
| dc.source.none.fl_str_mv |
reponame:Repositório Institucional da UFPE instname:Universidade Federal de Pernambuco (UFPE) instacron:UFPE |
| instname_str |
Universidade Federal de Pernambuco (UFPE) |
| instacron_str |
UFPE |
| institution |
UFPE |
| reponame_str |
Repositório Institucional da UFPE |
| collection |
Repositório Institucional da UFPE |
| repository.name.fl_str_mv |
Repositório Institucional da UFPE - Universidade Federal de Pernambuco (UFPE) |
| repository.mail.fl_str_mv |
attena@ufpe.br |
| _version_ |
1856041941765980160 |