On model complexity reduction in instance-based learners

Bibliographic details
Year of defense: 2021
Main author: Oliveira, Saulo Anderson Freitas de
Advisor: Gomes, João Paulo Pordeus
Defense committee: Not provided by the institution
Document type: Thesis
Access type: Open access
Language: eng
Defending institution: Not provided by the institution
Graduate program: Not provided by the institution
Department: Not provided by the institution
Country: Not provided by the institution
Keywords in Portuguese:
Access link: http://www.repositorio.ufc.br/handle/riufc/61137
Abstract: Instance-based learners commonly adopt instance selection techniques to reduce complexity and avoid overfitting. The most recent and well-known formulations of such learners seek to impose some sparsity in their training and prediction structure, alongside regularization, to achieve this. Given the variety of instance-based learners, we focus on Least-Squares Support Vector Machines and Minimal Learning Machines because they embody additional information beyond the stored instances in order to perform predictions. In this thesis, we formulate variants that constrain candidate solutions to a specific functional space in which overfitting is avoided and model complexity is reduced. In the Least-Squares Support Vector Machines context, this thesis follows the pruning approach by adopting Class-Corner Instance Selection. This approach describes the class-corner relationship among the samples in the dataset in order to penalize those close to the corners. As for the Minimal Learning Machine model, this thesis introduces a new proposal called the Lightweight Minimal Learning Machine. It applies regularization in the complexity term to penalize the learning of each sample, resulting in a direct method. Usually, this penalization accompanies the error term. For regression tasks, this thesis describes strategies based on random and observed linearity conditions related to the data. For classification tasks, it employs the aforementioned class-corner idea for regularization, so that samples close to the corners are penalized more heavily. By adopting this methodology, we reduced the number of computations inherent in the original proposal’s multilateration process without requiring any instance selection criterion, yielding a faster model for out-of-sample prediction. Another notable feature is that it derives a unique solution, whereas other formulations rely on overdetermined systems.
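As a rough illustration of the distance-regression step that Minimal Learning Machines rely on, the following Python sketch fits a linear map between input-space and output-space distance matrices with a uniform ridge penalty. It is not the thesis's Lightweight Minimal Learning Machine: the uniform penalty `lam`, the use of every training point as a reference point, and all function names are assumptions made for this example, and the multilateration step that recovers an output from the predicted distances is omitted.

```python
import numpy as np


def pairwise_distances(a, b):
    """Euclidean distances between the rows of a (n, d) and the rows of b (m, d)."""
    diff = a[:, None, :] - b[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2))


def fit_distance_map(X, Y, lam=1e-2):
    """Closed-form solution of min_B ||Dx B - Dy||_F^2 + lam * ||B||_F^2.

    Dx and Dy hold the pairwise distances among training inputs and among
    training outputs. The uniform penalty lam is an assumption of this sketch.
    """
    Dx = pairwise_distances(X, X)
    Dy = pairwise_distances(Y, Y)
    # Regularized normal equations: (Dx^T Dx + lam I) B = Dx^T Dy.
    A = Dx.T @ Dx + lam * np.eye(Dx.shape[1])
    return np.linalg.solve(A, Dx.T @ Dy)


def predict_output_distances(X_train, B, X_new):
    """Estimate the distances from each new output to every training output."""
    return pairwise_distances(X_new, X_train) @ B


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(20, 2))           # toy inputs
    Y = X @ np.array([[1.0], [-2.0]])      # toy linear targets
    B = fit_distance_map(X, Y, lam=1e-2)
    est = predict_output_distances(X, B, rng.normal(size=(5, 2)))
    print(B.shape, est.shape)
```

With `lam > 0` the matrix in the normal equations is positive definite, so the system has exactly one solution, and increasing `lam` shrinks the coefficients in `B`.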
id UFC-7_f882860dadce3b80b77bca77f55deda4
oai_identifier_str oai:repositorio.ufc.br:riufc/61137
network_acronym_str UFC-7
network_name_str Repositório Institucional da Universidade Federal do Ceará (UFC)
repository_id_str
dc.title.pt_BR.fl_str_mv On model complexity reduction in instance-based learners
dc.title.en.pt_BR.fl_str_mv On model complexity reduction in instance-based learners
title On model complexity reduction in instance-based learners
author Oliveira, Saulo Anderson Freitas de
author_facet Oliveira, Saulo Anderson Freitas de
author_role author
dc.contributor.co-advisor.none.fl_str_mv Rocha Neto, Ajalmar Rêgo da
dc.contributor.author.fl_str_mv Oliveira, Saulo Anderson Freitas de
dc.contributor.advisor1.fl_str_mv Gomes, João Paulo Pordeus
contributor_str_mv Gomes, João Paulo Pordeus
dc.subject.por.fl_str_mv Instance selection
Minimal learning machine
Regularization
Least-squares support vector machine
topic Instance selection
Minimal learning machine
Regularization
Least-squares support vector machine
publishDate 2021
dc.date.accessioned.fl_str_mv 2021-10-13T14:59:03Z
dc.date.available.fl_str_mv 2021-10-13T14:59:03Z
dc.date.issued.fl_str_mv 2021
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/doctoralThesis
format doctoralThesis
status_str publishedVersion
dc.identifier.citation.fl_str_mv OLIVEIRA, Saulo Anderson Freitas de. On model complexity reduction in instance-based learners. 2021. 106 f. Tese (Doutorado em Ciência da Computação) - Centro de Ciências, Universidade Federal do Ceará, Fortaleza, 2021.
dc.identifier.uri.fl_str_mv http://www.repositorio.ufc.br/handle/riufc/61137
identifier_str_mv OLIVEIRA, Saulo Anderson Freitas de. On model complexity reduction in instance-based learners. 2021. 106 f. Tese (Doutorado em Ciência da Computação) - Centro de Ciências, Universidade Federal do Ceará, Fortaleza, 2021.
url http://www.repositorio.ufc.br/handle/riufc/61137
dc.language.iso.fl_str_mv eng
language eng
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.source.none.fl_str_mv reponame:Repositório Institucional da Universidade Federal do Ceará (UFC)
instname:Universidade Federal do Ceará (UFC)
instacron:UFC
instname_str Universidade Federal do Ceará (UFC)
instacron_str UFC
institution UFC
reponame_str Repositório Institucional da Universidade Federal do Ceará (UFC)
collection Repositório Institucional da Universidade Federal do Ceará (UFC)
bitstream.url.fl_str_mv http://repositorio.ufc.br/bitstream/riufc/61137/3/2021_tese_safoliveira.pdf
http://repositorio.ufc.br/bitstream/riufc/61137/4/license.txt
bitstream.checksum.fl_str_mv 83b97cfd9851ef4d65bb885ecfe5395c
8a4605be74aa9ea9d79846c1fba20a33
bitstream.checksumAlgorithm.fl_str_mv MD5
MD5
repository.name.fl_str_mv Repositório Institucional da Universidade Federal do Ceará (UFC) - Universidade Federal do Ceará (UFC)
repository.mail.fl_str_mv bu@ufc.br || repositorio@ufc.br
_version_ 1847793409386872832