Future-Shot: Few-Shot Learning to tackle new labels on high-dimensional classification problems

Bibliographic details
Year of defense: 2024
Main author: Camargo, Fernando Henrique Fernandes de
Advisor: Soares, Anderson da Silva
Examination committee: Soares, Anderson da Silva, Galvão Filho, Arlindo Rodrigues, Vieira, Flávio Henrique Teles, Gomes, Herman Martins, Lotufo, Roberto de Alencar
Document type: Thesis
Access type: Open access
Language: eng
Defending institution: Universidade Federal de Goiás
Graduate program: Programa de Pós-graduação em Ciência da Computação (INF)
Department: Instituto de Informática - INF (RMG)
Country: Brazil
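Keywords in Portuguese: Classificação em alta dimensão; Predição de laboratório de origem
Keywords in English: Few-shot learning; Machine learning; Metric learning
CNPq knowledge area: CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO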
Access link: http://repositorio.bc.ufg.br/tede/handle/tede/13342
Abstract: This thesis introduces a novel approach to high-dimensional multiclass classification, particularly in dynamic environments where new classes emerge. Named Future-Shot, the method employs metric learning, specifically triplet learning, to train a model capable of generating embeddings for both data points and classes within a shared vector space. This enables efficient similarity comparisons using techniques such as k-nearest neighbors (KNN), so that new classes can be integrated without extensive retraining. Tested on lab-of-origin prediction tasks using the Addgene dataset, Future-Shot achieves a top-10 accuracy of 90.39%, surpassing existing methods. Notably, in few-shot learning scenarios, it achieves an average top-10 accuracy of 81.2% with just 30% of the data for new classes, demonstrating robustness and efficiency in adapting to evolving class structures.
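Illustrative sketch (not taken from the thesis): the abstract describes ranking classes by similarity to a data point in a shared embedding space and reports top-10 accuracy. The Python below shows that idea on toy vectors. All names (cosine, top_k_classes, add_class, top_k_accuracy) and the lab labels are hypothetical. In particular, registering a new class as the mean of a few example embeddings is an assumption made here for illustration; the record does not say how Future-Shot builds embeddings for new classes, and the trained embedding model itself is not shown.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def top_k_classes(point_emb, class_embs, k=10):
    """Return the k class labels whose embeddings are most similar to point_emb."""
    ranked = sorted(
        class_embs,
        key=lambda label: cosine(point_emb, class_embs[label]),
        reverse=True,
    )
    return ranked[:k]


def add_class(class_embs, label, example_embs):
    """Register a new class without retraining.

    Assumption for illustration only: the class embedding is the mean of a
    few example embeddings. The thesis may use a different procedure.
    """
    n = len(example_embs)
    dim = len(example_embs[0])
    class_embs[label] = [sum(e[i] for e in example_embs) / n for i in range(dim)]


def top_k_accuracy(points, labels, class_embs, k=10):
    """Fraction of points whose true label appears among the top-k classes."""
    hits = sum(
        label in top_k_classes(p, class_embs, k) for p, label in zip(points, labels)
    )
    return hits / len(points)


# Toy 2-D embeddings standing in for the output of a trained model.
class_embs = {"lab_a": [1.0, 0.0], "lab_b": [0.0, 1.0]}

# A new lab appears later; only its class embedding is added.
add_class(class_embs, "lab_c", [[-1.0, 0.1], [-0.9, -0.1]])

query = [-0.8, 0.05]
print(top_k_classes(query, class_embs, k=1))
```

With this layout, adding a class only touches the dictionary of class embeddings, which is the property the abstract refers to as integrating new classes without extensive retraining.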
id UFG-2_48526b4ffa0df7eca6c24d7439aed6b1
oai_identifier_str oai:repositorio.bc.ufg.br:tede/13342
network_acronym_str UFG-2
network_name_str Repositório Institucional da UFG
repository_id_str
dc.title.none.fl_str_mv Future-Shot: Few-Shot Learning to tackle new labels on high-dimensional classification problems
dc.title.alternative.por.fl_str_mv Future-Shot: Few-Shot Learning para lidar com novos rótulos em problemas de classificação de alta dimensão
title Future-Shot: Few-Shot Learning to tackle new labels on high-dimensional classification problems
spellingShingle Future-Shot: Few-Shot Learning to tackle new labels on high-dimensional classification problems
Camargo, Fernando Henrique Fernandes de
Classificação em alta dimensão
Predição de laboratório de origem
Few-shot learning
Machine learning
Metric learning
CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO
title_short Future-Shot: Few-Shot Learning to tackle new labels on high-dimensional classification problems
title_full Future-Shot: Few-Shot Learning to tackle new labels on high-dimensional classification problems
title_fullStr Future-Shot: Few-Shot Learning to tackle new labels on high-dimensional classification problems
title_full_unstemmed Future-Shot: Few-Shot Learning to tackle new labels on high-dimensional classification problems
title_sort Future-Shot: Few-Shot Learning to tackle new labels on high-dimensional classification problems
author Camargo, Fernando Henrique Fernandes de
author_facet Camargo, Fernando Henrique Fernandes de
author_role author
dc.contributor.advisor1.fl_str_mv Soares, Anderson da Silva
dc.contributor.advisor1Lattes.fl_str_mv http://lattes.cnpq.br/1096941114079527
dc.contributor.referee1.fl_str_mv Soares, Anderson da Silva
dc.contributor.referee2.fl_str_mv Galvão Filho, Arlindo Rodrigues
dc.contributor.referee3.fl_str_mv Vieira, Flávio Henrique Teles
dc.contributor.referee4.fl_str_mv Gomes, Herman Martins
dc.contributor.referee5.fl_str_mv Lotufo, Roberto de Alencar
dc.contributor.authorLattes.fl_str_mv http://lattes.cnpq.br/0401456515486306
dc.contributor.author.fl_str_mv Camargo, Fernando Henrique Fernandes de
contributor_str_mv Soares, Anderson da Silva
Soares, Anderson da Silva
Galvão Filho, Arlindo Rodrigues
Vieira, Flávio Henrique Teles
Gomes, Herman Martins
Lotufo, Roberto de Alencar
dc.subject.por.fl_str_mv Classificação em alta dimensão
Predição de laboratório de origem
topic Classificação em alta dimensão
Predição de laboratório de origem
Few-shot learning
Machine learning
Metric learning
CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO
dc.subject.eng.fl_str_mv Few-shot learning
Machine learning
Metric learning
dc.subject.cnpq.fl_str_mv CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO
description This thesis introduces a novel approach to high-dimensional multiclass classification, particularly in dynamic environments where new classes emerge. Named Future-Shot, the method employs metric learning, specifically triplet learning, to train a model capable of generating embeddings for both data points and classes within a shared vector space. This enables efficient similarity comparisons using techniques such as k-nearest neighbors (KNN), so that new classes can be integrated without extensive retraining. Tested on lab-of-origin prediction tasks using the Addgene dataset, Future-Shot achieves a top-10 accuracy of 90.39%, surpassing existing methods. Notably, in few-shot learning scenarios, it achieves an average top-10 accuracy of 81.2% with just 30% of the data for new classes, demonstrating robustness and efficiency in adapting to evolving class structures.
publishDate 2024
dc.date.accessioned.fl_str_mv 2024-09-16T18:20:52Z
dc.date.available.fl_str_mv 2024-09-16T18:20:52Z
dc.date.issued.fl_str_mv 2024-02-23
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/doctoralThesis
format doctoralThesis
status_str publishedVersion
dc.identifier.citation.fl_str_mv CAMARGO, F. H. F. Future-Shot: Few-Shot Learning to tackle new labels on high-dimensional classification problems. 2024. 75 f. Tese (Doutorado em Ciência da Computação) - Instituto de Informática, Universidade Federal de Goiás, Goiânia, 2024.
dc.identifier.uri.fl_str_mv http://repositorio.bc.ufg.br/tede/handle/tede/13342
identifier_str_mv CAMARGO, F. H. F. Future-Shot: Few-Shot Learning to tackle new labels on high-dimensional classification problems. 2024. 75 f. Tese (Doutorado em Ciência da Computação) - Instituto de Informática, Universidade Federal de Goiás, Goiânia, 2024.
url http://repositorio.bc.ufg.br/tede/handle/tede/13342
dc.language.iso.fl_str_mv eng
language eng
dc.rights.driver.fl_str_mv Attribution-NonCommercial-NoDerivatives 4.0 International
info:eu-repo/semantics/openAccess
rights_invalid_str_mv Attribution-NonCommercial-NoDerivatives 4.0 International
eu_rights_str_mv openAccess
dc.publisher.none.fl_str_mv Universidade Federal de Goiás
dc.publisher.program.fl_str_mv Programa de Pós-graduação em Ciência da Computação (INF)
dc.publisher.initials.fl_str_mv UFG
dc.publisher.country.fl_str_mv Brasil
dc.publisher.department.fl_str_mv Instituto de Informática - INF (RMG)
publisher.none.fl_str_mv Universidade Federal de Goiás
dc.source.none.fl_str_mv reponame:Repositório Institucional da UFG
instname:Universidade Federal de Goiás (UFG)
instacron:UFG
instname_str Universidade Federal de Goiás (UFG)
instacron_str UFG
institution UFG
reponame_str Repositório Institucional da UFG
collection Repositório Institucional da UFG
bitstream.url.fl_str_mv http://repositorio.bc.ufg.br/tede/bitstreams/ed8b7076-e1ba-4e0a-a8ae-670fdf44e4ee/download
http://repositorio.bc.ufg.br/tede/bitstreams/b6e89a0b-58d6-42f4-b0fd-9c29751a40e6/download
http://repositorio.bc.ufg.br/tede/bitstreams/ee8332e8-a402-42e1-a37b-ff50c32e7b30/download
bitstream.checksum.fl_str_mv 8a4605be74aa9ea9d79846c1fba20a33
4460e5956bc1d1639be9ae6146a50347
285e49eb4a8596f6379b72278d5ef20f
bitstream.checksumAlgorithm.fl_str_mv MD5
MD5
MD5
repository.name.fl_str_mv Repositório Institucional da UFG - Universidade Federal de Goiás (UFG)
repository.mail.fl_str_mv grt.bc@ufg.br
_version_ 1861293892378296320