Aprendiz de descritores de mistura gaussiana

Bibliographic details
Year of defense: 2017
Main author: Freitas, Breno Lima de
Advisor: Almeida, Tiago Agostinho de
Examination committee: Not provided by the institution
Document type: Master's thesis (Dissertação)
Access type: Open access
Language: por
Defending institution: Universidade Federal de São Carlos
Câmpus Sorocaba
Graduate program: Programa de Pós-Graduação em Ciência da Computação - PPGCC-So
Department: Not provided by the institution
Country: Not provided by the institution
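Keywords in Portuguese: Princípio da descrição mais simples; Mistura Gaussiana; Classificação; Aprendizado de máquina
Keywords in English: Minimum description length principle; Gaussian mixture; Classifiers; Machine learning
CNPq knowledge area: CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO
Access link: https://repositorio.ufscar.br/handle/20.500.14289/9249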
Abstract: Over the last decades, many machine learning methods have been proposed for categorizing data. Given many tentative models, these methods try to find the one that fits the dataset by building a hypothesis that predicts unseen samples reasonably well. One of the main concerns in that regard is selecting a model that performs well on unseen samples without overfitting the known data. In this work, we introduce a classification method based on the minimum description length principle, which naturally offers a tradeoff between model complexity and data fit. The proposed method is multiclass, online, and generic with regard to data representation. Experiments conducted on real datasets with many different characteristics have shown that the proposed method is statistically equivalent to classical baseline methods from the literature in the offline scenario and that it performed better than some of them in an online scenario. Moreover, the method has proven to be robust to overfitting and to data normalization, which are valuable properties for a classifier that must deal with large, complex, real-world classification problems.
id SCAR_e251d44a6b12ea3bec00237b83179839
oai_identifier_str oai:repositorio.ufscar.br:20.500.14289/9249
network_acronym_str SCAR
network_name_str Repositório Institucional da UFSCAR
repository_id_str
spelling Freitas, Breno Lima de. Aprendiz de descritores de mistura gaussiana (Gaussian mixture descriptors learner). Abstract (translated from the Portuguese): Over the last decades, several machine learning methods have been proposed to classify data. Among the candidate models, the goal is to select one that fits the training data well, producing a hypothesis that makes good predictions on previously unseen samples. One of the greatest challenges is selecting a model whose hypothesis is not overfitted to the known data and is general enough to make good future predictions. This work presents a classification method based on the minimum description length principle, which makes a beneficial tradeoff between model complexity and fit to the data. The proposed method is multiclass and incremental, and it can be used on data with categorical, numerical, and continuous attributes. Experiments conducted on real datasets with diverse characteristics showed that the proposed method is statistically equivalent to classical methods from the literature in the offline scenario and superior to some methods in the incremental learning scenario. In addition, the method proved robust to overfitting and to data normalization, which are beneficial characteristics for a classification method that can be applied today. Funding: none received.
dc.title.por.fl_str_mv Aprendiz de descritores de mistura gaussiana
dc.title.alternative.eng.fl_str_mv Gaussian mixture descriptors learner
title Aprendiz de descritores de mistura gaussiana
spellingShingle Aprendiz de descritores de mistura gaussiana
Freitas, Breno Lima de
Princípio da descrição mais simples
Mistura Gaussiana
Classificação
Aprendizado de máquina
Minimum description length principle
Gaussian mixture
Classifiers
Machine learning
CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO
title_short Aprendiz de descritores de mistura gaussiana
title_full Aprendiz de descritores de mistura gaussiana
title_fullStr Aprendiz de descritores de mistura gaussiana
title_full_unstemmed Aprendiz de descritores de mistura gaussiana
title_sort Aprendiz de descritores de mistura gaussiana
author Freitas, Breno Lima de
author_facet Freitas, Breno Lima de
author_role author
dc.contributor.authorlattes.por.fl_str_mv http://lattes.cnpq.br/9494175519218074
dc.contributor.author.fl_str_mv Freitas, Breno Lima de
dc.contributor.advisor1.fl_str_mv Almeida, Tiago Agostinho de
dc.contributor.advisor1Lattes.fl_str_mv http://lattes.cnpq.br/5368680512020633
dc.contributor.authorID.fl_str_mv 872244ab-b0cf-47fc-b4d6-dc8ff9434bd0
contributor_str_mv Almeida, Tiago Agostinho de
dc.subject.por.fl_str_mv Princípio da descrição mais simples
Mistura Gaussiana
Classificação
Aprendizado de máquina
topic Princípio da descrição mais simples
Mistura Gaussiana
Classificação
Aprendizado de máquina
Minimum description length principle
Gaussian mixture
Classifiers
Machine learning
CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO
dc.subject.eng.fl_str_mv Minimum description length principle
Gaussian mixture
Classifiers
Machine learning
dc.subject.cnpq.fl_str_mv CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO
description Over the last decades, many machine learning methods have been proposed for categorizing data. Given many tentative models, these methods try to find the one that fits the dataset by building a hypothesis that predicts unseen samples reasonably well. One of the main concerns in that regard is selecting a model that performs well on unseen samples without overfitting the known data. In this work, we introduce a classification method based on the minimum description length principle, which naturally offers a tradeoff between model complexity and data fit. The proposed method is multiclass, online, and generic with regard to data representation. Experiments conducted on real datasets with many different characteristics have shown that the proposed method is statistically equivalent to classical baseline methods from the literature in the offline scenario and that it performed better than some of them in an online scenario. Moreover, the method has proven to be robust to overfitting and to data normalization, which are valuable properties for a classifier that must deal with large, complex, real-world classification problems.
publishDate 2017
dc.date.issued.fl_str_mv 2017-12-14
dc.date.accessioned.fl_str_mv 2018-01-12T10:36:25Z
dc.date.available.fl_str_mv 2018-01-12T10:36:25Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/masterThesis
format masterThesis
status_str publishedVersion
dc.identifier.citation.fl_str_mv FREITAS, Breno Lima de. Aprendiz de descritores de mistura gaussiana. 2017. Dissertação (Mestrado em Ciência da Computação) – Universidade Federal de São Carlos, Sorocaba, 2017. Disponível em: https://repositorio.ufscar.br/handle/20.500.14289/9249.
dc.identifier.uri.fl_str_mv https://repositorio.ufscar.br/handle/20.500.14289/9249
identifier_str_mv FREITAS, Breno Lima de. Aprendiz de descritores de mistura gaussiana. 2017. Dissertação (Mestrado em Ciência da Computação) – Universidade Federal de São Carlos, Sorocaba, 2017. Disponível em: https://repositorio.ufscar.br/handle/20.500.14289/9249.
url https://repositorio.ufscar.br/handle/20.500.14289/9249
dc.language.iso.fl_str_mv por
language por
dc.relation.confidence.fl_str_mv 600
600
dc.relation.authority.fl_str_mv 5de967ad-743c-4f36-972b-79dd683c0e9d
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.publisher.none.fl_str_mv Universidade Federal de São Carlos
Câmpus Sorocaba
dc.publisher.program.fl_str_mv Programa de Pós-Graduação em Ciência da Computação - PPGCC-So
dc.publisher.initials.fl_str_mv UFSCar
publisher.none.fl_str_mv Universidade Federal de São Carlos
Câmpus Sorocaba
dc.source.none.fl_str_mv reponame:Repositório Institucional da UFSCAR
instname:Universidade Federal de São Carlos (UFSCAR)
instacron:UFSCAR
instname_str Universidade Federal de São Carlos (UFSCAR)
instacron_str UFSCAR
institution UFSCAR
reponame_str Repositório Institucional da UFSCAR
collection Repositório Institucional da UFSCAR
bitstream.url.fl_str_mv https://repositorio.ufscar.br/bitstreams/ad2ebb7e-83af-4edb-b288-5c9a0155d3d7/download
https://repositorio.ufscar.br/bitstreams/a61cfb8b-d4d1-4ec9-bf61-7f8a15698ee1/download
https://repositorio.ufscar.br/bitstreams/b9f6ce4a-3485-4bd2-8a6a-e8ca0bbcb99d/download
https://repositorio.ufscar.br/bitstreams/70950fd8-d052-4bc5-87cf-81cc0b649cf3/download
https://repositorio.ufscar.br/bitstreams/fff6a0ba-c715-43a0-8275-9be7e5598080/download
https://repositorio.ufscar.br/bitstreams/6c266d9d-8546-48c1-bbfb-4deed0a42280/download
https://repositorio.ufscar.br/bitstreams/7889230f-cdea-4142-aaad-78261a5ef6f8/download
bitstream.checksum.fl_str_mv 6fed87a7d149f4ecf1b124c3dbe48d9e
4d20d8676988e631bc8b46f7c2224bcb
ae0398b6f8b235e40ad82cba6c50031d
768ed9ce241c33e2edd8a3334eb90f15
99ad454bd2eed57a698f0c8298b95e45
29828d4cb1e8f136c9549b210cd8bd39
a4700cc6e65f518ecdab87f8bed91716
bitstream.checksumAlgorithm.fl_str_mv MD5
MD5
MD5
MD5
MD5
MD5
MD5
repository.name.fl_str_mv Repositório Institucional da UFSCAR - Universidade Federal de São Carlos (UFSCAR)
repository.mail.fl_str_mv repositorio.sibi@ufscar.br
_version_ 1833925332550287360