Aprendiz de descritores de mistura gaussiana
Year of defense: | 2017 |
---|---|
Main author: | Freitas, Breno Lima de |
Advisor: | Almeida, Tiago Agostinho de |
Examination committee: | |
Document type: | Master's thesis |
Access type: | Open access |
Language: | por |
Defending institution: | Universidade Federal de São Carlos, Câmpus Sorocaba |
Graduate program: | Programa de Pós-Graduação em Ciência da Computação - PPGCC-So |
Department: | Not provided by the institution |
Country: | Not provided by the institution |
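Keywords in Portuguese: | Princípio da descrição mais simples; Mistura Gaussiana; Classificação; Aprendizado de máquina |
Keywords in English: | Minimum description length principle; Gaussian mixture; Classifiers; Machine learning |
CNPq knowledge area: | CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO |
Access link: | https://repositorio.ufscar.br/handle/20.500.14289/9249 |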
Abstract: | Over the last decades, many machine learning methods have been proposed for categorizing data. Given a set of candidate models, these methods try to find the one that fits the dataset, building a hypothesis that predicts unseen samples reasonably well. A main concern is selecting a model that performs well on unseen samples without overfitting the known data. In this work, we introduce a classification method based on the minimum description length principle, which naturally offers a tradeoff between model complexity and data fit. The proposed method is multiclass, online, and generic with regard to data representation. Experiments conducted on real datasets with many different characteristics have shown that the proposed method is statistically equivalent to classical baseline methods from the literature in the offline scenario, and that it outperformed some of them in an online scenario. Moreover, the method has proven robust to overfitting and to data normalization, which are valuable properties for a classifier that must deal with large, complex, real-world classification problems. |
id |
SCAR_e251d44a6b12ea3bec00237b83179839 |
---|---|
oai_identifier_str |
oai:repositorio.ufscar.br:20.500.14289/9249 |
network_acronym_str |
SCAR |
network_name_str |
Repositório Institucional da UFSCAR |
repository_id_str |
|
spelling |
Funding: none received ("Não recebi financiamento"). License: non-exclusive distribution license granted to UFSCar. Bitstreams: tese.pdf (application/pdf, size 2292309); atestado-deposito-assinado.pdf ("carta comprovante", application/pdf, size 159376); license.txt (text/plain; charset=utf-8, size 1957); tese.pdf.txt (extracted text, text/plain, size 153108); atestado-deposito-assinado.pdf.txt (extracted text, text/plain, size 1259); tese.pdf.jpg (thumbnail, image/jpeg, size 4631); atestado-deposito-assinado.pdf.jpg (thumbnail, image/jpeg, size 5725). Record timestamp: 2025-02-05 17:45:47.047. OAI endpoint: https://repositorio.ufscar.br/oai/request (opendoar:4322). |
dc.title.por.fl_str_mv |
Aprendiz de descritores de mistura gaussiana |
dc.title.alternative.eng.fl_str_mv |
Gaussian mixture descriptors learner |
title |
Aprendiz de descritores de mistura gaussiana |
spellingShingle |
Aprendiz de descritores de mistura gaussiana Freitas, Breno Lima de Princípio da descrição mais simples Mistura Gaussiana Classificação Aprendizado de máquina Minimum description length principle Gaussian mixture Classifiers Machine learning CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO |
author |
Freitas, Breno Lima de |
author_facet |
Freitas, Breno Lima de |
author_role |
author |
dc.contributor.authorlattes.por.fl_str_mv |
http://lattes.cnpq.br/9494175519218074 |
dc.contributor.author.fl_str_mv |
Freitas, Breno Lima de |
dc.contributor.advisor1.fl_str_mv |
Almeida, Tiago Agostinho de |
dc.contributor.advisor1Lattes.fl_str_mv |
http://lattes.cnpq.br/5368680512020633 |
dc.contributor.authorID.fl_str_mv |
872244ab-b0cf-47fc-b4d6-dc8ff9434bd0 |
contributor_str_mv |
Almeida, Tiago Agostinho de |
dc.subject.por.fl_str_mv |
Princípio da descrição mais simples Mistura Gaussiana Classificação Aprendizado de máquina |
topic |
Princípio da descrição mais simples Mistura Gaussiana Classificação Aprendizado de máquina Minimum description length principle Gaussian mixture Classifiers Machine learning CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO |
dc.subject.eng.fl_str_mv |
Minimum description length principle Gaussian mixture Classifiers Machine learning |
dc.subject.cnpq.fl_str_mv |
CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO |
publishDate |
2017 |
dc.date.issued.fl_str_mv |
2017-12-14 |
dc.date.accessioned.fl_str_mv |
2018-01-12T10:36:25Z |
dc.date.available.fl_str_mv |
2018-01-12T10:36:25Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/masterThesis |
format |
masterThesis |
status_str |
publishedVersion |
dc.identifier.citation.fl_str_mv |
FREITAS, Breno Lima de. Aprendiz de descritores de mistura gaussiana. 2017. Dissertação (Mestrado em Ciência da Computação) – Universidade Federal de São Carlos, Sorocaba, 2017. Disponível em: https://repositorio.ufscar.br/handle/20.500.14289/9249. |
dc.identifier.uri.fl_str_mv |
https://repositorio.ufscar.br/handle/20.500.14289/9249 |
identifier_str_mv |
FREITAS, Breno Lima de. Aprendiz de descritores de mistura gaussiana. 2017. Dissertação (Mestrado em Ciência da Computação) – Universidade Federal de São Carlos, Sorocaba, 2017. Disponível em: https://repositorio.ufscar.br/handle/20.500.14289/9249. |
url |
https://repositorio.ufscar.br/handle/20.500.14289/9249 |
dc.language.iso.fl_str_mv |
por |
language |
por |
dc.relation.confidence.fl_str_mv |
600 600 |
dc.relation.authority.fl_str_mv |
5de967ad-743c-4f36-972b-79dd683c0e9d |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.publisher.none.fl_str_mv |
Universidade Federal de São Carlos Câmpus Sorocaba |
dc.publisher.program.fl_str_mv |
Programa de Pós-Graduação em Ciência da Computação - PPGCC-So |
dc.publisher.initials.fl_str_mv |
UFSCar |
publisher.none.fl_str_mv |
Universidade Federal de São Carlos Câmpus Sorocaba |
dc.source.none.fl_str_mv |
reponame:Repositório Institucional da UFSCAR instname:Universidade Federal de São Carlos (UFSCAR) instacron:UFSCAR |
instname_str |
Universidade Federal de São Carlos (UFSCAR) |
instacron_str |
UFSCAR |
institution |
UFSCAR |
reponame_str |
Repositório Institucional da UFSCAR |
collection |
Repositório Institucional da UFSCAR |
bitstream.url.fl_str_mv |
https://repositorio.ufscar.br/bitstreams/ad2ebb7e-83af-4edb-b288-5c9a0155d3d7/download https://repositorio.ufscar.br/bitstreams/a61cfb8b-d4d1-4ec9-bf61-7f8a15698ee1/download https://repositorio.ufscar.br/bitstreams/b9f6ce4a-3485-4bd2-8a6a-e8ca0bbcb99d/download https://repositorio.ufscar.br/bitstreams/70950fd8-d052-4bc5-87cf-81cc0b649cf3/download https://repositorio.ufscar.br/bitstreams/fff6a0ba-c715-43a0-8275-9be7e5598080/download https://repositorio.ufscar.br/bitstreams/6c266d9d-8546-48c1-bbfb-4deed0a42280/download https://repositorio.ufscar.br/bitstreams/7889230f-cdea-4142-aaad-78261a5ef6f8/download |
bitstream.checksum.fl_str_mv |
6fed87a7d149f4ecf1b124c3dbe48d9e 4d20d8676988e631bc8b46f7c2224bcb ae0398b6f8b235e40ad82cba6c50031d 768ed9ce241c33e2edd8a3334eb90f15 99ad454bd2eed57a698f0c8298b95e45 29828d4cb1e8f136c9549b210cd8bd39 a4700cc6e65f518ecdab87f8bed91716 |
bitstream.checksumAlgorithm.fl_str_mv |
MD5 MD5 MD5 MD5 MD5 MD5 MD5 |
repository.name.fl_str_mv |
Repositório Institucional da UFSCAR - Universidade Federal de São Carlos (UFSCAR) |
repository.mail.fl_str_mv |
repositorio.sibi@ufscar.br |
_version_ |
1833925332550287360 |