Contributions on latent projections for Gaussian process modeling

Bibliographic details
Year of defense: 2020
Main author: Souza, Daniel Augusto Ramos Macedo Antunes de
Advisor: Gomes, João Paulo Pordeus
Co-advisor: Mattos, César Lincoln Cavalcante
Defense committee: Not informed by the institution
Document type: Master's thesis (dissertação)
Access type: Open access
Language: English (eng)
Institution of defense: Universidade Federal do Ceará (UFC)
Graduate program: Mestrado em Ciência da Computação (Master's in Computer Science)
Department: Not informed by the institution
Country: Not informed by the institution
Keywords: Machine learning; Gaussian processes; Variational inference; Deep learning; Manifold learning
Citation: SOUZA, Daniel Augusto Ramos Macedo Antunes de. Contributions on latent projections for Gaussian process modeling. 2020. 72 f. Dissertação (Mestrado em Ciência da Computação) - Universidade Federal do Ceará, Fortaleza, 2020.
Access link: http://www.repositorio.ufc.br/handle/riufc/55580
Abstract: Projecting data to a latent space is a routine procedure in machine learning. One incentive for such transformations is the manifold hypothesis, which states that most data sampled from empirical processes tend to lie in a lower-dimensional space. Since this smaller representation is not directly visible in the dataset, probabilistic machine learning techniques are needed to accurately propagate uncertainty from the data to the latent representation. In particular, Gaussian processes (GP) are a family of probabilistic kernel methods that researchers have successfully applied to regression and dimensionality reduction tasks. However, for dimensionality reduction, efficient and deterministic variational inference exists only for a minimal set of kernels. As such, I propose the unscented Gaussian process latent variable model (UGPLVM), an alternative inference method for Bayesian Gaussian process latent variable models that uses the unscented transformation to permit the use of arbitrary kernels while remaining sample efficient. For regression with GP models, the compositional deep Gaussian process (DGP) is a popular model that uses successive mappings to latent spaces to alleviate the burden of choosing a kernel function. However, that is not the only possible DGP construction. In this dissertation, I propose another DGP construction in which each layer controls the smoothness of the next layer, instead of directly composing layer outputs into layer inputs. This model is called the deep Mahalanobis Gaussian process (DMGP); it builds on previous literature on the integration of Mahalanobis kernel hyperparameters and thus incorporates the idea of locally linear projections. Both proposals use deterministic variational inference while matching the results and scalability of non-deterministic methods in various experimental tasks. The experiments for UGPLVM cover dimensionality reduction and simulation of dynamic systems with uncertainty propagation; for DMGP, they cover regression tasks with synthetic and empirical datasets.
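The unscented transformation at the heart of UGPLVM propagates a Gaussian-distributed latent point through an arbitrary nonlinearity by evaluating it at a small, deterministically chosen set of "sigma points", rather than by Monte Carlo sampling. The following is a minimal illustrative sketch of the standard (Merwe-style) transform, not the dissertation's exact formulation; the function name, parameter defaults, and the linear test function are my own assumptions:

```python
import numpy as np

def unscented_transform(f, mu, cov, alpha=0.1, beta=2.0, kappa=0.0):
    """Approximate the mean and covariance of f(x) for x ~ N(mu, cov)
    using 2D+1 deterministic sigma points (D = latent dimension)."""
    D = mu.shape[0]
    lam = alpha**2 * (D + kappa) - D
    # Sigma points: mu, plus/minus scaled columns of a square root of cov.
    L = np.linalg.cholesky((D + lam) * cov)
    sigma = np.vstack([mu, mu + L.T, mu - L.T])      # shape (2D+1, D)
    # Weights for the mean (wm) and covariance (wc) estimates.
    wm = np.full(2 * D + 1, 1.0 / (2.0 * (D + lam)))
    wc = wm.copy()
    wm[0] = lam / (D + lam)
    wc[0] = wm[0] + (1.0 - alpha**2 + beta)
    # Propagate each sigma point through the (arbitrary) nonlinearity.
    Y = np.array([f(s) for s in sigma])
    mean = wm @ Y
    diff = Y - mean
    return mean, (wc[:, None] * diff).T @ diff
```

For a linear map the transform is exact, which makes a convenient sanity check; for nonlinear maps (e.g. kernel evaluations at uncertain inputs) it gives a deterministic, sample-efficient approximation using only 2D+1 function evaluations.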
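The Mahalanobis kernels underlying the DMGP replace the usual per-dimension lengthscales with a full linear projection, so that distances are measured in a learned (possibly lower-dimensional) subspace. A minimal sketch of such a squared-exponential kernel, assuming a global projection matrix W (the dissertation's DMGP makes this projection input-dependent, i.e. locally linear; the function name and parameterization here are my own):

```python
import numpy as np

def mahalanobis_rbf(X, Z, W, variance=1.0):
    """Squared-exponential kernel on projected inputs:
        k(x, z) = variance * exp(-0.5 * ||W x - W z||^2)
    With W of shape (Q, D) and Q < D, distances are taken in a learned
    Q-dimensional subspace, generalizing ARD lengthscales (W diagonal)
    to an arbitrary linear map."""
    PX, PZ = X @ W.T, Z @ W.T                        # project both input sets
    d2 = ((PX[:, None, :] - PZ[None, :, :]) ** 2).sum(-1)  # pairwise sq. dists
    return variance * np.exp(-0.5 * d2)
```

Setting W = diag(1/lengthscales) recovers the standard ARD RBF kernel, so this family strictly contains the usual stationary GP prior while also encoding linear dimensionality reduction inside the kernel.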