Contributions on latent projections for Gaussian process modeling
| Year of defense: | 2020 |
|---|---|
| Main author: | Souza, Daniel Augusto Ramos Macedo Antunes de |
| Advisor: | Gomes, João Paulo Pordeus |
| Co-advisor: | Mattos, César Lincoln Cavalcante |
| Defense committee: | |
| Document type: | Master's thesis (Dissertação) |
| Access type: | Open access |
| Language: | English |
| Institution: | Universidade Federal do Ceará (UFC) |
| Graduate program: | Ciência da Computação (Computer Science) |
| Department: | Not informed by the institution |
| Country: | Brasil |
| Keywords: | Machine learning; Gaussian processes; Variational inference; Deep learning; Manifold learning |
| Access link: | http://www.repositorio.ufc.br/handle/riufc/55580 |
| Abstract: | Projecting data to a latent space is a routine procedure in machine learning. One motivation for such transformations is the manifold hypothesis, which states that most data sampled from empirical processes tend to lie on a lower-dimensional manifold. Since this smaller representation is not directly observed in the dataset, probabilistic machine learning techniques can accurately propagate uncertainty in the data to the latent representation. In particular, Gaussian processes (GPs) are a family of probabilistic kernel methods that have been successfully applied to regression and dimensionality reduction tasks. However, for dimensionality reduction, efficient and deterministic variational inference exists only for a very limited set of kernels. I therefore propose the unscented Gaussian process latent variable model (UGPLVM), an alternative inference method for Bayesian Gaussian process latent variable models that uses the unscented transformation to permit arbitrary kernels while remaining sample efficient. For regression with GP models, the compositional deep Gaussian process (DGP) is a popular model that uses successive mappings to latent spaces to alleviate the burden of choosing a kernel function. However, that is not the only possible DGP construction. In this dissertation, I propose an alternative DGP construction in which each layer controls the smoothness of the next layer, instead of feeding layer outputs directly into layer inputs. This model, called the deep Mahalanobis Gaussian process (DMGP), builds on previous literature on the integration of Mahalanobis kernel hyperparameters and thus incorporates the idea of locally linear projections. Both proposals use deterministic variational inference while matching the results and scalability of non-deterministic methods across various experimental tasks. The experiments for UGPLVM cover dimensionality reduction and simulation of dynamic systems with uncertainty propagation; those for DMGP cover regression tasks on synthetic and empirical datasets. |
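The abstract's UGPLVM hinges on the unscented transformation: propagating a Gaussian distribution through a nonlinear map by pushing a small set of deterministically chosen sigma points through it, rather than integrating analytically or sampling at random. As a rough illustration only — this is the standard Julier–Uhlmann formulation in NumPy, not the dissertation's implementation, and the function name and parameter defaults are illustrative:

```python
import numpy as np

def unscented_transform(mean, cov, f, alpha=1.0, beta=2.0, kappa=0.0):
    """Propagate N(mean, cov) through a nonlinear map f using 2d+1 sigma points.

    alpha/beta/kappa are the usual spread parameters; defaults are illustrative.
    """
    d = mean.shape[0]
    lam = alpha**2 * (d + kappa) - d
    # Matrix square root of (d + lam) * cov via Cholesky factorization
    L = np.linalg.cholesky((d + lam) * cov)
    # Sigma points: the mean, then the mean +/- each Cholesky column
    sigma = np.vstack([mean, mean + L.T, mean - L.T])  # shape (2d+1, d)
    # Weights for the mean and covariance estimates
    wm = np.full(2 * d + 1, 1.0 / (2.0 * (d + lam)))
    wc = wm.copy()
    wm[0] = lam / (d + lam)
    wc[0] = lam / (d + lam) + (1.0 - alpha**2 + beta)
    # Push each sigma point through f and re-estimate the output moments
    Y = np.array([f(x) for x in sigma])
    mean_y = wm @ Y
    diff = Y - mean_y
    cov_y = (wc[:, None] * diff).T @ diff
    return mean_y, cov_y
```

A convenient sanity check: for a linear map `f(x) = A @ x + b`, the transform recovers the exact output mean `A @ m + b` and covariance `A @ P @ A.T`, because the sigma-point moments are exact through affine functions.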
| Citation: | SOUZA, Daniel Augusto Ramos Macedo Antunes de. Contributions on latent projections for Gaussian process modeling. 2020. 72 f. Dissertação (Mestrado em Ciência da Computação) - Universidade Federal do Ceará, Fortaleza, 2020. |
|---|---|
| Full text (PDF): | http://repositorio.ufc.br/bitstream/riufc/55580/3/2020_dis_darmasouza.pdf |