Local dampening: differential privacy for non-numeric queries via local sensitivity

Bibliographic details
Year of defense: 2021
Main author: Farias, Victor Aguiar Evangelista de
Advisor: Machado, Javam de Castro
Defense committee: Not provided by the institution
Document type: Thesis
Access type: Open access
Language: eng
Defending institution: Not provided by the institution
Graduate program: Not provided by the institution
Department: Not provided by the institution
Country: Not provided by the institution
Keywords in Portuguese:
Access link: http://www.repositorio.ufc.br/handle/riufc/59462
Abstract: Differential privacy is the state-of-the-art formal definition for data release under strong privacy guarantees. A variety of mechanisms have been proposed in the literature for releasing the output of numeric queries (e.g., the Laplace mechanism and the smooth sensitivity mechanism). Those mechanisms guarantee differential privacy by adding noise to the query's true output. The amount of noise added is calibrated by the notions of global sensitivity and local sensitivity of the query, which measure the impact of the addition or removal of an individual on the query's output. Mechanisms that use local sensitivity add less noise and, consequently, give more accurate answers. However, although there has been some work on generic mechanisms for releasing the output of non-numeric queries using global sensitivity (e.g., the exponential mechanism), the literature lacks generic mechanisms that release the output of non-numeric queries using local sensitivity to reduce the noise in the query's output. In this work, we remedy this shortcoming and present the local dampening mechanism. We adapt the notion of local sensitivity to the non-numeric setting and leverage it to design a generic non-numeric mechanism. We provide theoretical comparisons to the exponential mechanism and show under which conditions the local dampening mechanism is more accurate than the exponential mechanism. We illustrate the effectiveness of the local dampening mechanism by applying it to three diverse problems: (i) median selection, where we report the median element in the database; (ii) influential node analysis, where, given an influence metric, we release the top-k most influential nodes while preserving the privacy of the relationships between nodes in the network; (iii) decision tree induction, where we provide a private adaptation of the ID3 algorithm to build decision trees from a given tabular dataset. Experimental evaluation shows that, compared to approaches based on global sensitivity, we can reduce the error of the median selection application by up to 18%, reduce the use of privacy budget by 2 to 4 orders of magnitude for the influential node analysis application, and increase accuracy by up to 8% for decision tree induction.
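For context, the global-sensitivity baseline named in the abstract, the exponential mechanism, can be sketched for the median selection problem. The Python below is a minimal illustration and is not taken from the thesis; it does not implement the local dampening mechanism. The function names, the utility function (the negative imbalance between the elements below and above a candidate), and the example data are all assumptions chosen for illustration.

```python
import math
import random
from typing import Callable, Optional, Sequence, TypeVar

R = TypeVar("R")


def exponential_mechanism(
    data: Sequence[float],
    candidates: Sequence[R],
    utility: Callable[[Sequence[float], R], float],
    sensitivity: float,
    epsilon: float,
    rng: Optional[random.Random] = None,
) -> R:
    """Sample one candidate with probability proportional to
    exp(epsilon * utility / (2 * sensitivity)).

    `sensitivity` is the global sensitivity of `utility`: the largest change
    in any candidate's score when one individual is added or removed.
    """
    rng = rng or random.Random()
    scores = [utility(data, r) for r in candidates]
    # Subtracting the best score only rescales the weights (the sampling
    # distribution is unchanged) and avoids overflow in exp().
    top = max(scores)
    weights = [math.exp(epsilon * (s - top) / (2.0 * sensitivity)) for s in scores]
    return rng.choices(candidates, weights=weights, k=1)[0]


def median_utility(data: Sequence[float], r: float) -> float:
    """Hypothetical utility for median selection: 0 when as many elements lie
    below r as above it, more negative as r moves away from the median.
    Adding or removing one record changes this score by at most 1."""
    below = sum(1 for x in data if x < r)
    above = sum(1 for x in data if x > r)
    return -abs(below - above)


if __name__ == "__main__":
    ages = [23, 31, 35, 40, 41, 47, 52, 58, 64]   # illustrative data only
    domain = list(range(18, 91))                   # public set of candidate outputs
    noisy_median = exponential_mechanism(
        ages, domain, median_utility, sensitivity=1.0, epsilon=1.0
    )
    print(noisy_median)
```

Because the weights here are scaled by the worst-case (global) sensitivity, the output is spread more widely than it needs to be on databases where the utility is locally stable; that gap is what the abstract says local sensitivity is meant to reduce.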
id UFC-7_d6585ae7c59220f0f590b2fddc085979
oai_identifier_str oai:repositorio.ufc.br:riufc/59462
network_acronym_str UFC-7
network_name_str Repositório Institucional da Universidade Federal do Ceará (UFC)
repository_id_str
dc.title.pt_BR.fl_str_mv Local dampening: differential privacy for non-numeric queries via local sensitivity
dc.title.en.pt_BR.fl_str_mv Local dampening: differential privacy for non-numeric queries via local sensitivity
title Local dampening: differential privacy for non-numeric queries via local sensitivity
spellingShingle Local dampening: differential privacy for non-numeric queries via local sensitivity
Farias, Victor Aguiar Evangelista de
Differential privacy
Data anonymization
Graph analysis
Decision Trees
title_short Local dampening: differential privacy for non-numeric queries via local sensitivity
title_full Local dampening: differential privacy for non-numeric queries via local sensitivity
title_fullStr Local dampening: differential privacy for non-numeric queries via local sensitivity
title_full_unstemmed Local dampening: differential privacy for non-numeric queries via local sensitivity
title_sort Local dampening: differential privacy for non-numeric queries via local sensitivity
author Farias, Victor Aguiar Evangelista de
author_facet Farias, Victor Aguiar Evangelista de
author_role author
dc.contributor.author.fl_str_mv Farias, Victor Aguiar Evangelista de
dc.contributor.advisor1.fl_str_mv Machado, Javam de Castro
contributor_str_mv Machado, Javam de Castro
dc.subject.por.fl_str_mv Differential privacy
Data anonymization
Graph analysis
Decision Trees
topic Differential privacy
Data anonymization
Graph analysis
Decision Trees
publishDate 2021
dc.date.accessioned.fl_str_mv 2021-07-12T13:55:26Z
dc.date.available.fl_str_mv 2021-07-12T13:55:26Z
dc.date.issued.fl_str_mv 2021
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/doctoralThesis
format doctoralThesis
status_str publishedVersion
dc.identifier.citation.fl_str_mv FARIAS, Victor Aguiar Evangelista de. Local dampening: differential privacy for non-numeric queries via local sensitivity. 2021. 100 f. Tese (Doutorado em Ciência da Computação) - Universidade Federal do Ceará, Fortaleza, 2021.
dc.identifier.uri.fl_str_mv http://www.repositorio.ufc.br/handle/riufc/59462
identifier_str_mv FARIAS, Victor Aguiar Evangelista de. Local dampening: differential privacy for non-numeric queries via local sensitivity. 2021. 100 f. Tese (Doutorado em Ciência da Computação) - Universidade Federal do Ceará, Fortaleza, 2021.
url http://www.repositorio.ufc.br/handle/riufc/59462
dc.language.iso.fl_str_mv eng
language eng
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.source.none.fl_str_mv reponame:Repositório Institucional da Universidade Federal do Ceará (UFC)
instname:Universidade Federal do Ceará (UFC)
instacron:UFC
instname_str Universidade Federal do Ceará (UFC)
instacron_str UFC
institution UFC
reponame_str Repositório Institucional da Universidade Federal do Ceará (UFC)
collection Repositório Institucional da Universidade Federal do Ceará (UFC)
bitstream.url.fl_str_mv http://repositorio.ufc.br/bitstream/riufc/59462/4/license.txt
http://repositorio.ufc.br/bitstream/riufc/59462/3/2021_tese_vaefarias.pdf
bitstream.checksum.fl_str_mv 8a4605be74aa9ea9d79846c1fba20a33
5c43b74d29cf49dd1711eac175b1b754
bitstream.checksumAlgorithm.fl_str_mv MD5
MD5
repository.name.fl_str_mv Repositório Institucional da Universidade Federal do Ceará (UFC) - Universidade Federal do Ceará (UFC)
repository.mail.fl_str_mv bu@ufc.br || repositorio@ufc.br
_version_ 1847793346125234176