
Integração de informações de usuários para inferência de localização geográfica no twitter

Bibliographic details
Year of defense: 2015
Main author: Silvio Soares Ribeiro Junior
Advisor: Not provided by the institution
Defense committee: Not provided by the institution
Document type: Master's thesis
Access type: Open access
Language: por
Defending institution: Universidade Federal de Minas Gerais
Graduate program: Not provided by the institution
Department: Not provided by the institution
Country: Not provided by the institution
Keywords in Portuguese:
Access link: https://hdl.handle.net/1843/ESBF-9XZGNF
Abstract: Knowing the location of a social-network user is essential to many applications that rely on information extracted from those networks, such as event detection and epidemic tracking. In this work, we investigate the task of inferring the location of Twitter users by recognizing that it may be encoded not only in their GPS-tagged messages, but also in the content of their messages, their profile information, and the network of users they interact with. Although previous work has exploited these different sources of information, no single technique is accepted by consensus as the best; instead, there is a range of techniques, each with specific pros and cons. The sparsity of users in a geographic area can be a challenge for many inference methods, because there is not enough data to characterize the area. We therefore begin by discussing how to overcome the sparsity of users in many geographic regions. Grouping and merging locations is one of the main approaches to this challenge, although it sacrifices precision in the inferred location. We propose a metric to evaluate the quality of the generated clusters and present a new strategy for grouping cities. Previous studies have used different performance metrics to evaluate their geo-inference methods, making direct comparison difficult. We apply standardized metrics to a single dataset, comparing methods that use the users' social network, their publications, and the information they voluntarily provide in their profiles. We evaluate different network-based methods and compare their performance on two types of user network: the friendship network and the mention network. We highlight the differences and similarities between these two networks in the geo-inference task.
After showing how vocabulary use in the publication stream changes over time, we propose a new method for inferring a user's location based not only on the text of her messages but also on when they were published. We achieve better precision than similar non-temporal approaches from the literature. We also evaluate the performance of using the profile fields provided by users and of periodically updating the models created to infer the location of new users. Finally, we evaluate the benefits of combining the information output by each method to improve the final performance of the geo-inference task.
id UFMG_546e7ebe80ceeb22a93e49d8eddccaa8
oai_identifier_str oai:repositorio.ufmg.br:1843/ESBF-9XZGNF
network_acronym_str UFMG
network_name_str Repositório Institucional da UFMG
repository_id_str
dc.title.none.fl_str_mv Integração de informações de usuários para inferência de localização geográfica no twitter
title Integração de informações de usuários para inferência de localização geográfica no twitter
spellingShingle Integração de informações de usuários para inferência de localização geográfica no twitter
Silvio Soares Ribeiro Junior
Computação
Twitter
Redes de relações sociais
geoinferência
redes sociais
Twitter
title_short Integração de informações de usuários para inferência de localização geográfica no twitter
title_full Integração de informações de usuários para inferência de localização geográfica no twitter
title_fullStr Integração de informações de usuários para inferência de localização geográfica no twitter
title_full_unstemmed Integração de informações de usuários para inferência de localização geográfica no twitter
title_sort Integração de informações de usuários para inferência de localização geográfica no twitter
author Silvio Soares Ribeiro Junior
author_facet Silvio Soares Ribeiro Junior
author_role author
dc.contributor.author.fl_str_mv Silvio Soares Ribeiro Junior
dc.subject.por.fl_str_mv Computação
Twitter
Redes de relações sociais
geoinferência
redes sociais
Twitter
topic Computação
Twitter
Redes de relações sociais
geoinferência
redes sociais
Twitter
publishDate 2015
dc.date.none.fl_str_mv 2015-06-29
2019-08-14T07:23:26Z
2019-08-14T07:23:26Z
2025-09-09T00:59:28Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/masterThesis
format masterThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv https://hdl.handle.net/1843/ESBF-9XZGNF
url https://hdl.handle.net/1843/ESBF-9XZGNF
dc.language.iso.fl_str_mv por
language por
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv Universidade Federal de Minas Gerais
publisher.none.fl_str_mv Universidade Federal de Minas Gerais
dc.source.none.fl_str_mv reponame:Repositório Institucional da UFMG
instname:Universidade Federal de Minas Gerais (UFMG)
instacron:UFMG
instname_str Universidade Federal de Minas Gerais (UFMG)
instacron_str UFMG
institution UFMG
reponame_str Repositório Institucional da UFMG
collection Repositório Institucional da UFMG
repository.name.fl_str_mv Repositório Institucional da UFMG - Universidade Federal de Minas Gerais (UFMG)
repository.mail.fl_str_mv repositorio@ufmg.br
_version_ 1856414055914274816