Integração de informações de usuários para inferência de localização geográfica no twitter
| Defense year: | 2015 |
|---|---|
| Main author: | |
| Advisor: | |
| Examination committee: | |
| Document type: | Master's thesis |
| Access type: | Open access |
| Language: | por |
| Defending institution: | Universidade Federal de Minas Gerais |
| Graduate program: | Not provided by the institution |
| Department: | Not provided by the institution |
| Country: | Not provided by the institution |
| Keywords in Portuguese: | |
| Access link: | https://hdl.handle.net/1843/ESBF-9XZGNF |
Abstract: | Knowing the location of a social-network user is essential to many applications that rely on information extracted from those networks, such as detecting events and tracking epidemics. In this work, we investigate the task of inferring the location of Twitter users by recognizing that it may be encoded not only in their GPS-tagged messages, but also in the text of their messages, their profile information, and the network of users they interact with. While previous works exploit those different sources of information, no single technique is accepted by consensus as the best; instead there is a range of techniques, each with specific pros and cons. The sparsity of users in a geographic area may challenge many inference methods, because there is not enough data to characterize the area. We therefore begin by discussing how to overcome the sparsity of users in many geographic regions. Grouping and merging locations is one of the main approaches to this challenge, although it gives up some precision in the inferred location. We propose a metric to evaluate the quality of the generated clusters and present a new strategy to group cities. Previous works have used different performance metrics to evaluate their geoinference methods, making it difficult to compare them directly. We use standardized metrics on the same dataset to compare methods that use the users' social network, their publications, and the information provided voluntarily in their profiles. We evaluate different network-based methods and compare their performance on two types of user network: the friendship network and the mentions network. We highlight the differences and similarities between those two networks in the geoinference task.
After showing how vocabulary use in the stream of publications changes over time, we propose a new method of inferring a user's location based not only on the text but also on the publication time of her messages. We achieve better precision than similar non-temporal approaches from the literature. We also evaluate the performance of using the profile fields provided by users and of periodically updating the models created to infer the location of new users. Finally, we evaluate the benefits of combining the information output by each method to improve the final performance of the geoinference task. |
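
The abstract mentions network-based inference over a friendship network and a mentions network without naming a specific algorithm. As a rough illustration only, the sketch below uses a plain neighbor majority vote, which is an assumed baseline and not necessarily one of the methods evaluated in the dissertation. The function name `infer_by_neighbor_vote`, the undirected-edge representation, the alphabetical tie-break, and the toy users and cities are all hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, Optional, Set, Tuple

Edge = Tuple[str, str]  # assumption: an undirected link between two user ids


def infer_by_neighbor_vote(
    edges: Iterable[Edge], known: Dict[str, str]
) -> Dict[str, Optional[str]]:
    """Give each unlabeled user the most common location among labeled neighbors.

    Ties are broken alphabetically so the result is deterministic.
    Users with no labeled neighbor map to None.
    """
    # Build an undirected adjacency map from the edge list.
    neighbors: Dict[str, Set[str]] = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    inferred: Dict[str, Optional[str]] = {}
    for user, adjacent in neighbors.items():
        if user in known:
            continue  # this user already has a location
        votes = Counter(known[n] for n in adjacent if n in known)
        if votes:
            # Highest count first, then alphabetical order of the location name.
            inferred[user] = min(votes.items(), key=lambda kv: (-kv[1], kv[0]))[0]
        else:
            inferred[user] = None
    return inferred


# Hypothetical toy data: the same user seen through two different networks.
known = {"bia": "Belo Horizonte", "caio": "Belo Horizonte", "davi": "Rio de Janeiro"}
friendship = [("ana", "bia"), ("ana", "caio"), ("ana", "davi")]
mentions = [("ana", "davi")]

print(infer_by_neighbor_vote(friendship, known))
print(infer_by_neighbor_vote(mentions, known))
```

In this toy setup the two networks disagree about the same user, which is the kind of difference the abstract says is compared between the friendship and mentions networks.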
| id |
UFMG_546e7ebe80ceeb22a93e49d8eddccaa8 |
|---|---|
| oai_identifier_str |
oai:repositorio.ufmg.br:1843/ESBF-9XZGNF |
| network_acronym_str |
UFMG |
| network_name_str |
Repositório Institucional da UFMG |
| repository_id_str |
|
| spelling |
2019-08-14T07:23:26Z; 2025-09-09T00:59:28Z; 2019-08-14T07:23:26Z; 2015-06-29; https://hdl.handle.net/1843/ESBF-9XZGNF; Universidade Federal de Minas Gerais; geoinferência; redes sociais; Twitter; Computação; Twitter; Redes de relações sociais; Integração de informações de usuários para inferência de localização geográfica no twitter; info:eu-repo/semantics/publishedVersion; info:eu-repo/semantics/masterThesis; Silvio Soares Ribeiro Junior; info:eu-repo/semantics/openAccess; por; reponame:Repositório Institucional da UFMG; instname:Universidade Federal de Minas Gerais (UFMG); instacron:UFMG; Gisele Lobo Pappa; Clodoveu Augusto Davis Junior; Fabricio Benevenuto de Souza.
Abstract (translated from the Portuguese): The location of a social-network user is essential for applications that use information from those networks, such as detecting events and epidemics. In this work, we investigate the task of inferring the location of a Twitter user from their tweets, profile information, and the users they interact with. We first discuss how to work around the scarcity of users in certain regions, which can make geoinference harder. We evaluate existing network-based location inference methods and compare their results on two different networks: the user's friendship network and communication network. After showing that the vocabulary of publications changes over time, we increase the precision of text-based inferences from tweets by taking publication dates into account. Finally, we use the user's profile information and show the benefits of combining the results of the studied methods.
UFMG; ORIGINAL; silviosoares.pdf; application/pdf; 4936905; https://repositorio.ufmg.br//bitstreams/a955f917-1ddd-4b62-80c0-41b65734f92c/download; 3fd54cff9a8f04c51521473eb915a1ba; MD5; 1; true; Anonymous; READ; TEXT; silviosoares.pdf.txt; text/plain; 200863; https://repositorio.ufmg.br//bitstreams/ce3b6048-137a-4262-b1b5-a3ab6c094780/download; 5e290e4b0754fa8e25e795537b66ab89; MD5; 2; false; Anonymous; READ; 1843/ESBF-9XZGNF; 2025-09-08 21:59:28.811; open.access; oai:repositorio.ufmg.br:1843/ESBF-9XZGNF; https://repositorio.ufmg.br/; Repositório Institucional; PUB; https://repositorio.ufmg.br/oai; repositorio@ufmg.br; opendoar:2025-09-09T00:59:28; Repositório Institucional da UFMG - Universidade Federal de Minas Gerais (UFMG); false |
| dc.title.none.fl_str_mv |
Integração de informações de usuários para inferência de localização geográfica no twitter |
| title |
Integração de informações de usuários para inferência de localização geográfica no twitter |
| spellingShingle |
Integração de informações de usuários para inferência de localização geográfica no twitter Silvio Soares Ribeiro Junior Computação Redes de relações sociais geoinferência redes sociais |
| author |
Silvio Soares Ribeiro Junior |
| author_facet |
Silvio Soares Ribeiro Junior |
| author_role |
author |
| dc.contributor.author.fl_str_mv |
Silvio Soares Ribeiro Junior |
| dc.subject.por.fl_str_mv |
Computação Redes de relações sociais |
| topic |
Computação Redes de relações sociais geoinferência redes sociais |
| dc.subject.other.none.fl_str_mv |
geoinferência redes sociais |
| publishDate |
2015 |
| dc.date.issued.fl_str_mv |
2015-06-29 |
| dc.date.accessioned.fl_str_mv |
2019-08-14T07:23:26Z 2025-09-09T00:59:28Z |
| dc.date.available.fl_str_mv |
2019-08-14T07:23:26Z |
| dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
| dc.type.driver.fl_str_mv |
info:eu-repo/semantics/masterThesis |
| format |
masterThesis |
| status_str |
publishedVersion |
| dc.identifier.uri.fl_str_mv |
https://hdl.handle.net/1843/ESBF-9XZGNF |
| url |
https://hdl.handle.net/1843/ESBF-9XZGNF |
| dc.language.iso.fl_str_mv |
por |
| language |
por |
| dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
| eu_rights_str_mv |
openAccess |
| dc.publisher.none.fl_str_mv |
Universidade Federal de Minas Gerais |
| publisher.none.fl_str_mv |
Universidade Federal de Minas Gerais |
| dc.source.none.fl_str_mv |
reponame:Repositório Institucional da UFMG instname:Universidade Federal de Minas Gerais (UFMG) instacron:UFMG |
| instname_str |
Universidade Federal de Minas Gerais (UFMG) |
| instacron_str |
UFMG |
| institution |
UFMG |
| reponame_str |
Repositório Institucional da UFMG |
| collection |
Repositório Institucional da UFMG |
| bitstream.url.fl_str_mv |
https://repositorio.ufmg.br//bitstreams/a955f917-1ddd-4b62-80c0-41b65734f92c/download https://repositorio.ufmg.br//bitstreams/ce3b6048-137a-4262-b1b5-a3ab6c094780/download |
| bitstream.checksum.fl_str_mv |
3fd54cff9a8f04c51521473eb915a1ba 5e290e4b0754fa8e25e795537b66ab89 |
| bitstream.checksumAlgorithm.fl_str_mv |
MD5 MD5 |
| repository.name.fl_str_mv |
Repositório Institucional da UFMG - Universidade Federal de Minas Gerais (UFMG) |
| repository.mail.fl_str_mv |
repositorio@ufmg.br |
| _version_ |
1862105907695976448 |