Recognition and Linking of Product Mentions in User-generated Contents

Bibliographic details
Year of defense: 2018
Main author: Vieira, Henry Silva
Author's Lattes CV: http://lattes.cnpq.br/7635217586028802
Advisor: Not provided by the institution
Examination committee: Not provided by the institution
Document type: Thesis
Access type: Open access
Language: eng
Defending institution: Universidade Federal do Amazonas
Instituto de Computação
Brasil
UFAM
Programa de Pós-graduação em Informática
Graduate program: Not provided by the institution
Department: Not provided by the institution
Country: Not provided by the institution
Keywords in Portuguese:
Access link: https://tede.ufam.edu.br/handle/tede/6686
Abstract: Online social media has become an essential part of daily life. Through these media, users exchange information that they generate using many different communication mechanisms. In this context, more and more users pass on and trust information published by other users on a wide variety of topics, including opinions and information about products. Automatically extracting and processing user-generated information from social media can provide relevant information and knowledge to a variety of applications. In particular, opinion mining is one of the content analysis techniques most often applied to social media. One of the basic tasks associated with opinion mining is extracting and categorizing target entities, i.e., identifying entity mentions in text and linking these mentions to the unique real-world entities about which the opinions are expressed. In our work, we focus on target entities of a specific, and currently relevant, type: consumer electronic products. Such products are the main subject of opinions posted by users in numerous posts on discussion forums and retail sites across the Web. In this work, we are interested in using the unstructured textual content generated by social media users to continuously enrich the knowledge about products represented in product catalogs. Therefore, the task we address here is how to recognize mentions of products in user-generated textual content and link each mention to the catalog product it refers to. We claim that two basic sub-tasks arise: first, extraction of target entity mentions from unstructured textual content; second, disambiguation of the extracted mentions, i.e., linking them to their real-world counterparts. In this work, we developed methods to address these two sub-tasks. This thesis details these tasks, discusses the ideas behind the methods we developed, and presents our contributions and results towards this goal.
id UFAM_dffb2fe8ecf033ac29c899ad038b7d50
oai_identifier_str oai:https://tede.ufam.edu.br/handle/:tede/6686
network_acronym_str UFAM
network_name_str Biblioteca Digital de Teses e Dissertações da UFAM
repository_id_str
spelling CAPES - Coordenação de Aperfeiçoamento de Pessoal de Nível Superior
FAPEAM - Fundação de Amparo à Pesquisa do Estado do Amazonas
2018-10-17T05:03:26Z
http://200.129.163.131:8080/PUB
http://200.129.163.131:8080/oai/request
opendoar:6592
dc.title.none.fl_str_mv Recognition and Linking of Product Mentions in User-generated Contents
author Vieira, Henry Silva
dc.contributor.none.fl_str_mv Silva, Altigran Soares da
http://lattes.cnpq.br/3405503472010994
Moura, Edleno Silva de
Calado, Pável Pereira
Marinho, Leandro Balby
dc.contributor.author.fl_str_mv Vieira, Henry Silva
http://lattes.cnpq.br/7635217586028802
dc.subject.por.fl_str_mv Opinion mining
Extracting target entities
Categorizing target entities
Product recognition
Product linking
CIÊNCIAS EXATAS E DA TERRA: CIÊNCIA DA COMPUTAÇÃO
publishDate 2018
dc.date.none.fl_str_mv 2018-10-16T17:41:31Z
2018-09-25
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/doctoralThesis
format doctoralThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv VIEIRA, Henry Silva. Recognition and Linking of Product Mentions in User-generated Contents. 2018. 127 f. Tese (Doutorado em Informática) - Universidade Federal do Amazonas, Manaus, 2018.
https://tede.ufam.edu.br/handle/tede/6686
dc.language.iso.fl_str_mv eng
language eng
dc.rights.driver.fl_str_mv http://creativecommons.org/licenses/by/4.0/
info:eu-repo/semantics/openAccess
rights_invalid_str_mv http://creativecommons.org/licenses/by/4.0/
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv Universidade Federal do Amazonas
Instituto de Computação
Brasil
UFAM
Programa de Pós-graduação em Informática
dc.source.none.fl_str_mv reponame:Biblioteca Digital de Teses e Dissertações da UFAM
instname:Universidade Federal do Amazonas (UFAM)
instacron:UFAM
repository.name.fl_str_mv Biblioteca Digital de Teses e Dissertações da UFAM - Universidade Federal do Amazonas (UFAM)
repository.mail.fl_str_mv ddbc@ufam.edu.br
_version_ 1797040497711120384