Exploiting entities for query expansion

Bibliographic details
Year of defense: 2013
Main author: Wladmir Cardoso Brandao
Advisor: Not provided by the institution
Defense committee: Not provided by the institution
Document type: Thesis
Access type: Open access
Language: eng
Degree-granting institution: Universidade Federal de Minas Gerais
Graduate program: Not provided by the institution
Department: Not provided by the institution
Country: Not provided by the institution
Keywords in Portuguese:
Access link: https://hdl.handle.net/1843/ESBF-9GMJW2
Abstract: A substantial fraction of web search queries contain references to entities, such as persons, organizations, and locations. In this work, we propose entity-oriented query expansion approaches that exploit semantic sources of evidence to devise discriminative term features, together with machine learning techniques that effectively combine these features to rank candidate expansion terms. In particular, our unsupervised approach (UQEE) uses taxonomic features derived from the semantic structure implicitly provided by infobox templates, while our learning-to-rank approach (L2EE) considers semantic evidence encoded in the content of Wikipedia article fields to automatically label training examples in proportion to their observed retrieval effectiveness. Lastly, we propose a self-supervised approach to autonomously generate infoboxes for Wikipedia articles (WAVE). Experiments attest to the effectiveness of our approaches, with significant gains compared to state-of-the-art PRF and ePRF approaches.
id UFMG_7cf1e7f8f8837f89a7bf4f6a1e16c0bc
oai_identifier_str oai:repositorio.ufmg.br:1843/ESBF-9GMJW2
network_acronym_str UFMG
network_name_str Repositório Institucional da UFMG
repository_id_str
dc.title.none.fl_str_mv Exploiting entities for query expansion
author Wladmir Cardoso Brandao
author_facet Wladmir Cardoso Brandao
author_role author
dc.contributor.author.fl_str_mv Wladmir Cardoso Brandao
dc.subject.por.fl_str_mv Aprendizado por computador
Computação
Sistemas de recuperação de informação
Expansão de consultas
Aprendizagem para ranqueamento
Wikipédia
Feedback de relevância
Reconhecimento de entidades
publishDate 2013
dc.date.none.fl_str_mv 2013-11-18
2019-08-10T07:19:13Z
2019-08-10T07:19:13Z
2025-09-09T00:08:22Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/doctoralThesis
format doctoralThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv https://hdl.handle.net/1843/ESBF-9GMJW2
url https://hdl.handle.net/1843/ESBF-9GMJW2
dc.language.iso.fl_str_mv eng
language eng
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv Universidade Federal de Minas Gerais
publisher.none.fl_str_mv Universidade Federal de Minas Gerais
dc.source.none.fl_str_mv reponame:Repositório Institucional da UFMG
instname:Universidade Federal de Minas Gerais (UFMG)
instacron:UFMG
instname_str Universidade Federal de Minas Gerais (UFMG)
instacron_str UFMG
institution UFMG
reponame_str Repositório Institucional da UFMG
collection Repositório Institucional da UFMG
repository.name.fl_str_mv Repositório Institucional da UFMG - Universidade Federal de Minas Gerais (UFMG)
repository.mail.fl_str_mv repositorio@ufmg.br
_version_ 1856413969405706240