Exploiting entities for query expansion
| Year of defense: | 2013 |
|---|---|
| Main author: | Wladmir Cardoso Brandao |
| Advisor: | |
| Defense committee: | |
| Document type: | Thesis |
| Access type: | Open access |
| Language: | eng |
| Defending institution: | Universidade Federal de Minas Gerais |
| Graduate program: | Not provided by the institution |
| Department: | Not provided by the institution |
| Country: | Not provided by the institution |
| Keywords in Portuguese: | |
| Access link: | https://hdl.handle.net/1843/ESBF-9GMJW2 |
| Abstract: | A substantial fraction of web search queries contain references to entities, such as persons, organizations, and locations. In this work, we propose entity-oriented query expansion approaches that exploit semantic sources of evidence to devise discriminative term features, and machine learning techniques that effectively combine these features to rank candidate expansion terms. In particular, our unsupervised approach (UQEE) uses taxonomic features derived from the semantic structure implicitly provided by infobox templates, while our learning to rank approach (L2EE) considers semantic evidence encoded in the content of Wikipedia article fields to automatically label training examples proportionally to their observed retrieval effectiveness. Lastly, we propose a self-supervised approach to autonomously generate infoboxes for Wikipedia articles (WAVE). Experiments attest to the effectiveness of our approaches, with significant gains compared to state-of-the-art PRF and ePRF approaches. |
| id | UFMG_7cf1e7f8f8837f89a7bf4f6a1e16c0bc |
|---|---|
| oai_identifier_str | oai:repositorio.ufmg.br:1843/ESBF-9GMJW2 |
| network_acronym_str | UFMG |
| network_name_str | Repositório Institucional da UFMG |
| repository_id_str | |
| dc.title.none.fl_str_mv | Exploiting entities for query expansion |
| author | Wladmir Cardoso Brandao |
| author_facet | Wladmir Cardoso Brandao |
| author_role | author |
| dc.contributor.author.fl_str_mv | Wladmir Cardoso Brandao |
| dc.subject.por.fl_str_mv | Aprendizado por computador; Computação; Sistemas de recuperação de informação; Expansão de consultas; Aprendizagem para ranqueamento; Wikipédia; Feedback de relevância; Reconhecimento de entidades |
| topic | Aprendizado por computador; Computação; Sistemas de recuperação de informação; Expansão de consultas; Aprendizagem para ranqueamento; Wikipédia; Feedback de relevância; Reconhecimento de entidades |
| publishDate | 2013 |
| dc.date.none.fl_str_mv | 2013-11-18 2019-08-10T07:19:13Z 2019-08-10T07:19:13Z 2025-09-09T00:08:22Z |
| dc.type.status.fl_str_mv | info:eu-repo/semantics/publishedVersion |
| dc.type.driver.fl_str_mv | info:eu-repo/semantics/doctoralThesis |
| format | doctoralThesis |
| status_str | publishedVersion |
| dc.identifier.uri.fl_str_mv | https://hdl.handle.net/1843/ESBF-9GMJW2 |
| url | https://hdl.handle.net/1843/ESBF-9GMJW2 |
| dc.language.iso.fl_str_mv | eng |
| language | eng |
| dc.rights.driver.fl_str_mv | info:eu-repo/semantics/openAccess |
| eu_rights_str_mv | openAccess |
| dc.format.none.fl_str_mv | application/pdf |
| dc.publisher.none.fl_str_mv | Universidade Federal de Minas Gerais |
| publisher.none.fl_str_mv | Universidade Federal de Minas Gerais |
| dc.source.none.fl_str_mv | reponame:Repositório Institucional da UFMG instname:Universidade Federal de Minas Gerais (UFMG) instacron:UFMG |
| instname_str | Universidade Federal de Minas Gerais (UFMG) |
| instacron_str | UFMG |
| institution | UFMG |
| reponame_str | Repositório Institucional da UFMG |
| collection | Repositório Institucional da UFMG |
| repository.name.fl_str_mv | Repositório Institucional da UFMG - Universidade Federal de Minas Gerais (UFMG) |
| repository.mail.fl_str_mv | repositorio@ufmg.br |
| _version_ | 1856413969405706240 |