Resolução de correferência nominal usando semântica em língua portuguesa (Nominal coreference resolution using semantics in Portuguese)

Fonseca, Evandro Brasil

Bibliographic details
Defense year:	2018
Main author:	Fonseca, Evandro Brasil
Advisor:	Vieira, Renata
Examination committee:	Not provided by the institution
Document type:	Thesis
Access type:	Open access
Language:	por
Defending institution:	Pontifícia Universidade Católica do Rio Grande do Sul
Graduate program:	Programa de Pós-Graduação em Ciência da Computação
Department:	Escola Politécnica
Country:	Brazil
Keywords in Portuguese:	Resolução de Correferência; Extração de Informação
Keywords in English:	Coreference Resolution; Information Extraction
CNPq knowledge area:	CIENCIA DA COMPUTACAO::TEORIA DA COMPUTACAO
Access link:	http://tede2.pucrs.br/tede2/handle/tede/8169
Abstract:	Coreference resolution is a challenging task in Natural Language Processing, given the linguistic knowledge it requires and the sophistication of the language processing techniques involved. Although the task is demanding, its usefulness motivates the study of this phenomenon: several Natural Language Processing tasks may benefit from its results, such as named entity recognition, relation extraction between named entities, summarization, and sentiment analysis, among others. Coreference resolution consists of identifying the terms and expressions that refer to the same entity. For example, in the passage "France is resisting. The country is one of the first in the ranking..." we can say that [the country] is coreferent with [France]. By grouping these referential terms, we form groups of coreferent mentions, more commonly known as coreference chains. This thesis proposes a process for coreference resolution between noun phrases in Portuguese, focusing on the use of semantic knowledge. The proposed approach is based on syntactic-semantic linguistic rules: we combine different levels of linguistic processing, using semantic relations as support, in order to infer referential relations between mentions. Models based on linguistic rules have been applied efficiently to other languages, such as English, Spanish, and Galician. These models prove more efficient than machine learning approaches for languages with fewer resources, since the lack of sample-rich corpora may impair the training of such models. The proposed model is the first for Portuguese coreference resolution that uses semantic knowledge, and we consider this the main contribution of the thesis.
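
The idea of grouping mentions into coreference chains with the help of a semantic relation can be illustrated with a minimal sketch. This is not the thesis's method or code: the `HYPERNYMS` table, the last-token head heuristic, and the function names are all hypothetical, chosen only to reproduce the France / the country example from the abstract in English.

```python
from typing import Dict, List, Set

# Hypothetical toy resource (an assumption, not from the thesis):
# maps a head noun to the set of its hypernyms. A real system would
# query a lexical-semantic resource instead of a hand-written table.
HYPERNYMS: Dict[str, Set[str]] = {
    "france": {"country"},
    "brazil": {"country"},
}


def head(mention: str) -> str:
    """Crude head-word heuristic for the English example: last token, lowercased."""
    return mention.lower().split()[-1]


def semantically_related(a: str, b: str) -> bool:
    """Rule: same head word, or one head is a listed hypernym of the other."""
    ha, hb = head(a), head(b)
    return (
        ha == hb
        or hb in HYPERNYMS.get(ha, set())
        or ha in HYPERNYMS.get(hb, set())
    )


def build_chains(mentions: List[str]) -> List[List[str]]:
    """Attach each mention to the first chain containing a related mention,
    or start a new chain if no existing chain matches."""
    chains: List[List[str]] = []
    for mention in mentions:
        for chain in chains:
            if any(semantically_related(mention, prev) for prev in chain):
                chain.append(mention)
                break
        else:
            chains.append([mention])
    return chains


chains = build_chains(["France", "the ranking", "The country"])
print(chains)
```

With these assumptions, [France] and [The country] end up in one chain because "country" is listed as a hypernym of "france", while [the ranking] starts a chain of its own.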

Item metadata

id	P_RS_343e7a6d1062ba023a79e689e27a50b4
oai_identifier_str	oai:tede2.pucrs.br:tede/8169
network_acronym_str	P_RS
network_name_str	Biblioteca Digital de Teses e Dissertações da PUC_RS
repository_id_str
spelling	Submitted by PPG Ciência da Computação (ppgcc@pucrs.br) on 2018-06-19T11:37:24Z. No. of bitstreams: 1. EVANDRO BRASIL FONSECA_TES.pdf: 1972824 bytes, checksum: 9fca0c499753cd9d2822c59040e826bf (MD5). Approved for entry into archive by Sheila Dias (sheila.dias@pucrs.br) on 2018-06-26T14:40:39Z (GMT). Made available in DSpace on 2018-06-26T14:48:46Z (GMT). Previous issue date: 2018-03-19. The work has no restrictions on publication. Bitstreams: THUMBNAIL, EVANDRO BRASIL FONSECA_TES.pdf.jpg (image/jpeg, 4899 bytes); TEXT, EVANDRO BRASIL FONSECA_TES.pdf.txt (text/plain, 208449 bytes); ORIGINAL, EVANDRO BRASIL FONSECA_TES.pdf (application/pdf, 1972824 bytes); LICENSE, license.txt (text/plain; charset=utf-8, 610 bytes). http://tede2.pucrs.br:80/tede2/retrieve/172616/EVANDRO%20BRASIL%20FONSECA_TES.pdf.jpg https://tede2.pucrs.br/oai/request
dc.title.por.fl_str_mv	Resolução de correferência nominal usando semântica em língua portuguesa
title	Resolução de correferência nominal usando semântica em língua portuguesa
spellingShingle	Resolução de correferência nominal usando semântica em língua portuguesa Fonseca, Evandro Brasil Resolução de Correferência Extração de Informação Coreference Resolution Information Extraction CIENCIA DA COMPUTACAO::TEORIA DA COMPUTACAO
title_short	Resolução de correferência nominal usando semântica em língua portuguesa
title_full	Resolução de correferência nominal usando semântica em língua portuguesa
title_fullStr	Resolução de correferência nominal usando semântica em língua portuguesa
title_full_unstemmed	Resolução de correferência nominal usando semântica em língua portuguesa
title_sort	Resolução de correferência nominal usando semântica em língua portuguesa
author	Fonseca, Evandro Brasil
author_facet	Fonseca, Evandro Brasil
author_role	author
dc.contributor.advisor1.fl_str_mv	Vieira, Renata
dc.contributor.advisor-co1.fl_str_mv	Vanin, Aline Aver
dc.contributor.advisor-co1Lattes.fl_str_mv	http://lattes.cnpq.br/7639784707152839
dc.contributor.authorLattes.fl_str_mv	http://lattes.cnpq.br/3229974637891253
dc.contributor.author.fl_str_mv	Fonseca, Evandro Brasil
contributor_str_mv	Vieira, Renata Vanin, Aline Aver
dc.subject.por.fl_str_mv	Resolução de Correferência Extração de Informação
topic	Resolução de Correferência Extração de Informação Coreference Resolution Information Extraction CIENCIA DA COMPUTACAO::TEORIA DA COMPUTACAO
dc.subject.eng.fl_str_mv	Coreference Resolution Information Extraction
dc.subject.cnpq.fl_str_mv	CIENCIA DA COMPUTACAO::TEORIA DA COMPUTACAO
description	Coreference resolution is a challenging task in Natural Language Processing, given the linguistic knowledge it requires and the sophistication of the language processing techniques involved. Although the task is demanding, its usefulness motivates the study of this phenomenon: several Natural Language Processing tasks may benefit from its results, such as named entity recognition, relation extraction between named entities, summarization, and sentiment analysis, among others. Coreference resolution consists of identifying the terms and expressions that refer to the same entity. For example, in the passage "France is resisting. The country is one of the first in the ranking..." we can say that [the country] is coreferent with [France]. By grouping these referential terms, we form groups of coreferent mentions, more commonly known as coreference chains. This thesis proposes a process for coreference resolution between noun phrases in Portuguese, focusing on the use of semantic knowledge. The proposed approach is based on syntactic-semantic linguistic rules: we combine different levels of linguistic processing, using semantic relations as support, in order to infer referential relations between mentions. Models based on linguistic rules have been applied efficiently to other languages, such as English, Spanish, and Galician. These models prove more efficient than machine learning approaches for languages with fewer resources, since the lack of sample-rich corpora may impair the training of such models. The proposed model is the first for Portuguese coreference resolution that uses semantic knowledge, and we consider this the main contribution of the thesis.
publishDate	2018
dc.date.accessioned.fl_str_mv	2018-06-26T14:48:46Z
dc.date.issued.fl_str_mv	2018-03-19
dc.type.status.fl_str_mv	info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv	info:eu-repo/semantics/doctoralThesis
format	doctoralThesis
status_str	publishedVersion
dc.identifier.uri.fl_str_mv	http://tede2.pucrs.br/tede2/handle/tede/8169
url	http://tede2.pucrs.br/tede2/handle/tede/8169
dc.language.iso.fl_str_mv	por
language	por
dc.relation.program.fl_str_mv	1974996533081274470
dc.relation.confidence.fl_str_mv	500 500
dc.relation.cnpq.fl_str_mv	-862078257083325301
dc.rights.driver.fl_str_mv	info:eu-repo/semantics/openAccess
eu_rights_str_mv	openAccess
dc.format.none.fl_str_mv	application/pdf
dc.publisher.none.fl_str_mv	Pontifícia Universidade Católica do Rio Grande do Sul
dc.publisher.program.fl_str_mv	Programa de Pós-Graduação em Ciência da Computação
dc.publisher.initials.fl_str_mv	PUCRS
dc.publisher.country.fl_str_mv	Brasil
dc.publisher.department.fl_str_mv	Escola Politécnica
publisher.none.fl_str_mv	Pontifícia Universidade Católica do Rio Grande do Sul
dc.source.none.fl_str_mv	reponame:Biblioteca Digital de Teses e Dissertações da PUC_RS instname:Pontifícia Universidade Católica do Rio Grande do Sul (PUCRS) instacron:PUC_RS
instname_str	Pontifícia Universidade Católica do Rio Grande do Sul (PUCRS)
instacron_str	PUC_RS
institution	PUC_RS
reponame_str	Biblioteca Digital de Teses e Dissertações da PUC_RS
collection	Biblioteca Digital de Teses e Dissertações da PUC_RS
bitstream.url.fl_str_mv	http://tede2.pucrs.br/tede2/bitstream/tede/8169/4/EVANDRO+BRASIL+FONSECA_TES.pdf.jpg http://tede2.pucrs.br/tede2/bitstream/tede/8169/3/EVANDRO+BRASIL+FONSECA_TES.pdf.txt http://tede2.pucrs.br/tede2/bitstream/tede/8169/2/EVANDRO+BRASIL+FONSECA_TES.pdf http://tede2.pucrs.br/tede2/bitstream/tede/8169/1/license.txt
bitstream.checksum.fl_str_mv	d7fa51000ab126c04f3d0dea38dd68f4 0da35164ce29c1637605f29c70d29c6b 9fca0c499753cd9d2822c59040e826bf 5a9d6006225b368ef605ba16b4f6d1be
bitstream.checksumAlgorithm.fl_str_mv	MD5 MD5 MD5 MD5
repository.name.fl_str_mv	Biblioteca Digital de Teses e Dissertações da PUC_RS - Pontifícia Universidade Católica do Rio Grande do Sul (PUCRS)
repository.mail.fl_str_mv	biblioteca.central@pucrs.br
_version_	1796793234544918528
