SIM: um modelo semântico inferencialista para expressão e raciocínio em sistemas de linguagem natural

Bibliographic details
Year of defense: 2010
Main author: Pinheiro, Vládia Célia Monteiro
Advisor: Pequeno, Tarcisio Haroldo Cavalcante
Examination committee: Not provided by the institution
Document type: Thesis
Access type: Open access
Language: por
Defending institution: Not provided by the institution
Graduate program: Not provided by the institution
Department: Not provided by the institution
Country: Not provided by the institution
Keywords in Portuguese: Processamento de linguagem natural (Computação); Semântica; Inferência (Lógica)
Access link: http://www.repositorio.ufc.br/handle/riufc/61242
Abstract: Computer processing of natural language is one of the major challenges in Artificial Intelligence. There is an urgent need for computational systems that help us process the overwhelming quantity of unstructured natural-language information. Driven by this need, we studied several philosophies of language in order to understand what the meaning of a linguistic expression consists of or, at least, which of the answers that Philosophy offers allows computer systems to address these needs and challenges effectively. For representationalist theories, concepts are the things or states of affairs represented by words and linguistic expressions. This yields the idea of a single, regular world: an inductive world. Pragmatic philosophies, such as that of the later Wittgenstein, define a new concept of "concept": the content of a concept consists of its various uses in language games. Concepts can be defined and understood only through the practice of applying them in situations of language use. Inferentialist philosophers such as Sellars, Dummett and Brandom take up again the project of a semantic theory, narrowing the pragmatic view to the uses of concepts in rational games, where the inferential component is the most important. In the second strand of this research, we studied linguistic resources and systems from Computational Linguistics (CL) and Natural Language Processing (NLP) in order to outline a conceptual framework for how ordinary systems express and compute meanings. The Computational Semantics model, traditional in NLP, holds that (i) a representation of the world is sufficient to define the semantic value of terms and sentences in natural language; and (ii) there is an equivalence between the syntax and the semantics of natural languages.
As a result of this research, we propose a new computational model for expression and semantic reasoning in natural language systems: the Semantic Inferentialism Model (SIM). SIM is based on Robert Brandom's Semantic Inferentialism Theory. We argue that, in order to answer questions, extract and retrieve information, refute arguments, justify answers, or explain a text, among many other applications, a computational system must (i) express the inferential content of concepts and sentences (the pre-conditions and post-conditions of their use); and (ii) manipulate this inferential content within the flow of reasoning. The semantic bases of SIM were built and form the first large-scale linguistic resource with inferentialist content for the Portuguese language, InferenceNet.BR. The semantic reasoning algorithm (SIA) and the InferenceNet.BR resource were applied in a real system for extracting information about crimes, the WikiCrimes Information Extractor (WikiCrimesIE). The extraction results were evaluated and surpassed the state of the art of the main systems that perform natural language understanding tasks, including those for English. The greatest legacy of this research is to have planted a seed for research in Inferentialist Computational Semantics, which we believe will open a more solid and fruitful evolutionary path for the field of Computational Linguistics.
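The two requirements stated in the abstract, expressing pre-conditions and post-conditions of use and manipulating them during reasoning, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `Concept` class, the `infer` function and the toy entries are hypothetical and are not the structure of InferenceNet.BR or the SIA algorithm, neither of which this record describes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    """A concept described by its inferential role (hypothetical structure)."""
    name: str
    preconditions: frozenset = frozenset()   # what must hold for the concept to be used
    postconditions: frozenset = frozenset()  # what follows once the concept is used


# Toy, hand-written entries; not taken from InferenceNet.BR.
TOY_BASE = {
    "murder": Concept(
        "murder",
        preconditions=frozenset({"victim was alive"}),
        postconditions=frozenset({"victim is dead", "crime"}),
    ),
    "crime": Concept(
        "crime",
        preconditions=frozenset({"law exists"}),
        postconditions=frozenset({"perpetrator may be prosecuted"}),
    ),
}


def infer(used, base):
    """Return every claim reachable from the concepts used in a text."""
    claims = set()
    frontier = list(used)
    while frontier:
        concept = base.get(frontier.pop())
        if concept is None:
            continue  # a plain claim with no entry of its own
        for claim in concept.preconditions | concept.postconditions:
            if claim not in claims:
                claims.add(claim)
                frontier.append(claim)  # a claim may itself name a concept
    return claims


print(sorted(infer({"murder"}, TOY_BASE)))
```

In this toy setting, a text that uses "murder" licenses claims that no lookup of the word's referent would return, such as the victim having been alive beforehand, which is the kind of content the abstract says a system must be able to express and manipulate.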
id UFC-7_33abd068183ca950805662ec6f2a521e
oai_identifier_str oai:repositorio.ufc.br:riufc/61242
network_acronym_str UFC-7
network_name_str Repositório Institucional da Universidade Federal do Ceará (UFC)
repository_id_str
dc.title.pt_BR.fl_str_mv SIM: um modelo semântico inferencialista para expressão e raciocínio em sistemas de linguagem natural
title SIM: um modelo semântico inferencialista para expressão e raciocínio em sistemas de linguagem natural
author Pinheiro, Vládia Célia Monteiro
author_facet Pinheiro, Vládia Célia Monteiro
author_role author
dc.contributor.co-advisor.none.fl_str_mv Furtado, João José Vasco Peixoto
dc.contributor.author.fl_str_mv Pinheiro, Vládia Célia Monteiro
dc.contributor.advisor1.fl_str_mv Pequeno, Tarcisio Haroldo Cavalcante
contributor_str_mv Pequeno, Tarcisio Haroldo Cavalcante
dc.subject.por.fl_str_mv Processamento de linguagem natural (Computação)
Semântica
Inferência (Lógica)
topic Processamento de linguagem natural (Computação)
Semântica
Inferência (Lógica)
publishDate 2010
dc.date.issued.fl_str_mv 2010
dc.date.accessioned.fl_str_mv 2021-10-15T14:43:27Z
dc.date.available.fl_str_mv 2021-10-15T14:43:27Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/doctoralThesis
format doctoralThesis
status_str publishedVersion
dc.identifier.citation.fl_str_mv PINHEIRO, Vládia Célia Monteiro. SIM: um modelo semântico inferencialista para expressão e raciocínio em sistemas de linguagem natural. 2010. 229 f. Tese (Doutorado em Ciência da Computação) - Universidade Federal do Ceará, Fortaleza, 2010.
dc.identifier.uri.fl_str_mv http://www.repositorio.ufc.br/handle/riufc/61242
identifier_str_mv PINHEIRO, Vládia Célia Monteiro. SIM: um modelo semântico inferencialista para expressão e raciocínio em sistemas de linguagem natural. 2010. 229 f. Tese (Doutorado em Ciência da Computação) - Universidade Federal do Ceará, Fortaleza, 2010.
url http://www.repositorio.ufc.br/handle/riufc/61242
dc.language.iso.fl_str_mv por
language por
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.source.none.fl_str_mv reponame:Repositório Institucional da Universidade Federal do Ceará (UFC)
instname:Universidade Federal do Ceará (UFC)
instacron:UFC
instname_str Universidade Federal do Ceará (UFC)
instacron_str UFC
institution UFC
reponame_str Repositório Institucional da Universidade Federal do Ceará (UFC)
collection Repositório Institucional da Universidade Federal do Ceará (UFC)
bitstream.url.fl_str_mv http://repositorio.ufc.br/bitstream/riufc/61242/2/license.txt
http://repositorio.ufc.br/bitstream/riufc/61242/1/2010_tese_vcmpinheiro.pdf
bitstream.checksum.fl_str_mv 8a4605be74aa9ea9d79846c1fba20a33
fb84f2064158f2f49dfce06f5b80ef1b
bitstream.checksumAlgorithm.fl_str_mv MD5
MD5
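A downloaded copy of the thesis can be checked against the MD5 checksum listed in this record. The sketch below is a hedged illustration: it assumes the checksums are listed in the same order as the bitstream URLs, so that the second value belongs to 2010_tese_vcmpinheiro.pdf, and the function names `md5_of_file` and `matches_record` are hypothetical helpers, not part of any repository software.

```python
import hashlib

# Second checksum in this record, assumed to belong to 2010_tese_vcmpinheiro.pdf.
EXPECTED_PDF_MD5 = "fb84f2064158f2f49dfce06f5b80ef1b"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the hexadecimal MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_record(path, expected=EXPECTED_PDF_MD5):
    """Return True if the file's MD5 digest equals the expected checksum."""
    return md5_of_file(path) == expected.lower()
```

MD5 is adequate here only as a check against transfer corruption; it is not a defense against deliberate tampering.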
repository.name.fl_str_mv Repositório Institucional da Universidade Federal do Ceará (UFC) - Universidade Federal do Ceará (UFC)
repository.mail.fl_str_mv bu@ufc.br || repositorio@ufc.br
_version_ 1847793289726525440