Legalese and suits: uma proposta de glossário bilíngue de Colocações especializadas baseado em corpus

Bibliographic details
Year of defense: 2025
Main author: Vaz, Eurico Mayer
Advisor: Cardoso, Lídia Amélia de Barros
Examination committee: Not provided by the institution
Document type: Master's thesis
Access type: Open access
Language: por
Defending institution: Not provided by the institution
Graduate program: Not provided by the institution
Department: Not provided by the institution
Country: Not provided by the institution
CNPq knowledge area:
Access link: http://repositorio.ufc.br/handle/riufc/80057
Abstract: This research explores the extraction of specialized collocations and their syntactic-morphological and lexical-semantic analysis, with the aim of creating a corpus-based bilingual glossary of specialized collocations in American English and Brazilian Portuguese. Its ultimate goal is to help language users navigate the specificities of legal terminology and to explore the lexicographic approach to specialized language. The specialized collocations were extracted from a legal English corpus made up of the subtitles of 134 episodes of the North American TV series “Suits”, named “Corpus Suits” (CS), which was analyzed with the software Sketch Engine (Kilgarriff et al., 2014). The comparable corpus English Web 2021 (enTenTen21), provided by the same software, was chosen to find evidence of usage and co-occurrence of the collocations beyond the scope of CS. The two corpora were contrasted with Sketch Engine's Keywords tool to produce a list of words that may represent specific terminology, ordered by keyness score. Fifteen nodes were selected from the keyword list for further analysis with the Word Sketch tool, in search of phraseological units with a high typicality score, which suggests that they may be specialized collocations. Once the 42 candidate specialized collocations had been organized by their frequency in the analyzed corpora, the Concordance tool was used to help understand the contexts in which they occur in the corpus. The context of usage then guided the classification of each of the nineteen selected collocations according to Hausmann's taxonomy (1985), feeding a table that records each collocation's frequency in each corpus, typicality score, syntactic-morphological formation and taxonomic classification.
The table serves as input for analyzing which classifications predominate in this sample of legal English, as well as for building a bilingual glossary of specialized collocations, based on the methodological approach of Faulstich (2011) and on the analysis of a corpus of fifteen lexicographic products focused on legal terminology selected from the web. Furthermore, because the corpus consists of a cultural product, the research narrows the gap between academia and the community, whom these results should serve first and foremost. This research is anchored in the theoretical and methodological framework of corpus linguistics, lexicography and phraseology, especially with regard to collocations and specialized collocations.
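The abstract does not state which formulas lie behind the keyness and typicality scores it mentions. As an illustration only, the sketch below assumes the "simple maths" keyness ratio and the logDice association score that Sketch Engine documents for its Keywords and Word Sketch tools. The function names and all counts are hypothetical and are not taken from the thesis.

```python
import math


def per_million(freq: int, corpus_size: int) -> float:
    """Normalize a raw frequency to occurrences per million tokens."""
    return freq * 1_000_000 / corpus_size


def simple_maths_keyness(freq_focus: int, size_focus: int,
                         freq_ref: int, size_ref: int,
                         n: float = 1.0) -> float:
    """Keyness as the ratio of smoothed per-million frequencies.

    Assumption: the 'simple maths' formula, with smoothing constant n.
    A score above 1 means the word is relatively more frequent in the
    focus corpus than in the reference corpus.
    """
    focus = per_million(freq_focus, size_focus)
    ref = per_million(freq_ref, size_ref)
    return (focus + n) / (ref + n)


def log_dice(f_xy: int, f_x: int, f_y: int) -> float:
    """logDice association score for a node x and a collocate y.

    f_xy is the co-occurrence count; f_x and f_y are the individual
    frequencies. The theoretical maximum is 14.
    """
    return 14 + math.log2(2 * f_xy / (f_x + f_y))


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    print(simple_maths_keyness(100, 1_000_000, 10, 1_000_000))
    print(log_dice(10, 40, 40))
```

Ranking candidate nodes by the first score and their collocates by the second mirrors, in a simplified way, the two-step selection the abstract describes (keywords first, then typical collocates).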
id UFC-7_7a6960c2e4e50f1cc441dd021fda374c
oai_identifier_str oai:repositorio.ufc.br:riufc/80057
network_acronym_str UFC-7
network_name_str Repositório Institucional da Universidade Federal do Ceará (UFC)
repository_id_str
dc.title.pt_BR.fl_str_mv Legalese and suits: uma proposta de glossário bilíngue de Colocações especializadas baseado em corpus
title Legalese and suits: uma proposta de glossário bilíngue de Colocações especializadas baseado em corpus
author Vaz, Eurico Mayer
author_facet Vaz, Eurico Mayer
author_role author
dc.contributor.author.fl_str_mv Vaz, Eurico Mayer
dc.contributor.advisor1.fl_str_mv Cardoso, Lídia Amélia de Barros
contributor_str_mv Cardoso, Lídia Amélia de Barros
dc.subject.cnpq.fl_str_mv CNPQ::LINGUISTICA, LETRAS E ARTES::LINGUISTICA
topic CNPQ::LINGUISTICA, LETRAS E ARTES::LINGUISTICA
Linguística de corpus
Lexicografia
Fraseologia
Colocações especializadas
Corpus linguistics
Lexicography
Phraseology
Specialized collocations
dc.subject.ptbr.pt_BR.fl_str_mv Linguística de corpus
Lexicografia
Fraseologia
Colocações especializadas
dc.subject.en.pt_BR.fl_str_mv Corpus linguistics
Lexicography
Phraseology
Specialized collocations
publishDate 2025
dc.date.accessioned.fl_str_mv 2025-03-14T13:14:19Z
dc.date.available.fl_str_mv 2025-03-14T13:14:19Z
dc.date.issued.fl_str_mv 2025
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/masterThesis
format masterThesis
status_str publishedVersion
dc.identifier.citation.fl_str_mv VAZ, Eurico Mayer. Legalese and suits: uma proposta de glossário bilíngue de colocações especializadas baseado em corpus. 2025. 146 f. Dissertação (Mestrado em Linguística) - Programa de Pós-graduação em Linguística, Centro de Humanidades, Universidade Federal do Ceará, Fortaleza, 2025.
dc.identifier.uri.fl_str_mv http://repositorio.ufc.br/handle/riufc/80057
identifier_str_mv VAZ, Eurico Mayer. Legalese and suits: uma proposta de glossário bilíngue de colocações especializadas baseado em corpus. 2025. 146 f. Dissertação (Mestrado em Linguística) - Programa de Pós-graduação em Linguística, Centro de Humanidades, Universidade Federal do Ceará, Fortaleza, 2025.
url http://repositorio.ufc.br/handle/riufc/80057
dc.language.iso.fl_str_mv por
language por
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.source.none.fl_str_mv reponame:Repositório Institucional da Universidade Federal do Ceará (UFC)
instname:Universidade Federal do Ceará (UFC)
instacron:UFC
instname_str Universidade Federal do Ceará (UFC)
instacron_str UFC
institution UFC
reponame_str Repositório Institucional da Universidade Federal do Ceará (UFC)
collection Repositório Institucional da Universidade Federal do Ceará (UFC)
bitstream.url.fl_str_mv http://repositorio.ufc.br/bitstream/riufc/80057/2/license.txt
http://repositorio.ufc.br/bitstream/riufc/80057/3/2025_dis_emvaz.pdf
bitstream.checksum.fl_str_mv 8a4605be74aa9ea9d79846c1fba20a33
e2053d3a43515a6b2ad763c58aec921f
bitstream.checksumAlgorithm.fl_str_mv MD5
MD5
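The MD5 values above can be used to check that a downloaded bitstream is intact. The following is a minimal sketch, assuming the checksums are listed in the same order as the bitstream URLs; the helper names and the local file paths are hypothetical.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 65536) -> str:
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_record(path: str, expected_md5: str) -> bool:
    """Compare a local file's MD5 digest with the value in the record."""
    return md5_of_file(path) == expected_md5.strip().lower()


# Checksums copied from the record. The pairing with file names assumes
# that URLs and checksums appear in the same order.
EXPECTED = {
    "license.txt": "8a4605be74aa9ea9d79846c1fba20a33",
    "2025_dis_emvaz.pdf": "e2053d3a43515a6b2ad763c58aec921f",
}
```

For example, after saving the PDF locally, `matches_record("2025_dis_emvaz.pdf", EXPECTED["2025_dis_emvaz.pdf"])` should return `True` if the download is complete and unmodified.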
repository.name.fl_str_mv Repositório Institucional da Universidade Federal do Ceará (UFC) - Universidade Federal do Ceará (UFC)
repository.mail.fl_str_mv bu@ufc.br || repositorio@ufc.br
_version_ 1847793230259683328