Analysis of natural disasters in data from news
| Year of defense: | 2024 |
|---|---|
| Main author: | Garcia, Klaifer [UNIFESP] |
| Advisor: | Berton, Lilian [UNIFESP] |
| Defense committee: | |
| Document type: | Doctoral thesis |
| Access type: | Open access |
| dARK ID: | ark:/48912/001300002ssws |
| Language: | eng |
| Defending institution: | Universidade Federal de São Paulo |
| Graduate program: | Not provided by the institution |
| Department: | Not provided by the institution |
| Country: | Not provided by the institution |
| Keywords in Portuguese: | |
| Access link: | https://hdl.handle.net/11600/72677 |
| Abstract: | Natural disasters have been occurring with increasing frequency as a result of human activity on the environment, causing significant damage to society. Minimizing these losses depends on the development of protection policies, which need to be supported by accurate information about the events. However, collecting information on disasters presents several challenges, such as insufficient manpower to document every detail of the event and the unpredictability of the events, making it difficult to capture the initial moments after a disaster. In light of these challenges, this work developed methodologies to utilize news data as an alternative source of information on disasters. Specifically, techniques for document filtering, event detection, and automatic summarization were proposed and optimized to achieve better results in this domain, with a particular focus on improving applications in Portuguese, as there is a shortage of research in this language. The main contributions of this work are: 1) a complete framework for building knowledge bases from news articles, 2) new Portuguese datasets for several Natural Language Processing (NLP) tasks, 3) a novel method to produce more accurate summaries based on siamese networks, 4) an evaluation of the latest text classification techniques for application in Portuguese, and 5) a systematic literature review on event detection in news. This work provides contributions to various NLP tasks, with a special emphasis on addressing and developing solutions for the Portuguese language. |
| id |
UFSP_9e938335fbe41b9acafbc9c9ae6036e3 |
|---|---|
| oai_identifier_str |
oai:repositorio.unifesp.br:11600/72677 |
| network_acronym_str |
UFSP |
| network_name_str |
Repositório Institucional da UNIFESP |
| repository_id_str |
|
| spelling |
http://lattes.cnpq.br/9064767888093340; Garcia, Klaifer [UNIFESP]; http://lattes.cnpq.br/0896350174589757; Berton, Lilian [UNIFESP]; São José dos Campos, SP; 2024-12-30T13:24:58Z; 2024-12-30T13:24:58Z; 2024-11-25; lberton@unifesp.br; 149 f.; https://hdl.handle.net/11600/72677; ark:/48912/001300002ssws; eng; Universidade Federal de São Paulo; info:eu-repo/semantics/openAccess; Natural Language Processing; Automatic Text Summarization; Event Detection; Automatic Text Classification; Machine Learning; Analysis of natural disasters in data from news; Análise de desastres naturais em dados de notícias; info:eu-repo/semantics/doctoralThesis; info:eu-repo/semantics/publishedVersion; reponame:Repositório Institucional da UNIFESP; instname:Universidade Federal de São Paulo (UNIFESP); instacron:UNIFESP; Instituto de Ciência e Tecnologia (ICT); Ciência da Computação; Sistemas Inteligentes; ORIGINAL; Doutorado2024Klaifer_Revisado_A.pdf; Doutorado2024Klaifer_Revisado_A.pdf; application/pdf; 18642958; https://repositorio.unifesp.br/bitstreams/907b293e-33ca-406e-b766-b7355d33eba8/download; 67927bc0550b6b596fcda526ab1b2d6e; MD5; 1; LICENSE; license.txt; license.txt; text/plain; charset=utf-8; 6456; https://repositorio.unifesp.br/bitstreams/fb260a8d-cafb-48b6-a696-ccbcb800c1d6/download; 79881d6dea480587c66312d1102a8942; MD5; 2; TEXT; Doutorado2024Klaifer_Revisado_A.pdf.txt; Doutorado2024Klaifer_Revisado_A.pdf.txt; Extracted text; text/plain; 100386; https://repositorio.unifesp.br/bitstreams/5deacdce-3332-454c-b91f-3887fc5cc84a/download; 40e24ee945e929f5f8fbcbca90c2781a; MD5; 3; THUMBNAIL; Doutorado2024Klaifer_Revisado_A.pdf.jpg; Doutorado2024Klaifer_Revisado_A.pdf.jpg; Generated Thumbnail; image/jpeg; 3704; https://repositorio.unifesp.br/bitstreams/18115ff8-ec2b-4263-8529-dc257086fb6c/download; 9e34f324ebf410007c2a731e6e3b2eee; MD5; 4; 11600/72677; 2024-12-31 04:01:32.148; oai:repositorio.unifesp.br:11600/72677; https://repositorio.unifesp.br; Repositório Institucional; PUB; http://www.repositorio.unifesp.br/oai/request; biblioteca.csp@unifesp.br; opendoar:3465; 2024-12-31T04:01:32; Repositório Institucional da UNIFESP - Universidade Federal de São Paulo (UNIFESP); false |
| dc.title.none.fl_str_mv |
Analysis of natural disasters in data from news |
| dc.title.alternative.none.fl_str_mv |
Análise de desastres naturais em dados de notícias |
| title |
Analysis of natural disasters in data from news |
| spellingShingle |
Analysis of natural disasters in data from news Garcia, Klaifer [UNIFESP] Natural Language Processing Automatic Text Summarization Event Detection Automatic Text Classification Machine Learning |
| title_short |
Analysis of natural disasters in data from news |
| title_full |
Analysis of natural disasters in data from news |
| title_fullStr |
Analysis of natural disasters in data from news |
| title_full_unstemmed |
Analysis of natural disasters in data from news |
| title_sort |
Analysis of natural disasters in data from news |
| author |
Garcia, Klaifer [UNIFESP] |
| author_facet |
Garcia, Klaifer [UNIFESP] |
| author_role |
author |
| dc.contributor.advisorLattes.none.fl_str_mv |
http://lattes.cnpq.br/9064767888093340 |
| dc.contributor.authorLattes.none.fl_str_mv |
http://lattes.cnpq.br/0896350174589757 |
| dc.contributor.author.fl_str_mv |
Garcia, Klaifer [UNIFESP] |
| dc.contributor.advisor1.fl_str_mv |
Berton, Lilian [UNIFESP] |
| contributor_str_mv |
Berton, Lilian [UNIFESP] |
| dc.subject.por.fl_str_mv |
Natural Language Processing Automatic Text Summarization Event Detection Automatic Text Classification Machine Learning |
| topic |
Natural Language Processing Automatic Text Summarization Event Detection Automatic Text Classification Machine Learning |
| description |
Natural disasters have been occurring with increasing frequency as a result of human activity on the environment, causing significant damage to society. Minimizing these losses depends on the development of protection policies, which need to be supported by accurate information about the events. However, collecting information on disasters presents several challenges, such as insufficient manpower to document every detail of the event and the unpredictability of the events, making it difficult to capture the initial moments after a disaster. In light of these challenges, this work developed methodologies to utilize news data as an alternative source of information on disasters. Specifically, techniques for document filtering, event detection, and automatic summarization were proposed and optimized to achieve better results in this domain, with a particular focus on improving applications in Portuguese, as there is a shortage of research in this language. The main contributions of this work are: 1) a complete framework for building knowledge bases from news articles, 2) new Portuguese datasets for several Natural Language Processing (NLP) tasks, 3) a novel method to produce more accurate summaries based on siamese networks, 4) an evaluation of the latest text classification techniques for application in Portuguese, and 5) a systematic literature review on event detection in news. This work provides contributions to various NLP tasks, with a special emphasis on addressing and developing solutions for the Portuguese language. |
| publishDate |
2024 |
| dc.date.accessioned.fl_str_mv |
2024-12-30T13:24:58Z |
| dc.date.available.fl_str_mv |
2024-12-30T13:24:58Z |
| dc.date.issued.fl_str_mv |
2024-11-25 |
| dc.type.driver.fl_str_mv |
info:eu-repo/semantics/doctoralThesis |
| dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
| format |
doctoralThesis |
| status_str |
publishedVersion |
| dc.identifier.uri.fl_str_mv |
https://hdl.handle.net/11600/72677 |
| dc.identifier.dark.fl_str_mv |
ark:/48912/001300002ssws |
| url |
https://hdl.handle.net/11600/72677 |
| identifier_str_mv |
ark:/48912/001300002ssws |
| dc.language.iso.fl_str_mv |
eng |
| language |
eng |
| dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
| eu_rights_str_mv |
openAccess |
| dc.format.none.fl_str_mv |
149 f. |
| dc.coverage.spatial.none.fl_str_mv |
São José dos Campos, SP |
| dc.publisher.none.fl_str_mv |
Universidade Federal de São Paulo |
| publisher.none.fl_str_mv |
Universidade Federal de São Paulo |
| dc.source.none.fl_str_mv |
reponame:Repositório Institucional da UNIFESP instname:Universidade Federal de São Paulo (UNIFESP) instacron:UNIFESP |
| instname_str |
Universidade Federal de São Paulo (UNIFESP) |
| instacron_str |
UNIFESP |
| institution |
UNIFESP |
| reponame_str |
Repositório Institucional da UNIFESP |
| collection |
Repositório Institucional da UNIFESP |
| bitstream.url.fl_str_mv |
https://repositorio.unifesp.br/bitstreams/907b293e-33ca-406e-b766-b7355d33eba8/download https://repositorio.unifesp.br/bitstreams/fb260a8d-cafb-48b6-a696-ccbcb800c1d6/download https://repositorio.unifesp.br/bitstreams/5deacdce-3332-454c-b91f-3887fc5cc84a/download https://repositorio.unifesp.br/bitstreams/18115ff8-ec2b-4263-8529-dc257086fb6c/download |
| bitstream.checksum.fl_str_mv |
67927bc0550b6b596fcda526ab1b2d6e 79881d6dea480587c66312d1102a8942 40e24ee945e929f5f8fbcbca90c2781a 9e34f324ebf410007c2a731e6e3b2eee |
| bitstream.checksumAlgorithm.fl_str_mv |
MD5 MD5 MD5 MD5 |
| repository.name.fl_str_mv |
Repositório Institucional da UNIFESP - Universidade Federal de São Paulo (UNIFESP) |
| repository.mail.fl_str_mv |
biblioteca.csp@unifesp.br |
| _version_ |
1866180585478684672 |