Automated emerging cyber threat identification and profiling based on natural language processing

Bibliographic details
Year of defense: 2023
Main author: Marinho, Renato Rodrigues
Advisor: Holanda Filho, Raimir
Defense committee: Not provided by the institution
Document type: Thesis
Access type: Open access
Language: eng
Defending institution: Universidade de Fortaleza
Graduate program: Doutorado em Informática Aplicada (Doctorate in Applied Informatics)
Department: Not provided by the institution
Country: Not provided by the institution
Access link: https://uol.unifor.br/auth-sophia/exibicao/30446
Abstract: The time window between the disclosure of a new cyber vulnerability and its use by cybercriminals has been shrinking over time. Recent episodes, such as the Log4j vulnerability, exemplify this well. Within hours of the exploit's release, attackers started scanning the internet for vulnerable hosts on which to deploy threats such as cryptocurrency miners and ransomware. Thus, it becomes imperative for the cybersecurity defense strategy to detect threats and their capabilities as early as possible to maximize the success of prevention actions. Although crucial, discovering new threats is challenging for security analysts because of the immense volume of data and information sources that must be analyzed for signs that a threat is emerging. To address this, we present a framework for the automatic identification and profiling of emerging threats that uses Twitter messages as a source of events and MITRE ATT&CK as a source of knowledge for threat characterization. The framework comprises three main parts: identification of cyber threats and their names; profiling of each identified threat in terms of its intentions or goals, employing two machine learning layers to filter and classify tweets; and alarm generation. The main contribution of our work is the approach to characterizing or profiling the identified threats in terms of their intentions or goals, which provides additional context on the threat and avenues for mitigation. In our experiments, the profiling stage reached an F1 score of 77% in correctly profiling discovered threats.
Keywords: Emerging Cyber Threats, Cyber Threat Intelligence, Automated Cyber Threat Profiling, Natural Language Processing.
id UFOR_2bf01078fc4c83096124d4c009c1eae7
oai_identifier_str oai::581815
network_acronym_str UFOR
network_name_str Biblioteca Digital de Teses e Dissertações da UNIFOR
repository_id_str
subjects cibersegurança (cybersecurity); Crimes virtuais (cybercrimes); Cibernética (cybernetics)
dc.title.pt.fl_str_mv Automated emerging cyber threat identification and profiling based on natural language processing
title Automated emerging cyber threat identification and profiling based on natural language processing
author Marinho, Renato Rodrigues
author_facet Marinho, Renato Rodrigues
author_role author
dc.contributor.advisor1.fl_str_mv Holanda Filho, Raimir
dc.contributor.author.fl_str_mv Marinho, Renato Rodrigues
contributor_str_mv Holanda Filho, Raimir
publishDate 2023
dc.date.issued.fl_str_mv 2023-08-04
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/doctoralThesis
format doctoralThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv https://uol.unifor.br/auth-sophia/exibicao/30446
url https://uol.unifor.br/auth-sophia/exibicao/30446
dc.language.iso.fl_str_mv eng
language eng
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv Universidade de Fortaleza
dc.publisher.program.fl_str_mv Doutorado em Informática Aplicada
dc.publisher.initials.fl_str_mv Universidade de Fortaleza
publisher.none.fl_str_mv Universidade de Fortaleza
dc.source.none.fl_str_mv reponame:Biblioteca Digital de Teses e Dissertações da UNIFOR
instname:Universidade de Fortaleza (UNIFOR)
instacron:UNIFOR
instname_str Universidade de Fortaleza (UNIFOR)
instacron_str UNIFOR
institution UNIFOR
reponame_str Biblioteca Digital de Teses e Dissertações da UNIFOR
collection Biblioteca Digital de Teses e Dissertações da UNIFOR
repository.name.fl_str_mv Biblioteca Digital de Teses e Dissertações da UNIFOR - Universidade de Fortaleza (UNIFOR)
repository.mail.fl_str_mv bib@unifor.br
_version_ 1844622778874986496