Malware detection in macOS using supervised learning

Bibliographic details
Year of defense: 2022
Main author: BURGARDT, Caio Augusto Pereira
Advisor: Not provided by the institution
Examination committee: Not provided by the institution
Document type: Master's thesis
Access type: Open access
Language: eng
Defending institution: Universidade Federal de Pernambuco
UFPE
Brasil
Programa de Pos Graduacao em Ciencia da Computacao
Graduate program: Not provided by the institution
Department: Not provided by the institution
Country: Not provided by the institution
Keywords in Portuguese:
Access link: https://repositorio.ufpe.br/handle/123456789/46235
Abstract: The development of macOS malware has grown significantly in recent years. Attackers have become more sophisticated and more targeted with the emergence of new dangerous malware families for macOS. However, since the malware detection problem is highly platform-dependent, solutions previously proposed for other operating systems cannot be used directly in macOS. Malware detection is one of the main pillars of endpoint security. Unfortunately, there are very few works on macOS endpoint security, which is considered largely unexplored territory. Currently, the only malware detection mechanism in macOS is XProtect, a signature-based system with fewer than 200 rules as of 2021. Recent works that attempted to improve malware detection in macOS have methodological limitations, such as the lack of a large macOS malware dataset and issues that arise with imbalanced datasets. In this work, we bring the malware detection problem to the macOS operating system and evaluate how supervised machine learning algorithms can be used to improve endpoint security in the macOS ecosystem. We create a new and larger dataset of 631 malware samples and 10,141 benign software samples by using public sources and extracting information from the Mach-O format. We evaluate the performance of seven machine learning algorithms, two sampling strategies and four feature reduction techniques in detecting malware in macOS. As a result, we present models that outperform the native macOS protections, with detection rates above 90% while maintaining a false alarm rate below 1%. The presented models demonstrate that macOS security can be improved by using static characteristics of native executables in combination with common machine learning algorithms.
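The two figures quoted in the abstract, detection rate and false alarm rate, can be illustrated with a minimal sketch. It assumes the usual definitions (detection rate as the true positive rate, false alarm rate as the false positive rate); the record itself does not spell them out. The class sizes come from the abstract, while the `hypothetical` confusion counts and all function names are made up for illustration and are not results from the thesis.

```python
from dataclasses import dataclass

# Class sizes as stated in the abstract.
N_MALWARE = 631
N_BENIGN = 10_141


@dataclass(frozen=True)
class ConfusionCounts:
    tp: int  # malware flagged as malware
    fn: int  # malware missed
    fp: int  # benign software flagged as malware
    tn: int  # benign software passed as benign


def detection_rate(c: ConfusionCounts) -> float:
    """True positive rate: share of malware samples that are detected."""
    return c.tp / (c.tp + c.fn)


def false_alarm_rate(c: ConfusionCounts) -> float:
    """False positive rate: share of benign samples wrongly flagged."""
    return c.fp / (c.fp + c.tn)


def accuracy(c: ConfusionCounts) -> float:
    """Share of all samples classified correctly."""
    return (c.tp + c.tn) / (c.tp + c.fn + c.fp + c.tn)


# A degenerate classifier that labels every sample as benign.
always_benign = ConfusionCounts(tp=0, fn=N_MALWARE, fp=0, tn=N_BENIGN)

# Hypothetical counts (NOT thesis results) chosen to fall inside the
# ">90% detection, <1% false alarms" region mentioned in the abstract.
hypothetical = ConfusionCounts(tp=570, fn=61, fp=100, tn=10_041)

if __name__ == "__main__":
    for name, counts in [("always benign", always_benign),
                         ("hypothetical", hypothetical)]:
        print(f"{name}: accuracy={accuracy(counts):.4f} "
              f"detection={detection_rate(counts):.4f} "
              f"false_alarm={false_alarm_rate(counts):.4f}")
```

With roughly 16 benign samples per malware sample, the "always benign" classifier scores high accuracy while detecting nothing, which is one reason class imbalance makes accuracy a misleading metric and why detection rate and false alarm rate are reported separately.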
id UFPE_d7cb35bf71e5f26dd01e117b14a2124b
oai_identifier_str oai:repositorio.ufpe.br:123456789/46235
network_acronym_str UFPE
network_name_str Repositório Institucional da UFPE
repository_id_str
dc.title.none.fl_str_mv Malware detection in macOS using supervised learning
title Malware detection in macOS using supervised learning
spellingShingle Malware detection in macOS using supervised learning
BURGARDT, Caio Augusto Pereira
Computer networks
Machine learning
title_short Malware detection in macOS using supervised learning
title_full Malware detection in macOS using supervised learning
title_fullStr Malware detection in macOS using supervised learning
title_full_unstemmed Malware detection in macOS using supervised learning
title_sort Malware detection in macOS using supervised learning
author BURGARDT, Caio Augusto Pereira
author_facet BURGARDT, Caio Augusto Pereira
author_role author
dc.contributor.none.fl_str_mv CAMPELO, Divanilson Rodrigo de Sousa
http://lattes.cnpq.br/0812104184657634
http://lattes.cnpq.br/9838400375894439
dc.contributor.author.fl_str_mv BURGARDT, Caio Augusto Pereira
dc.subject.por.fl_str_mv Computer networks
Machine learning
topic Computer networks
Machine learning
publishDate 2022
dc.date.none.fl_str_mv 2022-09-08T12:07:45Z
2022-09-08T12:07:45Z
2022-02-25
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/masterThesis
format masterThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv BURGARDT, Caio Augusto Pereira. Malware detection in macOS using supervised learning. 2022. Dissertação (Mestrado em Ciência da Computação) – Universidade Federal de Pernambuco, Recife, 2022.
https://repositorio.ufpe.br/handle/123456789/46235
identifier_str_mv BURGARDT, Caio Augusto Pereira. Malware detection in macOS using supervised learning. 2022. Dissertação (Mestrado em Ciência da Computação) – Universidade Federal de Pernambuco, Recife, 2022.
url https://repositorio.ufpe.br/handle/123456789/46235
dc.language.iso.fl_str_mv eng
language eng
dc.rights.driver.fl_str_mv http://creativecommons.org/licenses/by-nc-nd/3.0/br/
info:eu-repo/semantics/openAccess
rights_invalid_str_mv http://creativecommons.org/licenses/by-nc-nd/3.0/br/
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv Universidade Federal de Pernambuco
UFPE
Brasil
Programa de Pos Graduacao em Ciencia da Computacao
publisher.none.fl_str_mv Universidade Federal de Pernambuco
UFPE
Brasil
Programa de Pos Graduacao em Ciencia da Computacao
dc.source.none.fl_str_mv reponame:Repositório Institucional da UFPE
instname:Universidade Federal de Pernambuco (UFPE)
instacron:UFPE
instname_str Universidade Federal de Pernambuco (UFPE)
instacron_str UFPE
institution UFPE
reponame_str Repositório Institucional da UFPE
collection Repositório Institucional da UFPE
repository.name.fl_str_mv Repositório Institucional da UFPE - Universidade Federal de Pernambuco (UFPE)
repository.mail.fl_str_mv attena@ufpe.br
_version_ 1856042048792035328