Malware detection in macOS using supervised learning
| Ano de defesa: | 2022 |
|---|---|
| Autor(a) principal: | |
| Orientador(a): | |
| Banca de defesa: | |
| Tipo de documento: | Dissertação |
| Tipo de acesso: | Acesso aberto |
| Idioma: | eng |
| Instituição de defesa: |
Universidade Federal de Pernambuco
UFPE Brasil Programa de Pos Graduacao em Ciencia da Computacao |
| Programa de Pós-Graduação: |
Não Informado pela instituição
|
| Departamento: |
Não Informado pela instituição
|
| País: |
Não Informado pela instituição
|
| Palavras-chave em Português: | |
| Link de acesso: | https://repositorio.ufpe.br/handle/123456789/46235 |
Resumo: | The development of macOS malware has grown significantly in recent years. Attackers have become more sophisticated and more targeted with the emergence of new dangerous malware families for macOS. However, since the malware detection problem is very dependent on the platform, solutions previously proposed for other operating systems cannot be directly used in macOS. Malware detection is one of the main pillars of endpoint security. Unfortunately, there are very few works on macOS endpoint security, which is considered a largely unexplored territory. Currently, the only malware detection mechanism in macOS is a signature-based system with less than 200 rules as of 2021, called XProtect. Recent works that attempted to improve the detection of malwares in macOS have methodology limitations, such as the lack of a large macOS malware dataset and issues that arise with imbalanced datasets. In this work, we bring the malware detection issue to the macOS operating system and evaluate how supervised machine learning algorithms can be used to improve endpoint security in the macOS ecosystem. We create a new and larger dataset of 631 malware and 10,141 benign software using public sources and extracting information from the Mach-O format. We evaluate the performance of seven different machine learning algorithms, two sampling strategies and four feature reduction techniques in the detection of malwares in macOS. As a result, we present models that are better than macOS native protections, with detection rates larger than 90% while maintaining a false alarm rate of less than 1%. The presented models successfully demonstrate that macOS security can be improved by using static characteristics of native executables in combination with common machine learning algorithms. |
| id |
UFPE_d7cb35bf71e5f26dd01e117b14a2124b |
|---|---|
| oai_identifier_str |
oai:repositorio.ufpe.br:123456789/46235 |
| network_acronym_str |
UFPE |
| network_name_str |
Repositório Institucional da UFPE |
| repository_id_str |
|
| spelling |
Malware detection in macOS using supervised learningRedes de ComputadoresAprendizagem de máquinaThe development of macOS malware has grown significantly in recent years. Attackers have become more sophisticated and more targeted with the emergence of new dangerous malware families for macOS. However, since the malware detection problem is very dependent on the platform, solutions previously proposed for other operating systems cannot be directly used in macOS. Malware detection is one of the main pillars of endpoint security. Unfortunately, there are very few works on macOS endpoint security, which is considered a largely unexplored territory. Currently, the only malware detection mechanism in macOS is a signature-based system with less than 200 rules as of 2021, called XProtect. Recent works that attempted to improve the detection of malwares in macOS have methodology limitations, such as the lack of a large macOS malware dataset and issues that arise with imbalanced datasets. In this work, we bring the malware detection issue to the macOS operating system and evaluate how supervised machine learning algorithms can be used to improve endpoint security in the macOS ecosystem. We create a new and larger dataset of 631 malware and 10,141 benign software using public sources and extracting information from the Mach-O format. We evaluate the performance of seven different machine learning algorithms, two sampling strategies and four feature reduction techniques in the detection of malwares in macOS. As a result, we present models that are better than macOS native protections, with detection rates larger than 90% while maintaining a false alarm rate of less than 1%. The presented models successfully demonstrate that macOS security can be improved by using static characteristics of native executables in combination with common machine learning algorithms.O desenvolvimento de malware para macOS cresceu significativamente nos últimos anos. Os invasores se tornaram mais sofisticados e mais direcionados com o surgimento de novas famílias de malware perigosas para o macOS. No entanto, como o problema de detecção de malware é muito dependente da plataforma, as soluções propostas para outros sistemas operacionais não podem ser usadas diretamente no macOS. A detecção de malware é um dos principais pilares da segurança de endpoints. Infelizmente, existem muito poucos trabalhos sobre a segurança de endpoint do macOS, que é considerada território pouco investigado.Atualmente, o único mecanismo de detecção de malware no macOS é um sistema baseado em assinaturas com menos de 200 regras em 2021, conhecido como XProtect. Trabalhos recentes que tentaram melhorar a detecção de malwares no macOS têm limitações de metodologia, como a falta de um grande conjunto de dados de malware do macOS e problemas que surgem com conjuntos de dados em classes desequilibradas.Nessa dissertação, trazemos o problema de detecção de malware para o sistema operacional macOS e avaliamos como algoritmos de aprendizado de máquina supervisionados podem ser usados para melhorar a segurança de end - point do ecossistema macOS. Criamos um novo dataset extraindo informações do formato Mach-O de 631 malwares e 10.141 softwares benignos de fontes públicas. Avaliamos o desempenho de sete algoritmos de aprendizagem de máquina em conjunto com duas estratégias de amostragem e quatro técnicas de redução de features para a detecção de malwares no macOS. Como resultado, apresentamos modelos melhores que as proteções nativas do macOS, com taxas de detecção superiores a 90% e taxas de alarmes falsos inferiores a 1%. Os modelos apresentados demonstram com sucesso que a segurança do macOS pode ser aprimorada usando características estáticas de executáveis nativos em combinação com algoritmos populares de aprendizagem de máquina.Universidade Federal de PernambucoUFPEBrasilPrograma de Pos Graduacao em Ciencia da ComputacaoCAMPELO, Divanilson Rodrigo de Sousahttp://lattes.cnpq.br/0812104184657634http://lattes.cnpq.br/9838400375894439BURGARDT, Caio Augusto Pereira2022-09-08T12:07:45Z2022-09-08T12:07:45Z2022-02-25info:eu-repo/semantics/publishedVersioninfo:eu-repo/semantics/masterThesisapplication/pdfBURGARDT, Caio Augusto Pereira. Malware detection in macOS using supervised learning. 2022. Dissertação (Mestrado em Ciência da Computação) – Universidade Federal de Pernambuco, Recife, 2022.https://repositorio.ufpe.br/handle/123456789/46235enghttp://creativecommons.org/licenses/by-nc-nd/3.0/br/info:eu-repo/semantics/openAccessreponame:Repositório Institucional da UFPEinstname:Universidade Federal de Pernambuco (UFPE)instacron:UFPE2022-09-09T06:05:11Zoai:repositorio.ufpe.br:123456789/46235Repositório InstitucionalPUBhttps://repositorio.ufpe.br/oai/requestattena@ufpe.bropendoar:22212022-09-09T06:05:11Repositório Institucional da UFPE - Universidade Federal de Pernambuco (UFPE)false |
| dc.title.none.fl_str_mv |
Malware detection in macOS using supervised learning |
| title |
Malware detection in macOS using supervised learning |
| spellingShingle |
Malware detection in macOS using supervised learning BURGARDT, Caio Augusto Pereira Redes de Computadores Aprendizagem de máquina |
| title_short |
Malware detection in macOS using supervised learning |
| title_full |
Malware detection in macOS using supervised learning |
| title_fullStr |
Malware detection in macOS using supervised learning |
| title_full_unstemmed |
Malware detection in macOS using supervised learning |
| title_sort |
Malware detection in macOS using supervised learning |
| author |
BURGARDT, Caio Augusto Pereira |
| author_facet |
BURGARDT, Caio Augusto Pereira |
| author_role |
author |
| dc.contributor.none.fl_str_mv |
CAMPELO, Divanilson Rodrigo de Sousa http://lattes.cnpq.br/0812104184657634 http://lattes.cnpq.br/9838400375894439 |
| dc.contributor.author.fl_str_mv |
BURGARDT, Caio Augusto Pereira |
| dc.subject.por.fl_str_mv |
Redes de Computadores Aprendizagem de máquina |
| topic |
Redes de Computadores Aprendizagem de máquina |
| description |
The development of macOS malware has grown significantly in recent years. Attackers have become more sophisticated and more targeted with the emergence of new dangerous malware families for macOS. However, since the malware detection problem is very dependent on the platform, solutions previously proposed for other operating systems cannot be directly used in macOS. Malware detection is one of the main pillars of endpoint security. Unfortunately, there are very few works on macOS endpoint security, which is considered a largely unexplored territory. Currently, the only malware detection mechanism in macOS is a signature-based system with less than 200 rules as of 2021, called XProtect. Recent works that attempted to improve the detection of malwares in macOS have methodology limitations, such as the lack of a large macOS malware dataset and issues that arise with imbalanced datasets. In this work, we bring the malware detection issue to the macOS operating system and evaluate how supervised machine learning algorithms can be used to improve endpoint security in the macOS ecosystem. We create a new and larger dataset of 631 malware and 10,141 benign software using public sources and extracting information from the Mach-O format. We evaluate the performance of seven different machine learning algorithms, two sampling strategies and four feature reduction techniques in the detection of malwares in macOS. As a result, we present models that are better than macOS native protections, with detection rates larger than 90% while maintaining a false alarm rate of less than 1%. The presented models successfully demonstrate that macOS security can be improved by using static characteristics of native executables in combination with common machine learning algorithms. |
| publishDate |
2022 |
| dc.date.none.fl_str_mv |
2022-09-08T12:07:45Z 2022-09-08T12:07:45Z 2022-02-25 |
| dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
| dc.type.driver.fl_str_mv |
info:eu-repo/semantics/masterThesis |
| format |
masterThesis |
| status_str |
publishedVersion |
| dc.identifier.uri.fl_str_mv |
BURGARDT, Caio Augusto Pereira. Malware detection in macOS using supervised learning. 2022. Dissertação (Mestrado em Ciência da Computação) – Universidade Federal de Pernambuco, Recife, 2022. https://repositorio.ufpe.br/handle/123456789/46235 |
| identifier_str_mv |
BURGARDT, Caio Augusto Pereira. Malware detection in macOS using supervised learning. 2022. Dissertação (Mestrado em Ciência da Computação) – Universidade Federal de Pernambuco, Recife, 2022. |
| url |
https://repositorio.ufpe.br/handle/123456789/46235 |
| dc.language.iso.fl_str_mv |
eng |
| language |
eng |
| dc.rights.driver.fl_str_mv |
http://creativecommons.org/licenses/by-nc-nd/3.0/br/ info:eu-repo/semantics/openAccess |
| rights_invalid_str_mv |
http://creativecommons.org/licenses/by-nc-nd/3.0/br/ |
| eu_rights_str_mv |
openAccess |
| dc.format.none.fl_str_mv |
application/pdf |
| dc.publisher.none.fl_str_mv |
Universidade Federal de Pernambuco UFPE Brasil Programa de Pos Graduacao em Ciencia da Computacao |
| publisher.none.fl_str_mv |
Universidade Federal de Pernambuco UFPE Brasil Programa de Pos Graduacao em Ciencia da Computacao |
| dc.source.none.fl_str_mv |
reponame:Repositório Institucional da UFPE instname:Universidade Federal de Pernambuco (UFPE) instacron:UFPE |
| instname_str |
Universidade Federal de Pernambuco (UFPE) |
| instacron_str |
UFPE |
| institution |
UFPE |
| reponame_str |
Repositório Institucional da UFPE |
| collection |
Repositório Institucional da UFPE |
| repository.name.fl_str_mv |
Repositório Institucional da UFPE - Universidade Federal de Pernambuco (UFPE) |
| repository.mail.fl_str_mv |
attena@ufpe.br |
| _version_ |
1856042048792035328 |