Colocação Automática de Computação em Hardware Heterogêneo (Automatic Placement of Computation on Heterogeneous Hardware)

Bibliographic details
Year of defense: 2016 (defended June 24, 2016)
Main author: Kezia Correa Andrade
Advisor: Not informed by the institution
Defense committee: Not informed by the institution
Document type: Master's thesis (Dissertação)
Access type: Open access
Language: Portuguese (por)
Defense institution: Universidade Federal de Minas Gerais
Graduate program: Not informed by the institution
Department: Not informed by the institution
Country: Not informed by the institution
Keywords (in Portuguese): Linguagem de programação (Computação); Compiladores (Computadores); Computação; Análise estática; Linguagens de Programação; Paralelismo; GPGPU
Access link: https://hdl.handle.net/1843/ESBF-AE2GAF
Abstract: Graphics Processing Units (GPUs) have revolutionized high-performance programming. They have reduced the cost of parallel hardware; however, programming these devices is still a challenge: programmers are not (yet) able to write code that coordinates the simultaneous execution of thousands of threads. To deal with this problem, industry and academia have introduced annotation systems. Examples of such systems are OpenMP 4.0, OpenSS, and OpenACC, which let developers indicate which parts of a C or Fortran program should run on the GPU or the CPU. This approach has two advantages. First, it lets programmers obtain the benefits of parallel hardware while coding in their preferred programming languages. Second, the annotations shield programmers from the details of the parallel hardware, since they delegate the task of parallelizing programs to the code generator. The inclusion of pragmas in the code to hide hardware details does not solve every problem that developers face when programming GPUs: they still need to identify when it is advantageous to run a given piece of code on the GPU. In this context, the objective of this dissertation is to present a solution to that problem. We designed, implemented, and tested techniques to automatically identify which portions of the code should run on the GPU. To this end, we used dependence information, memory layout, and control flow. We created a set of static analyses that performs three tasks: (i) identify which loops are parallelizable; (ii) insert annotations to copy data between the CPU and the GPU; and (iii) estimate which loops, once tagged as parallel, are most likely to lead to performance gains. These tasks are fully automatic: they are carried out without any user intervention. The platform presented here has been implemented on two compilers: the analyses were built on top of the infrastructure available in LLVM, and the parallel code generation, from annotated programs, is performed by PGCC.
The approach that we developed is completely static: we decide where each function must run during the compilation of the program. This decision does not rely on any runtime system, such as middleware, or on special computer-architecture hooks. Another benefit of this framework is that it is completely automatic, i.e., it does not require any intervention from the programmer. As a result, the programs that we produce automatically can be up to 121x faster than their original versions.