Colocação Automática de Computação em Hardware Heterogêneo (Automatic Placement of Computation on Heterogeneous Hardware)
| Year of defense: | 2016 |
|---|---|
| Main author: | Kezia Correa Andrade |
| Advisor: | |
| Examination committee: | |
| Document type: | Master's thesis (Dissertação) |
| Access type: | Open access |
| Language: | Portuguese (por) |
| Degree-granting institution: | Universidade Federal de Minas Gerais |
| Graduate program: | Not informed by the institution |
| Department: | Not informed by the institution |
| Country: | Not informed by the institution |
| Keywords (Portuguese): | Linguagem de programação (Computação); Compiladores (Computadores); Computação; Análise estática; Linguagens de Programação; Paralelismo; GPGPU |
| Access link: | https://hdl.handle.net/1843/ESBF-AE2GAF |
| Abstract: | Graphics Processing Units (GPUs) have revolutionized high-performance programming. They have reduced the cost of parallel hardware; however, programming these devices is still a challenge. Programmers are not (yet) able to write code that coordinates the simultaneous execution of thousands of threads. To deal with this problem, industry and academia have introduced annotation systems. Examples of such systems are OpenMP 4.0, OpenSS and OpenACC, which let developers indicate which parts of a C or Fortran program should run on the GPU or on the CPU. This approach has two advantages. First, it lets programmers obtain the benefits of the parallel hardware while coding in their preferred programming languages. Second, the annotations shield programmers from the details of the parallel hardware, since they delegate the task of parallelizing the program to the code generator. However, inserting pragmas into the code to hide hardware details does not solve all the problems that developers face when programming GPUs: they still need to identify when it is advantageous to run a given piece of code on the GPU. In this context, the objective of this dissertation is to present a solution to that problem. Techniques were designed, implemented and tested to automatically identify which portions of the code should run on the GPU. To this end, we use dependence information, memory layout and control flow. We created a set of static analyses that performs three tasks: (i) identify which loops are parallelizable; (ii) insert annotations to copy data between the CPU and the GPU; (iii) estimate which loops, once tagged as parallel, are most likely to lead to performance gains. These tasks are fully automatic and are carried out without any user intervention. The platform presented here has been implemented on top of two compilers: the analyses were built on the infrastructure available in LLVM, and the parallel code generation from the annotated programs is done by PGCC. The approach we have developed is completely static: we decide where each function must run during the compilation of the program. This decision does not rely on any runtime system, such as a middleware, or on special computer architecture hooks. Another benefit of this framework is that it is completely automatic, i.e., it does not require any intervention from the programmer. As a result, the programs that we produce automatically can be up to 121x faster than their original versions. |
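
The workflow summarized in the abstract inserts OpenACC-style annotations into loops that the static analyses judge parallelizable and profitable, and then hands the annotated source to PGCC. As a minimal, hypothetical sketch of what such annotated output could look like, the C fragment below marks a vector-addition loop for GPU execution; the function, array names and clause choices are illustrative assumptions and are not taken from the dissertation.

```c
#include <stdlib.h>

/* Hypothetical example: a vector-addition loop annotated the way an
 * OpenACC-based tool could mark it for GPU execution.  The pragma
 * `acc parallel loop` asks an OpenACC compiler (e.g. PGCC) to run the
 * loop on the accelerator, and the data clauses describe the
 * CPU<->GPU copies that the analyses in the abstract would infer. */
void vec_add(const float *a, const float *b, float *c, int n) {
    #pragma acc parallel loop copyin(a[0:n], b[0:n]) copyout(c[0:n])
    for (int i = 0; i < n; i++)
        c[i] = a[i] + b[i];
}

int main(void) {
    enum { N = 1 << 20 };
    float *a = malloc(N * sizeof *a);
    float *b = malloc(N * sizeof *b);
    float *c = malloc(N * sizeof *c);
    for (int i = 0; i < N; i++) { a[i] = (float)i; b[i] = 2.0f * i; }

    /* Runs on the GPU when built with an OpenACC-capable compiler;
     * otherwise the pragma is ignored and the loop stays on the CPU. */
    vec_add(a, b, c, N);

    free(a); free(b); free(c);
    return 0;
}
```

With the PGI toolchain, a file like this would typically be compiled with `pgcc -acc`, which turns the annotated loop into GPU code; without an OpenACC-capable compiler the pragma is ignored and the loop executes sequentially on the CPU.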