Research Article

A Translation Technique for Parallelizing Sequential Code using a Single Level Model

by Hisham M. Alosaimi, Abdullah M. Algarni, Fathy E. Eassa
International Journal of Applied Information Systems
Foundation of Computer Science (FCS), NY, USA
Volume 12 - Number 36
Year of Publication: 2021
DOI: 10.5120/ijais2020451903

Hisham M. Alosaimi, Abdullah M. Algarni, Fathy E. Eassa. A Translation Technique for Parallelizing Sequential Code using a Single Level Model. International Journal of Applied Information Systems 12, 36 (March 2021), 30-40. DOI=10.5120/ijais2020451903

@article{10.5120/ijais2020451903,
author = {Hisham M. Alosaimi, Abdullah M. Algarni, Fathy E. Eassa},
title = {A Translation Technique for Parallelizing Sequential Code using a Single Level Model},
journal = {International Journal of Applied Information Systems},
issue_date = {March 2021},
volume = {12},
number = {36},
month = {March},
year = {2021},
issn = {2249-0868},
pages = {30-40},
numpages = {11},
url = {https://www.ijais.org/archives/volume12/number36/1112-2020451903/},
doi = {10.5120/ijais2020451903},
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%A Hisham M. Alosaimi
%A Abdullah M. Algarni
%A Fathy E. Eassa
%T A Translation Technique for Parallelizing Sequential Code using a Single Level Model
%J International Journal of Applied Information Systems
%@ 2249-0868
%V 12
%N 36
%P 30-40
%D 2021
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Running code sequentially can be slow, and execution time grows further when the code contains compute-intensive parts. Sequential code also fails to use a device's resources effectively: it executes one instruction at a time, so it can drive only a single thread. Parallel computing is a vital way to overcome the long execution times of large workloads, since parallel code reduces execution time by performing multiple tasks at once. However, most researchers and programmers find it difficult to run their sequential code in parallel because they lack knowledge of parallel programming models and of the dependency analysis their code requires. Automatic parallelization tools can help solve this problem. In this study, we introduce a novel automatic serial-to-parallel translation technique that takes serial code written in C++ as input and generates its parallel counterpart automatically. To validate the objectives of the study, we compare the results of the proposed method with existing methods; the proposed AP4OpenACC tool outperformed the existing method considered in the comparative analysis.
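
To make the approach concrete, here is a minimal sketch of the kind of rewrite a directive-based serial-to-parallel translator performs on a compute-intensive C++ loop. It targets OpenACC, as the tool's name suggests, but it is an illustrative example under assumed names (saxpy_serial, saxpy_parallel), not code taken from the AP4OpenACC implementation.

#include <cstddef>
#include <vector>

// Sequential version: a single thread sweeps the whole iteration
// space, so run time grows linearly with n on one core.
void saxpy_serial(float a, const std::vector<float>& x, std::vector<float>& y) {
    for (std::size_t i = 0; i < y.size(); ++i)
        y[i] = a * x[i] + y[i];
}

// Parallel version, as a translator might emit it: the OpenACC
// directive asks the compiler to offload the loop to an accelerator
// and to move the named array sections between host and device.
void saxpy_parallel(float a, const std::vector<float>& x, std::vector<float>& y) {
    const std::size_t n = y.size();
    const float* xp = x.data();
    float* yp = y.data();
    #pragma acc parallel loop copyin(xp[0:n]) copy(yp[0:n])
    for (std::size_t i = 0; i < n; ++i)
        yp[i] = a * xp[i] + yp[i];
}

Compiled with an OpenACC-capable compiler (for example, nvc++ -acc from the NVIDIA HPC SDK listed in the references), the annotated loop can run on a GPU; a compiler without OpenACC support simply ignores the pragma, so the translated code still runs correctly in sequential form. This graceful fallback is part of what makes directive-based models an attractive target for automatic translation.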

References
  1. S. Perarnau, R. Gupta, and P. Beckman, “Argo: An Exascale Operating System and Runtime,” p. 2.
  2. J. Shalf, S. Dosanjh, and J. Morrison, “Exascale Computing Technology Challenges,” in High Performance Computing for Computational Science – VECPAR 2010, Jun. 2010, pp. 1–25, doi: 10.1007/978-3-642-19328-6_1.
  3. “Petascale adaptive computational fluid dynamics,” ProQuest. https://search.proquest.com/docview/304985752 (accessed Nov. 20, 2020).
  4. J. J. Dongarra and D. W. Walker, “The quest for petascale computing,” Comput. Sci. Eng., vol. 3, no. 3, pp. 32–39, May 2001, doi: 10.1109/5992.919263.
  5. S. Prema, R. Jehadeesan, B. K. Panigrahi, and S. A. V. Satya Murty, “Dependency analysis and loop transformation characteristics of auto-parallelizers,” in 2015 National Conference on Parallel Computing Technologies (PARCOMPTECH), Bengaluru, India, Feb. 2015, pp. 1–6, doi: 10.1109/PARCOMPTECH.2015.7084524.
  6. A. Tabuchi, M. Nakao, and M. Sato, “A Source-to-Source OpenACC Compiler for CUDA,” in Euro-Par 2013: Parallel Processing Workshops, Aug. 2013, pp. 178–187, doi: 10.1007/978-3-642-54420-0_18.
  7. A. Barve, S. Khomane, B. Kulkarni, S. Ghadage, and S. Katare, “Parallelism in C++ programs targeting objects,” in 2017 International Conference on Advances in Computing, Communication and Control (ICAC3), Dec. 2017, pp. 1–6, doi: 10.1109/ICAC3.2017.8318759.
  8. A. Barve, S. Khomane, B. Kulkarni, S. Katare, and S. Ghadage, “A serial to parallel C++ code converter for multi-core machines,” in 2016 International Conference on ICT in Business Industry Government (ICTBIG), Nov. 2016, pp. 1–5, doi: 10.1109/ICTBIG.2016.7892700.
  9. E. Strohmaier, J. J. Dongarra, H. W. Meuer, and H. D. Simon, “Recent trends in the marketplace of high performance computing,” Parallel Comput., vol. 31, no. 3, pp. 261–273, Mar. 2005, doi: 10.1016/j.parco.2005.02.001.
  10. K. Krewell, “What’s the Difference Between a CPU and a GPU?,” The Official NVIDIA Blog, Dec. 16, 2009. https://blogs.nvidia.com/blog/2009/12/16/whats-the-difference-between-a-cpu-and-a-gpu/ (accessed Mar. 20, 2019).
  11. “Parallel Computing on a Personal Computer | Biomedical Computation Review.” http://www.bcr.org/content/parallel-computing-personal-computer (accessed Nov. 25, 2020).
  12. M. Arora, “The Architecture and Evolution of CPU-GPU Systems for General Purpose Computing,” 2012. https://www.semanticscholar.org/paper/The-Architecture-and-Evolution-of-CPU-GPU-Systems-Arora/8a77e1722b37fe3d8f5ac56bb50e548b218c4427 (accessed Nov. 21, 2020).
  13. “Combining GPU data-parallel computing with OpenGL,” in ACM SIGGRAPH 2013 Courses. https://dl.acm.org/doi/10.1145/2504435.2504449 (accessed Nov. 21, 2020).
  14. A. R. Brodtkorb, C. Dyken, T. R. Hagen, J. M. Hjelmervik, and O. O. Storaasli, “State-of-the-art in heterogeneous computing,” Sci. Program., vol. 18, no. 1, pp. 1–33, Jan. 2010, doi: 10.3233/SPR-2009-0296.
  15. C. Cullinan, T. R. Frattesi, and C. Wyant, “Computing Performance Benchmarks among CPU, GPU, and FPGA,” 2012. https://www.semanticscholar.org/paper/Computing-Performance-Benchmarks-among-CPU%2C-GPU%2C-Cullinan-Frattesi/cbecd8cfb5264f8b36dee412c5980e3305c996e6 (accessed Nov. 21, 2020).
  16. S. Mittal and J. S. Vetter, “A Survey of Methods for Analyzing and Improving GPU Energy Efficiency,” ACM Comput. Surv., vol. 47, no. 2, pp. 19:1–19:23, Aug. 2014, doi: 10.1145/2636342.
  17. N. Singh, “Automatic parallelization using OpenMP API,” in 2016 International Conference on Signal Processing, Communication, Power and Embedded System (SCOPES), Oct. 2016, pp. 291–294, doi: 10.1109/SCOPES.2016.7955837.
  18. Y. Qian, “Automatic Parallelization Tools,” p. 5, 2012.
  19. A. Athavale et al., “Automatic Sequential to Parallel Code Conversion,” p. 7.
  20. M. Mathews and J. P. Abraham, “Implementing Coarse Grained Task Parallelism Using OpenMP,” vol. 6, p. 4, 2015.
  21. “Message Passing Interface (MPI).” https://computing.llnl.gov/tutorials/mpi/ (accessed Nov. 27, 2020).
  22. A. Podobas and S. Karlsson, “Towards Unifying OpenMP Under the Task-Parallel Paradigm,” in OpenMP: Memory, Devices, and Tasks, Oct. 2016, pp. 116–129, doi: 10.1007/978-3-319-45550-1_9.
  23. J. D. Owens et al., “A Survey of General-Purpose Computation on Graphics Hardware,” Comput. Graph. Forum, vol. 26, no. 1, pp. 80–113, Mar. 2007, doi: 10.1111/j.1467-8659.2007.01012.x.
  24. M. U. Ashraf, F. Fouz, and F. Alboraei Eassa, “Empirical Analysis of HPC Using Different Programming Models,” Int. J. Mod. Educ. Comput. Sci., vol. 8, no. 6, pp. 27–34, Jun. 2016, doi: 10.5815/ijmecs.2016.06.04.
  25. M. Scarpino, “OpenCL in Action: How to Accelerate Graphics and Computations,” Dec. 2011, Accessed: Nov. 21, 2020. [Online]. Available: https://hgpu.org/?p=6708.
  26. N. Newsroom, “NVIDIA, Cray, PGI, CAPS Unveil ‘OpenACC’ Programming Standard for Parallel Computing,” NVIDIA Newsroom Newsroom. http://nvidianews.nvidia.com/news/nvidia-cray-pgi-caps-unveil-openacc-programming-standard-for-parallel-computing (accessed Nov. 25, 2020).
  27. “Getting Started with OpenACC,” NVIDIA Developer Blog, Jul. 14, 2015. https://developer.nvidia.com/blog/getting-started-openacc/ (accessed Nov. 25, 2020).
  28. “OpenACC Tutorial - Adding directives - CC Doc.” https://docs.computecanada.ca/wiki/OpenACC_Tutorial_-_Adding_directives (accessed Nov. 21, 2020).
  29. “OpenACC: More Science Less Programming,” NVIDIA Developer, Jan. 13, 2012. https://developer.nvidia.com/openacc (accessed Nov. 21, 2020).
  30. S. Wienke, P. Springer, C. Terboven, and D. an Mey, “OpenACC — First Experiences with Real-World Applications,” in Euro-Par 2012 Parallel Processing, Aug. 2012, pp. 859–870, doi: 10.1007/978-3-642-32820-6_85.
  31. “OpenACC Directives,” NVIDIA Developer, Mar. 02, 2016. https://developer.nvidia.com/openacc/overview (accessed Mar. 26, 2021).
  32. A. G. Bhat, M. N. Babu, and A. M. R, “Towards automatic parallelization of ‘for’ loops,” in 2015 IEEE International Advance Computing Conference (IACC), Jun. 2015, pp. 136–142, doi: 10.1109/IADCC.2015.7154686.
  33. B. Li, “Manual and Automatic Translation From Sequential to Parallel Programming On Cloud Systems,” Comput. Sci. Diss., Apr. 2018, [Online]. Available: https://scholarworks.gsu.edu/cs_diss/135.
  34. A. Alghamdi and F. Eassa, “Parallel Hybrid Testing Tool for Applications Developed by Using MPI + OpenACC Dual-Programming Model,” vol. 4, pp. 203–210, Mar. 2019, doi: 10.25046/aj040227.
  35. E. Calore, A. Gabbana, J. Kraus, S. F. Schifano, and R. Tripiccione, “Performance and portability of accelerated lattice Boltzmann applications with OpenACC,” Concurr. Comput. Pract. Exp., vol. 28, no. 12, pp. 3485–3502, Aug. 2016, doi: 10.1002/cpe.3862.
  36. J. A. Herdman, W. P. Gaudin, O. Perks, D. A. Beckingsale, A. C. Mallinson, and S. A. Jarvis, “Achieving Portability and Performance through OpenACC,” in 2014 First Workshop on Accelerator Programming using Directives, Nov. 2014, pp. 19–26, doi: 10.1109/WACCPD.2014.10.
  37. J. Kraus, M. Schlottke, A. Adinetz, and D. Pleiter, “Accelerating a C++ CFD Code with OpenACC,” in 2014 First Workshop on Accelerator Programming using Directives, Nov. 2014, pp. 47–54, doi: 10.1109/WACCPD.2014.11.
  38. “DawnCC: a Source-to-Source Automatic Parallelizer of C and C++ Programs,” Semantic Scholar. https://www.semanticscholar.org/paper/DawnCC%3A-a-Source-to-Source-Automatic-Parallelizer-C-Guimar%C3%A3es-Mendonca/ac4ee9490909aa0161e8278596e85ecd6ece4148 (accessed Dec. 01, 2020).
  39. M. U. Ashraf, F. A. Eassa, and A. A. Albeshri, “Massive Parallel Computational Model for Heterogeneous Exascale Computing System,” in 2017 9th IEEE-GCC Conference and Exhibition (GCCCE), May 2017, pp. 1–6, doi: 10.1109/IEEEGCC.2017.8448062.
  40. “org.antlr.v4.runtime Class Hierarchy (ANTLR 4 Runtime 4.9 API).” https://www.antlr.org/api/Java/org/antlr/v4/runtime/package-tree.html (accessed Dec. 01, 2020).
  41. “Antlr 4 - Listener vs Visitor.” https://jakubdziworski.github.io/java/2016/04/01/antlr_visitor_vs_listener.html (accessed Dec. 01, 2020).
  42. M. Viñas, B. B. Fraguela, D. Andrade, and R. Doallo, “Towards a High Level Approach for the Programming of Heterogeneous Clusters,” in 2016 45th International Conference on Parallel Processing Workshops (ICPPW), Aug. 2016, pp. 106–114, doi: 10.1109/ICPPW.2016.30.
  43. R. Xu, X. Tian, S. Chandrasekaran, Y. Yan, and B. Chapman, OpenACC Parallelization and Optimization of NAS Parallel Benchmarks. 2014.
  44. A. Paudel and S. Puri, “OpenACC Based GPU Parallelization of Plane Sweep Algorithm for Geometric Intersection,” in Accelerator Programming Using Directives, Nov. 2018, pp. 114–135, doi: 10.1007/978-3-030-12274-4_6.
  45. “OpenACC Programming and Best Practices Guide,” p. 64.
  46. “HPC SDK | NVIDIA,” NVIDIA Developer, Mar. 11, 2020. https://developer.nvidia.com/hpc-sdk (accessed Dec. 13, 2020).
  47. J. Zapletal, “Amdahl’s and Gustafson’s laws,” p. 28.
Index Terms

Computer Science
Information Sciences

Keywords

OpenACC, GPU, ANTLR, Automatic Translation