Graduate Theses & Dissertations

An Investigation of the Impact of Big Data on Bioinformatics Software
As the generation of genetic data accelerates, Big Data has an increasing impact on the way bioinformatics software is used. Experiments are becoming larger and more complex than the software's designers originally envisioned. One way to deal with this problem is parallel computing. Using the program Structure as a case study, we investigate ways to counteract the challenges created by growing datasets. We propose an OpenMP and an OpenMP-MPI hybrid parallelization of the MCMC steps and analyse their performance in various scenarios. The results indicate that the parallelizations produce significant speedups over the serial version in all scenarios tested. Adapting the program to the parallel architecture allows the available hardware to be used more efficiently. This matters because it not only reduces the time required to perform existing analyses but also opens the door to new analyses that were previously impractical.

Author Keywords: Big Data, HPC, MCMC, parallelization, speedup, Structure
