Logo PTI
Polish Information Processing Society
Logo FedCSIS

Annals of Computer Science and Information Systems, Volume 21

Proceedings of the 2020 Federated Conference on Computer Science and Information Systems

Efficient Computation of RNA Partition Functions Using McCaskill’s Algorithm

,

DOI: http://dx.doi.org/10.15439/2020F85

Citation: Proceedings of the 2020 Federated Conference on Computer Science and Information Systems, M. Ganzha, L. Maciaszek, M. Paprzycki (eds). ACSIS, Vol. 21, pages 449452 ()

Full text

Abstract. We develop efficient single- and multi-core algorithms to compute partition functions for RNA sequences. Our algorithms, which are based on McCaskill's algorithm, are benchmarked against state-of-the-art fast algorithms obtained using the parallelizing source-to-source compilers PLUTO and TRACO. On our Intel I9 computational platform, our best single core algorithm takes up to 81.2\% less time than the single core algorithm resulting from PLUTO, which is faster than that obtained from TRACO. Our best multi-core algorithm takes up to 84.7\% less time than the multi-core algorithm obtained using TRACO when run with 20 threads (our I9 has 10 cores and supports hyperthreading); the TRACO multi-core algorithm is faster than the PLUTO one.

References

  1. M. S. Waterman and T. F. Smith, “RNA secondary structure: A complete mathematical analysis,” Mathematical Biosciences, vol. 42, no. 3-4, pp. 257–266, 1978.
  2. R. Nussinov, G. Pieczenik, J. R. Griggs, and D. J. Kleitman, “Algorithms for loop matchings,” SIAM Journal on Applied mathematics, vol. 35, no. 1, pp. 68–82, 1978.
  3. M. Zuker and P. Stiegler, “Optimal computer folding of large RNA sequences using thermodynamics and auxiliary information,” Nucleic acids research, vol. 9, no. 1, pp. 133–148, 1981.
  4. M. Zuker and D. Sankoff, “RNA secondary structures and their prediction,” Bulletin of mathematical biology, vol. 46, no. 4, pp. 591–621, 1984.
  5. M. Zuker, “Computer prediction of RNA structure,” in Methods in enzymology. Elsevier, 1989, vol. 180, pp. 262–288.
  6. M. Zuker, “On finding all suboptimal foldings of an RNA molecule,” Science, vol. 244, no. 4900, pp. 48–52, 1989.
  7. J. S. McCaskill, “The equilibrium partition function and base pair binding probabilities for RNA secondary structure,” Biopolymers: Original Research on Biomolecules, vol. 29, no. 6-7, pp. 1105–1119, 1990.
  8. S. Will, K. Reiche, I. L. Hofacker, P. F. Stadler, and R. Backofen, “Inferring noncoding RNA families and classes by means of genome-scale structure-based clustering,” PLoS computational biology, vol. 3, no. 4, 2007.
  9. I. L. Hofacker, S. H. Bernhart, and P. F. Stadler, “Alignment of RNA base pairing probability matrices,” Bioinformatics, vol. 20, no. 14, pp. 2222–2227, 2004.
  10. Z. J. Lu, J. W. Gloor, and D. H. Mathews, “Improved RNA secondary structure prediction by maximizing expected pair accuracy,” Rna, vol. 15, no. 10, pp. 1805–1813, 2009.
  11. M. Palkowski and W. Bielecki, “Parallel tiled cache and energy efficient codes for o (n4) RNA folding algorithms,” Journal of Parallel and Distributed Computing, vol. 137, pp. 252–258, 2020.
  12. J. Li, S. Ranka, and S. Sahni, “Multicore and GPU algorithms for Nussinov RNA folding,” BMC bioinformatics, vol. 15, no. 8, p. S1, 2014.
  13. A. Mathuriya, D. A. Bader, C. E. Heitsch, and S. C. Harvey, “GTfold: a scalable multicore code for RNA secondary structure prediction,” in Proceedings of the 2009 ACM symposium on Applied Computing, 2009, pp. 981–988.
  14. M. S. Swenson, J. Anderson, A. Ash, P. Gaurav, Z. Sükösd, D. A. Bader, S. C. Harvey, and C. E. Heitsch, “GTfold: Enabling parallel RNA secondary structure prediction on multi-core desktops,” BMC research notes, vol. 5, no. 1, p. 341, 2012.
  15. G. Tan, N. Sun, and G. R. Gao, “A parallel dynamic programming algorithm on a multi-core architecture,” in Proceedings of the nineteenth annual ACM symposium on Parallel algorithms and architectures, 2007, pp. 135–144.
  16. T. Estrada, A. Licon, and M. Taufer, “CompPknots: a framework for parallel prediction and comparison of RNA secondary structures with pseudoknots,” in International Symposium on Parallel and Distributed Processing and Applications. Springer, 2006, pp. 677–686.
  17. F. Xia, Y. Dou, X. Zhou, X. Yang, J. Xu, and Y. Zhang, “Fine-grained parallel RNAalifold algorithm for RNA secondary structure prediction on FPGA,” BMC bioinformatics, vol. 10, no. S1, p. S37, 2009.
  18. A. Jacob, J. Buhler, and R. D. Chamberlain, “Accelerating Nussinov RNA secondary structure prediction with systolic arrays on FPGAs,” in 2008 International Conference on Application-Specific Systems, Architectures and Processors. IEEE, 2008, pp. 191–196.
  19. Y. Dou, F. Xia, and J. Jiang, “Fine-grained parallel application specific computing for RNA secondary structure prediction using SCFGS on FPGA,” in Proceedings of the 2009 international conference on Compilers, architecture, and synthesis for embedded systems, 2009, pp. 107–116.
  20. G. Rizk, D. Lavenier, and S. Rajopadhye, “GPU accelerated RNA folding algorithm,” in GPU Computing Gems Emerald Edition. Elsevier, 2011, pp. 199–210.
  21. D.-J. Chang, C. Kimmer, and M. Ouyang, “Accelerating the Nussinov RNA folding algorithm with CUDA/GPU,” in The 10th IEEE International Symposium on Signal Processing and Information Technology. IEEE, 2010, pp. 120–125.
  22. M. Fekete, I. L. Hofacker, and P. F. Stadler, “Prediction of RNA base pairing probabilities on massively parallel computers,” Journal of Computational Biology, vol. 7, no. 1-2, pp. 171–182, 2000.
  23. M. Palkowski and W. Bielecki, “Parallel cache-efficient code for computing the mccaskill partition functions,” in 2019 Federated Conference on Computer Science and Information Systems (FedCSIS). IEEE, 2019, pp. 207–210.
  24. U. Bondhugula, A. Hartono, J. Ramanujam, and P. Sadayappan, “A practical and fully automatic polyhedral program optimization system,” in ACM SIGPLAN PLDI, vol. 10, no. 1375581.1375595, 2008.
  25. U. Bondhugula, M. Baskaran, S. Krishnamoorthy, J. Ramanujam, A. Rountev, and P. Sadayappan, “Automatic transformations for communication-minimized parallelization and locality optimization in the polyhedral model,” in International Conference on Compiler Construction. Springer, 2008, pp. 132–146.
  26. M. Palkowski and W. Bielecki, “TRACO: source-to-source parallelizing compiler,” Computing and Informatics, vol. 35, no. 6, pp. 1277–1306, 2017.
  27. C. Zhao and S. Sahni, “Cache and energy efficient algorithms for Nussinov’s RNA folding,” BMC bioinformatics, vol. 18, no. 15, p. 518, 2017.
  28. C. Zhao and S. Sahni, “Efficient RNA folding using Zuker’s method,” in 2017 IEEE 7th International Conference on Computational Advances in Bio and Medical Sciences (ICCABS). IEEE, 2017, pp. 1–6.
  29. S. Sahni, “Data structures, algorithms, and applications in c++, second edition.” Silicon Press, 2005.
  30. “Ncbi database,” http://www.ncbi.nlm.nih.gov/gquery.