DNA sequence alignment datasets based on NW algorithm
This study presented six datasets for DNA/RNA sequence alignment for one of the most common alignment algorithms, namely, the Needleman–Wunsch (NW) algorithm. This research proposed a fast and parallel implementation of the NW algorithm by using machine learning techniques. This study is an extension and improved version of our previous work . The current implementation achieves 99.7% accuracy using a multilayer perceptron with ADAM optimizer and up to 2912 giga cell updates per second on two real DNA sequences with a of length 4.1 M nucleotides. Our implementation is valid for extremely long sequences by using the divide-and-conquer strategy.
these datasets are illustrated in a manuscript submitted to IEEE OPEN ACCESS entitled “Parallel Implementation of the Needleman–Wunsch Algorithm Using Machine Learning Algorithms”.