Standard Dataset

DNA sequence alignment datasets based on NW algorithm

Citation Author(s):
Amr Rashed (Lecturer with the Computer Engineering Department, Faculty of Computers and Information Technology, Taif University, Saudi Arabia)
Submitted by:
Amr Rashed
DOI:
10.21227/45dr-8p86

Abstract

This study presents six datasets for DNA/RNA sequence alignment based on one of the most common alignment algorithms, the Needleman–Wunsch (NW) algorithm. The research proposes a fast, parallel implementation of the NW algorithm using machine learning techniques, and it extends and improves on our previous work. The current implementation achieves 99.7% accuracy using a multilayer perceptron with the Adam optimizer and up to 2912 giga cell updates per second (GCUPS) on two real DNA sequences with a length of 4.1 M nucleotides. The implementation remains valid for extremely long sequences by using a divide-and-conquer strategy.
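For readers unfamiliar with the algorithm behind these datasets, the following is a minimal sketch of the classic sequential NW scoring recurrence, together with the standard definition of the GCUPS throughput metric. It is not the machine-learning implementation described in the abstract. The function names `nw_score` and `gcups` and the scoring values (match = 1, mismatch = -1, gap = -1) are assumptions chosen for illustration, and may differ from those used to build the datasets.

```python
def nw_score(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -1) -> int:
    """Optimal global alignment score of a and b (textbook NW recurrence).

    The scoring values are illustrative assumptions, not taken from the datasets.
    """
    # prev holds row i-1 of the score matrix; row 0 aligns b against gaps only.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        # Column 0 aligns the first i characters of a against gaps only.
        curr = [i * gap] + [0] * len(b)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # gap inserted in b
            left = curr[j - 1] + gap  # gap inserted in a
            curr[j] = max(diag, up, left)
        prev = curr
    return prev[-1]


def gcups(len_a: int, len_b: int, seconds: float) -> float:
    """Giga cell updates per second: matrix cells filled / (time * 1e9)."""
    return (len_a * len_b) / (seconds * 1e9)


if __name__ == "__main__":
    print(nw_score("ACGT", "AGT"))
    print(gcups(2000, 500000, 1.0))
```

Keeping only two rows of the matrix gives the score in memory linear in the length of `b`; recovering the alignment itself would require either the full matrix or a divide-and-conquer traceback.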

Instructions:

These datasets are described in a manuscript entitled “Parallel Implementation of the Needleman–Wunsch Algorithm Using Machine Learning Algorithms”, submitted to IEEE OPEN ACCESS.