Standard Dataset
Tara Oceans Sample Kinase Dataset
- Submitted by: Scott Harrison
- Last updated: Wed, 07/12/2023 - 12:30
- DOI: 10.21227/yrvx-xn64
Abstract
This study first gathered protein sequences of yeast kinases, along with their associated annotations and features. Data for these yeast kinases were obtained from the Yeast Kinome database (https://yeastkinome.org/, retrieved 04/08/2021). The protein sequences for five kinase genes from S. cerevisiae strain ATCC 204508 / S288c are listed here by gene name and corresponding NCBI accession number: CBK1 (QHB11221.1), KIN1 (KZV12362.1), KIN2 (CAA97659.1), CDC28 (CAA85119.1), and DFB20 (KZV07624.1). Protein sequences were inferred from metatranscriptomic data, based on features in MATOU and on the alignments MATOU provides through homology searching (https://tara-oceans.mio.osupytheas.fr/ocean-gene-atlas/, retrieved 04/08/2021). We collected homology hits by searching with the CBK1, CDC28, KIN1, KIN2, and DFB20 protein sequences from S. cerevisiae strain ATCC 204508 / S288c. For each homology search, the files abundance_matrix.csv and environmental_parameters.csv were generated from the website. The environmental_parameters.csv file has 55 columns of environmental data that describe some of the physical and chemical characteristics of each sampling environment.
See the comments in the provided R script (EDA_Yeast_Cell_Cycle_Kinases.R) for further details on this dataset.
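As a rough illustration of how the two files from one homology search might be combined, the sketch below joins an abundance table to an environmental table on a shared sample identifier. It is written in Python rather than R and is not taken from the provided script. The column names (sample_id, hit_1, temperature, depth), the one-row-per-sample layout, and the values are all placeholder assumptions; the real abundance_matrix.csv and environmental_parameters.csv may be laid out differently (for example, with samples as columns) and would need to be inspected first.

```python
import csv
import io


def read_table(handle, key_column):
    """Read a CSV into a dict of rows keyed by the value in key_column."""
    reader = csv.DictReader(handle)
    return {row[key_column]: row for row in reader}


def join_on_sample(abundance, environment):
    """Merge rows that share a sample id; samples missing from either table are skipped."""
    joined = {}
    for sample_id, abundance_row in abundance.items():
        env_row = environment.get(sample_id)
        if env_row is None:
            continue
        merged = dict(env_row)
        merged.update(abundance_row)
        joined[sample_id] = merged
    return joined


# In-memory stand-ins for the two files. Column names and values are
# hypothetical placeholders, not taken from the dataset.
abundance_csv = io.StringIO(
    "sample_id,hit_1,hit_2\n"
    "S1,0.5,0.0\n"
    "S2,0.1,0.3\n"
)
environment_csv = io.StringIO(
    "sample_id,temperature,depth\n"
    "S1,18.2,5\n"
    "S3,4.0,700\n"
)

abundance = read_table(abundance_csv, "sample_id")
environment = read_table(environment_csv, "sample_id")
joined = join_on_sample(abundance, environment)

# Only samples present in both tables survive the join.
for sample_id, row in sorted(joined.items()):
    print(sample_id, row)
```

With the real files, the StringIO objects would be replaced by handles from open(path, newline=""), and the key column would be whichever column actually identifies the sample in both files.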
Dataset Files
- Tara Oceans Sample Kinase Dataset.zip (392.32 MB)
- EDA_Yeast_Cell_Cycle_Kinases.R (40.60 kB)