We have long known that the characterization of protein three-dimensional structure is key to obtaining a detailed understanding of protein function. Computational approaches to protein structure characterization have largely addressed a narrow formulation of the problem, where the goal is the determination of one structure, also known as the native structure, from a given protein amino-acid sequence. However, many researchers over the years have argued for broadening our view of proteins to account for the multiplicity of native structures.

Instructions: 

The .zip file contains 3 folders when unzipped. We provide the details of each folder below.

 

“monomorphic_benchmark_targets” folder: Contains 20 protein targets organized into 20 subfolders. Data for each protein is provided in a subfolder named with its PDB ID. Each such subfolder contains the following 4 files:

  1. A .fasta file containing the amino-acid sequence of the protein.

  2. A .pdb file containing the native tertiary conformation coordinates. The .pdb file format is documented at http://www.wwpdb.org/documentation/file-format

  3. A .frag3 file containing the fragments of length 3 for the protein sequence generated from http://old.robetta.org/

  4. A .frag9 file containing the fragments of length 9 for the protein sequence generated from http://old.robetta.org/
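As a rough illustration of how one target subfolder might be read, the Python sketch below groups the files in a subfolder by extension and reads the amino-acid sequence from the .fasta file. The function names are hypothetical, and the sketch assumes only what is stated above (one file per extension in each monomorphic subfolder); the actual file names inside the archive are not assumed.

```python
from pathlib import Path


def index_target_folder(target_dir):
    """Group the files of one target subfolder by lowercase extension.

    Returns a dict such as {".fasta": [Path(...)], ".pdb": [Path(...)], ...}.
    """
    files = {}
    for path in sorted(Path(target_dir).iterdir()):
        if path.is_file():
            files.setdefault(path.suffix.lower(), []).append(path)
    return files


def read_fasta_sequence(fasta_path):
    """Return the sequence of a single-record FASTA file as one string.

    Header lines (starting with '>') and blank lines are skipped.
    Assumption: the file holds exactly one sequence record.
    """
    residues = []
    with open(fasta_path) as handle:
        for line in handle:
            line = line.strip()
            if not line or line.startswith(">"):
                continue
            residues.append(line)
    return "".join(residues)
```

A caller would typically pass a path such as monomorphic_benchmark_targets/<pdb id> to index_target_folder and then hand the single ".fasta" entry to read_fasta_sequence.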

 

“monomorphic_casp_targets” folder: Contains 10 protein targets organized into 10 subfolders. Data for each protein is provided in a subfolder named with its CASP ID. Each such subfolder contains the following 4 files:

  1. A .fasta file containing the amino-acid sequence of the protein.

  2. A .pdb file containing the native tertiary conformation coordinates.

  3. A .frag3 file containing the fragments of length 3 for the protein sequence generated from http://old.robetta.org/

  4. A .frag9 file containing the fragments of length 9 for the protein sequence generated from http://old.robetta.org/

 

“metamorphic_benchmark_targets” folder: Contains 18 pairs of protein targets organized into 18 subfolders. Data for each target pair is provided in a subfolder named with its pair ID (as indicated in the paper). Each such subfolder contains the following 5 files:

  1. A .fasta file containing the amino-acid sequence common to the pair of target proteins.

  2. A .pdb file containing the native tertiary conformation coordinates for the first target in the target pair.

  3. A .pdb file containing the native tertiary conformation coordinates for the second target in the target pair.

  4. A .frag3 file containing the fragments of length 3 for the protein sequence generated from http://old.robetta.org/

  5. A .frag9 file containing the fragments of length 9 for the protein sequence generated from http://old.robetta.org/
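For the metamorphic pairs, a reader will usually want both native structures side by side. The Python sketch below locates the two .pdb files in a pair subfolder and extracts the alpha-carbon (CA) coordinates from each, using the fixed-column ATOM record layout of the PDB format. The function names are hypothetical. Which file counts as the "first" or "second" target is not specified here, so the sketch simply assumes sorted file-name order. It also stops at the first ENDMDL record and does not filter alternate locations or chains.

```python
from pathlib import Path


def find_pair_structures(pair_dir):
    """Return the two .pdb paths of one pair subfolder, in sorted name order.

    Assumption: sorted order stands in for first/second target; check the
    paper's pair listing if the order matters.
    """
    pdbs = sorted(Path(pair_dir).glob("*.pdb"))
    if len(pdbs) != 2:
        raise ValueError(f"expected 2 .pdb files in {pair_dir}, found {len(pdbs)}")
    return pdbs[0], pdbs[1]


def read_ca_coordinates(pdb_path):
    """Return a list of (x, y, z) tuples for CA atoms in ATOM records.

    PDB columns (1-based): atom name 13-16, x 31-38, y 39-46, z 47-54.
    Only the first model is read (parsing stops at ENDMDL).
    """
    coords = []
    with open(pdb_path) as handle:
        for line in handle:
            if line.startswith("ENDMDL"):
                break
            if line.startswith("ATOM") and line[12:16].strip() == "CA":
                x = float(line[30:38])
                y = float(line[38:46])
                z = float(line[46:54])
                coords.append((x, y, z))
    return coords
```

Because both targets in a pair share one sequence, the two coordinate lists can be compared residue by residue only if both structures resolve the same residues; that should be checked before computing any distance measure.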
