Standard Dataset
GDB-9-Ex_EOM-CCSD-SUBSET-100

- Submitted by: Kshitij Mehta
- Last updated: Tue, 03/11/2025 - 19:57
- DOI: 10.21227/zjk1-zp13
Abstract
This is a subset of the original GDB-9-Ex_EOM-CCSD dataset, available at https://doi.org/10.13139/OLCF/2318313. It consists of 100 molecules selected at random from the 80,593 molecules in the original dataset. The dataset contains data-intensive quantum chemical electronic structure calculations for organic molecules of the GDB-9-Ex dataset, performed with the ORCA software using the first-principles equation-of-motion coupled-cluster (EOM-CCSD) method. It provides calculated UV-vis spectra of the molecules at a high level of accuracy.
The optical spectra were calculated at molecular geometries optimized with the DFTB method using the 3ob parameters. All calculations used the def2-TZVP basis set together with the auxiliary def2/J and def2-TZVP/C basis sets. Specifically, the calculations combine the similarity-transformed EOM-CCSD (STEOM) approach with the domain-based local pair natural orbital (DLPNO) approximation, which together constitute the STEOM-DLPNO-CCSD method. This method has been found to predict transition energies of organic molecules accurately. For each molecule, the 50 lowest excited states were calculated.
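Because the dataset targets UV-vis spectra, users will often want excitation energies expressed as wavelengths. The sketch below shows the standard conversion λ = hc/E using the exact SI values of the constants. The function name `ev_to_nm` is an illustrative choice, and the example assumes the energies have already been obtained in electronvolts; it does not read them from the dataset files.

```python
# Convert an excitation energy in eV to a wavelength in nm via lambda = h*c/E.
# Constants are the exact SI-defined values; `ev_to_nm` is a hypothetical helper name.

PLANCK_J_S = 6.62607015e-34        # Planck constant, J*s
LIGHT_SPEED_M_S = 299_792_458.0    # speed of light, m/s
ELEMENTARY_CHARGE_C = 1.602176634e-19  # J per eV

# h*c expressed in eV*nm (about 1239.84 eV*nm)
HC_EV_NM = PLANCK_J_S * LIGHT_SPEED_M_S / ELEMENTARY_CHARGE_C * 1e9


def ev_to_nm(energy_ev: float) -> float:
    """Return the wavelength in nm that corresponds to an energy in eV."""
    if energy_ev <= 0:
        raise ValueError("excitation energy must be positive")
    return HC_EV_NM / energy_ev


if __name__ == "__main__":
    # Illustrative energies only, not values taken from the dataset.
    for energy in (3.0, 4.5, 6.0):
        print(f"{energy:.2f} eV -> {ev_to_nm(energy):.1f} nm")
```

Higher excitation energies map to shorter wavelengths, so the lowest excited states sit at the long-wavelength end of a spectrum.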
The dataset consists of 100 directories, one per molecule. Each directory contains two files:
- geo_end.xyz: ASCII file containing the optimized geometry of the molecule in Cartesian coordinates.
- orca.stdout: Standard ASCII output generated by the ORCA software, containing the results of the STEOM-DLPNO-CCSD calculations.
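As a rough illustration of how a geometry file might be read, the sketch below parses text in the common XYZ layout: an atom count on the first line, a comment line, then one atom per line. It assumes that geo_end.xyz follows this layout and that any columns after the three coordinates (such as a per-atom charge) can be ignored. The names `parse_xyz_text` and `read_xyz` are illustrative and are not part of the dataset.

```python
from pathlib import Path
from typing import List, Tuple

# (element symbol, x, y, z); coordinates are kept in the units of the file.
Atom = Tuple[str, float, float, float]


def parse_xyz_text(text: str) -> List[Atom]:
    """Parse XYZ-formatted text, keeping the symbol and first three numbers."""
    lines = text.splitlines()
    n_atoms = int(lines[0].split()[0])   # line 1: number of atoms
    atoms: List[Atom] = []
    # Line 2 is a free-form comment; atom records start on line 3.
    for line in lines[2:2 + n_atoms]:
        fields = line.split()
        symbol = fields[0]
        x, y, z = (float(value) for value in fields[1:4])
        atoms.append((symbol, x, y, z))   # extra columns are ignored
    if len(atoms) != n_atoms:
        raise ValueError(f"expected {n_atoms} atoms, found {len(atoms)}")
    return atoms


def read_xyz(path) -> List[Atom]:
    """Read and parse an XYZ file from disk."""
    return parse_xyz_text(Path(path).read_text())
```

A call such as `read_xyz("mol_num/geo_end.xyz")`, with `num` replaced by an actual molecule id, would return a list of atoms under these assumptions.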
Molecule directories are named ‘mol_num’, where num is the id of the molecule in the GDB-9-Ex dataset. The id has no practical significance, because the geometry of the molecule is provided in the accompanying geo_end.xyz file.
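The directory layout described above could be indexed along the following lines. This is a minimal sketch that assumes the `num` part of each directory name consists only of digits and that the archive has been extracted to a local folder. The function names `index_molecules` and `missing_files` are hypothetical.

```python
import re
from pathlib import Path
from typing import Dict, List

# Assumption: directory names look like "mol_<digits>".
MOL_DIR_PATTERN = re.compile(r"^mol_(\d+)$")
EXPECTED_FILES = ("geo_end.xyz", "orca.stdout")


def index_molecules(root) -> Dict[int, Path]:
    """Map each molecule id to its directory under the extracted dataset root."""
    index: Dict[int, Path] = {}
    for entry in sorted(Path(root).iterdir()):
        match = MOL_DIR_PATTERN.match(entry.name)
        if entry.is_dir() and match:
            index[int(match.group(1))] = entry
    return index


def missing_files(mol_dir) -> List[str]:
    """List the expected files that are absent from a molecule directory."""
    return [name for name in EXPECTED_FILES
            if not (Path(mol_dir) / name).is_file()]
```

After extraction, `len(index_molecules(root))` should equal 100 if the naming assumption holds, and `missing_files` offers a quick completeness check for each directory.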
Total dataset size: 101 MiB