optiGAN training Datasets

- Submitted by: Carlotta Trigila
- DOI: 10.21227/3d2r-r491
Abstract
This dataset was used in a manuscript under review at the IOP journal Physics in Medicine & Biology.
Schematic of the training dataset. Accurate optical simulations were carried out at various emission positions to generate the dataset for training the conditional generative adversarial network optiGAN. (a) A total of 30 emission points located within 1/8 of the crystal volume were used to build the training dataset. Images are not to scale. (b) Optical photon distributions (positions X and Y, kinetic energy EKine, and Time) were recorded for all training points (30k entries per point) in a multidimensional matrix that included the emission position as a condition (Dataset1, shown in green). A second dataset (Dataset2) was also generated, which included only the Time distribution and the 3D class label.
Instructions:
The dataset was generated by running optical simulations with the new Python-based GATE (v10). To create the training dataset, simulations were conducted at 30 distinct emission positions within the crystal, each defined by specific X-Y-Z coordinates. At each position, electrons were emitted, generating at least 30k optical photons that reached the photodetector interface.
The characteristics of these photons, measured at the coupling-photodetector boundary, were saved in a GATE phase-space ROOT file. The data included the X and Y coordinates at the interface (with the Z coordinate fixed to match the photodetector location), the kinetic energy (EKine), and the time elapsed since the start of the event (Time). These distributions were stored as tabular data, with the 3D emission source position used as the class label to tag the distributions.
The tabular data for the 30 training points were concatenated to create a single training matrix. This formed a first multidimensional dataset (Dataset1) used to train our conditional GAN, comprising four distributions (X, Y, EKine, log(Time)) and a 3D class label defined by the optical photon emission position.
To evaluate the GAN's ability to accurately reproduce the detector's timing characteristics, an additional training dataset (Dataset2) was created, containing only the logarithm of the Time distribution and the 3D class label.
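The concatenation step described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the authors' pipeline: the function name `build_training_matrices`, the column order, the use of the natural logarithm, and the synthetic photon values are all hypothetical, and reading the actual ROOT files is left out.

```python
import numpy as np

def build_training_matrices(per_point, positions):
    """Concatenate per-position photon tables into Dataset1 and Dataset2.

    per_point: list of arrays of shape (n_i, 4), columns X, Y, EKine, Time
    positions: array of shape (n_points, 3), emission (X, Y, Z) of each point
    """
    d1_blocks, d2_blocks = [], []
    for table, pos in zip(per_point, positions):
        x, y, ekine, time = table.T
        # Assumption: natural log; requires strictly positive Time values.
        log_time = np.log(time)
        # Repeat the 3D emission position once per photon as the class label.
        label = np.tile(pos, (table.shape[0], 1))
        d1_blocks.append(np.column_stack([x, y, ekine, log_time, label]))
        d2_blocks.append(np.column_stack([log_time, label]))
    return np.vstack(d1_blocks), np.vstack(d2_blocks)

# Synthetic stand-in for the simulated photons (placeholder values only).
rng = np.random.default_rng(0)
n_points, n_photons = 30, 1000
positions = rng.uniform(0.0, 1.0, size=(n_points, 3))
per_point = [
    np.column_stack([
        rng.normal(size=n_photons),                        # X at the interface
        rng.normal(size=n_photons),                        # Y at the interface
        rng.uniform(2.0, 3.5, size=n_photons),             # EKine
        rng.exponential(scale=1.0, size=n_photons) + 1e-3, # Time, kept > 0
    ])
    for _ in range(n_points)
]

dataset1, dataset2 = build_training_matrices(per_point, positions)
print(dataset1.shape, dataset2.shape)
```

With these placeholder sizes, Dataset1 has seven columns (four distributions plus the 3D label) and Dataset2 has four (log(Time) plus the 3D label); with the real data each point would contribute about 30k rows.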