Disclaimer
DARPA is releasing these files in the public domain to stimulate further research. Their release implies no obligation or desire to support additional work in this space. The data is released as-is. DARPA makes no warranties as to the correctness, accuracy, or usefulness of the released data. In fact, since the data was produced by research prototypes, it is practically guaranteed to be imperfect. Nonetheless, as this data represents a very large repository of semantically rich and structured data, DARPA believes that it is in the best interests of the Department of Defense and the research community to make them freely available.
Distribution Statement A: Approved for public release. Distribution is unlimited.
OpTC Overview
Operationally Transparent Cyber (OpTC) was a DARPA transition pilot activity funded under Boston Fusion Corp.'s (BFC) Cyber APT Scenarios for Enterprise Systems (CASES) project. The main goal of the pilot was to determine whether technology developed under the DARPA Transparent Computing (TC) program could scale up to one thousand clients while maintaining detection performance. Boston Fusion, along with two performers from the TC program (Five Directions and BAE), developed the OpTC prototype. Provatek joined the team to serve as test coordinator, conducting scaling and detection tests in 2019. This data set is a subset of the data collected during that activity.
OpTC was evaluated at the National Cyber Range (NCR), which provided a well-instrumented facility to measure the impact of the system on network and client machine bandwidth, disk, and memory usage. Client machines created using VMware were programmed to complete general tasks such as creating, editing, and deleting presentations and text documents; sending, receiving, and downloading attachments from emails; browsing various websites; and mimicking generic daily user activities.
Each client machine in the NCR evaluations was equipped with an Acuity Intelligent Agent (AIA) sensor developed by Five Directions. This sensor sends real-time, system-level data to servers equipped with Acuity Translator (AT) software, also developed by Five Directions. The Acuity Translator servers compile correlated events into aggregate messages and forward the contents to Rapid Infiltration and Prevention of Exfiltration (RIPE) translators developed by BAE. The messages then undergo additional refinement before being sent to the RIPE Data Analytics Engine, which generates a network topology graph that may be queried to identify advanced persistent threat (APT) activity.
The OpTC team collected the data in this release over three days, during which the number of clients varied from five hundred to one thousand. Working with five hundred clients tended to be more convenient in terms of the time it took to bring up the system and manage the instrumentation. During the three-day evaluation event, randomly chosen machines were attacked, compromised, and used to perform additional attacks on other network clients. All event data was recorded for post-event analysis, and ground truth data on attack insertions was documented.
The dataset consists of four main directories, each containing a single file per client. These files are sorted by event time and labeled based on data provided by the red team.
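As a starting point for exploring this layout, the following minimal Python sketch builds an index of the files found under each top-level directory of a local copy of the release. It is an illustration only: the function name index_client_files and the root path passed to it are hypothetical, and the sketch assumes nothing about directory names, file names, or file formats, none of which are specified here.

```python
from collections import defaultdict
from pathlib import Path


def index_client_files(root):
    """Map each top-level directory under `root` to a sorted list of its files.

    `root` is a hypothetical path to a local copy of the release. No
    assumption is made about directory names, file names, or file formats.
    """
    index = defaultdict(list)
    top_level_dirs = sorted(p for p in Path(root).iterdir() if p.is_dir())
    for directory in top_level_dirs:
        # Walk recursively in case files sit in nested subdirectories.
        for path in sorted(directory.rglob("*")):
            if path.is_file():
                index[directory.name].append(path)
    return dict(index)
```

Calling index_client_files on the release root and printing the length of each list gives a quick check of how many files each directory holds, which can be compared against the expected one file per client.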