Novel context-aware network traffic classification based on a machine learning approach
The dataset was constructed by capturing the real-time background traffic of 9 applications, chosen to represent different types of background network behaviour. For a high level of network interaction, we considered video and voice calls on Skype and Google Hangouts. For a varied level of interaction, Facebook and Gmail were chosen: for Gmail, emails were received at random instances, and for Facebook, tagged posts were received as updates at random instances. NSS and NSC were chosen to represent applications with a lower degree of interaction; these applications are mostly offline, and interaction occurred only while fetching advertisements. Finally, to represent applications with audio buffering capability, the XiiaLive internet radio application was considered, playing a randomly chosen station with a 128 kbps stream.
The dataset has been labelled according to the level of background interactivity of each application. All inputs from applications with a high and constant level of background interactivity are labelled high. Inputs from applications with a varied level of background interactivity are labelled varied. Inputs from applications with a low level of interactivity are labelled low, and the samples from the XiiaLive internet radio app are labelled with the output class buffer.
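The labelling scheme described above can be summarised as a simple lookup. The following is only an illustrative sketch: the application keys (for example "skype_video") and the helper name label_for are hypothetical and do not come from the dataset files, and splitting Skype and Google Hangouts into separate video and voice entries is an assumption made here to arrive at 9 applications. Only the four class names (high, varied, low, buffer) are taken from the description.

```python
# Illustrative mapping from application to background-interactivity class.
# The keys are hypothetical names chosen for this sketch; the class names
# (high, varied, low, buffer) are the ones used in the dataset description.
APP_TO_LABEL = {
    "skype_video": "high",
    "skype_voice": "high",
    "hangouts_video": "high",
    "hangouts_voice": "high",
    "facebook": "varied",
    "gmail": "varied",
    "nss": "low",
    "nsc": "low",
    "xiialive": "buffer",
}


def label_for(app: str) -> str:
    """Return the output class for a (hypothetical) application key."""
    try:
        return APP_TO_LABEL[app.lower()]
    except KeyError:
        raise ValueError(f"unknown application: {app!r}") from None


if __name__ == "__main__":
    for app in ("skype_video", "gmail", "nsc", "xiialive"):
        print(app, "->", label_for(app))
```

Keeping the mapping in one place makes it easy to check that every captured sample receives exactly one of the four output classes.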
The dataset containing the full set of six highly correlated features is named Dataset 1.
To convert the file into a Weka-readable format, add the following as the first line of the txt document: