This dataset presents the results obtained for the Ingestion and Reporting layers of a Big Data architecture for processing performance management (PM) files in a mobile network. Flume was used in the Ingestion layer: it collected PM files from a virtual machine that replicates PM files from a 5G network element (gNodeB) and transferred them to the Hadoop Distributed File System (HDFS) in XML format. Hive was used in the Reporting layer, where it queries both the raw data and a view from HDFS.



The current maturity of autonomous underwater vehicles (AUVs) has made their deployment practical and cost-effective, such that many scientific, industrial and military applications now include AUV operations. However, the logistical difficulties and high costs of operating at sea are still critical limiting factors in further technology development, the benchmarking of new techniques and the reproducibility of research results. To overcome this problem, we present a freely available dataset suitable for testing algorithms for control, navigation, sensor processing and other tasks.



This dataset is a set of eighteen directed networks that represent message exchanges among Twitter accounts during eighteen crisis events. The dataset comprises 645,339 anonymized unique user IDs and 1,396,709 edges, each labeled with one of Plutchik's basic emotions (anger, fear, sadness, disgust, joy, trust, anticipation, and surprise) or "neutral" (if a tweet conveys no emotion).
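
As a rough illustration of the structure described above, the following sketch stores one such network as a directed adjacency map whose edges carry an emotion label. The user IDs and edges are hypothetical toy values, and the `(source, target, emotion)` tuple layout is an assumption; the actual file format of the dataset is not described here.

```python
from collections import Counter, defaultdict

# Plutchik's eight basic emotions plus the "neutral" label.
LABELS = {
    "anger", "fear", "sadness", "disgust",
    "joy", "trust", "anticipation", "surprise", "neutral",
}

def build_network(edges):
    """Build a directed adjacency map: source -> list of (target, emotion)."""
    graph = defaultdict(list)
    for source, target, emotion in edges:
        if emotion not in LABELS:
            raise ValueError(f"unknown emotion label: {emotion!r}")
        graph[source].append((target, emotion))
    return dict(graph)

def emotion_counts(graph):
    """Count how many edges carry each emotion label."""
    return Counter(
        emotion for targets in graph.values() for _, emotion in targets
    )

# Hypothetical toy edges, not taken from the dataset.
toy_edges = [
    ("u1", "u2", "fear"),
    ("u1", "u3", "neutral"),
    ("u2", "u3", "fear"),
]
network = build_network(toy_edges)
counts = emotion_counts(network)
```

Keeping the label on the edge rather than on the node matches the description above, where each message exchange, not each account, is annotated.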


This paper concerns static output feedback stabilization of polytopic discrete LTI systems. Previous related studies were mainly based on LMI approaches, which are inherently conservative. In this paper, a novel design algorithm is presented that iteratively partitions a primary design space into subspaces. Then, by assessing the stabilizability status of each generated subspace, the algorithm determines the total stabilizable parts and removes the undesired parts of the design space.
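
The partition-and-assess idea can be sketched on a toy problem. The example below is not the paper's algorithm: it assumes a scalar closed loop x[t+1] = (a + k) x[t], with the uncertain parameter a lying between two hypothetical vertices and a scalar gain k as the design variable, and it swaps in a simple convexity and Lipschitz test in place of the paper's stabilizability assessment. All names and numbers are illustrative assumptions.

```python
def worst_case_radius(k, a_vertices):
    """Largest |a + k| over the polytope; attained at a vertex by convexity."""
    return max(abs(a + k) for a in a_vertices)

def classify(lo, hi, a_vertices):
    """Label the gain interval [lo, hi] as stable, unstable or undetermined."""
    # worst_case_radius is convex in k, so checking both endpoints suffices.
    if (worst_case_radius(lo, a_vertices) < 1
            and worst_case_radius(hi, a_vertices) < 1):
        return "stable"
    mid, half = (lo + hi) / 2, (hi - lo) / 2
    # worst_case_radius is 1-Lipschitz in k, so this bound rules out
    # every gain in the interval.
    if worst_case_radius(mid, a_vertices) - half >= 1:
        return "unstable"
    return "undetermined"

def partition(lo, hi, a_vertices, tol=1e-3):
    """Bisect the design space, keeping stable parts and dropping unstable ones."""
    stable, unresolved = [], []
    stack = [(lo, hi)]
    while stack:
        lo, hi = stack.pop()
        status = classify(lo, hi, a_vertices)
        if status == "stable":
            stable.append((lo, hi))
        elif status == "undetermined":
            if hi - lo > tol:
                mid = (lo + hi) / 2
                stack.extend([(lo, mid), (mid, hi)])
            else:
                unresolved.append((lo, hi))
        # "unstable" intervals are removed from the design space.
    return sorted(stable), sorted(unresolved)

# Hypothetical vertices of the uncertain parameter and primary design space.
stable_parts, unresolved_parts = partition(-4.0, 4.0, a_vertices=(0.5, 1.5))
stable_length = sum(hi - lo for lo, hi in stable_parts)
```

For these toy values the stabilizing gains form the open interval (-1.5, -0.5), so the stable parts returned by the sketch cover almost all of that interval, with only thin slivers at its two boundaries left unresolved at the chosen tolerance.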


EmoSurv is a dataset containing keystroke data along with emotion labels. Timing and frequency data are recorded while participants type free and fixed texts, before and after specific emotions are induced. These emotions are anger, happiness, calmness, sadness, and a neutral state.

First, data is collected while the participant is in a neutral state. Then, the participant watches an eliciting video. Once the emotion is induced, the participant types another fixed text and another free text.
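
To make the notion of keystroke timing data concrete, here is a minimal sketch that derives two features commonly used in keystroke dynamics: dwell time (how long a key is held) and flight time (the gap between releasing one key and pressing the next). The `KeyEvent` record and its field names are hypothetical, as are the timestamps; they are not the column names of EmoSurv.

```python
from dataclasses import dataclass

@dataclass
class KeyEvent:
    """One keystroke with press and release timestamps in milliseconds."""
    key: str
    press_ms: float
    release_ms: float

def dwell_times(events):
    """Time each key was held down."""
    return [e.release_ms - e.press_ms for e in events]

def flight_times(events):
    """Gap between releasing one key and pressing the next."""
    return [
        nxt.press_ms - cur.release_ms
        for cur, nxt in zip(events, events[1:])
    ]

# Hypothetical keystrokes for the text "hi".
sample = [KeyEvent("h", 0.0, 80.0), KeyEvent("i", 150.0, 220.0)]
dwell = dwell_times(sample)
flight = flight_times(sample)
```

Computing the same features for the texts typed before and after the eliciting video would give one way to compare the neutral and induced states.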


A high level of monitoring is necessary for the safety and product quality of the electrical fused magnesia furnace (EFMF). In this paper, a monitoring method based on a latent subspace is proposed for the EFMF to fully mine the effective information in the multi-source heterogeneous data of the process. By minimizing the distance between different types of data in the subspace, the corresponding projection matrix is obtained. The data is then projected into the obtained subspace to estimate whether a fault has occurred. In summary, the main contributions of this paper are threefold.


experimental data


An Optical Character Recognition (OCR) system is used to convert document images, either printed or handwritten, into their electronic counterparts. Dealing with handwritten text is much more challenging than dealing with printed text because of the erratic writing styles of individuals. The problem becomes more severe when the input image is a doctor's prescription. Before such an image is fed to the OCR engine, printed and handwritten text must be classified, because a doctor's prescription contains both and the two must be processed separately.


This dataset contains the experimental materials for "Use and Perceptions of Multi-Monitor Workstations".

There are two files:

  1. survey.txt: the survey questions
  2. survey-results.csv: the answers obtained from the 101 respondents to the survey



Most text-simplification systems require an indicator of the complexity of words. The prevalent approaches to word difficulty prediction are based on manual feature engineering. The use of deep learning-based models has been left largely unexplored because of their comparatively poor performance. We have explored the use of one such model in predicting the difficulty of words. We have treated the problem as a binary classification problem. We have also trained traditional machine learning models and evaluated their performance on the task.
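
As an illustration of what manual feature engineering for word difficulty can look like, the sketch below computes two hand-crafted features per word. The feature set is a hypothetical example, not the one used in this work: word length and the number of vowel groups, the latter being only a crude proxy for syllable count.

```python
import re

def word_features(word):
    """Return simple hand-crafted features for a single word."""
    lowered = word.lower()
    return {
        "length": len(lowered),
        # Runs of consecutive vowels, a rough stand-in for syllables.
        "vowel_groups": len(re.findall(r"[aeiouy]+", lowered)),
    }

easy = word_features("cat")
hard = word_features("simplification")
```

Feature dictionaries like these would then be passed to a binary classifier that labels each word as difficult or not.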