A Hybrid Approach to Service Recommendation Based on Network Representation Learning



A new dataset named Sanitation is released to evaluate the performance of HAR (human activity recognition) algorithms and to benefit researchers in this field; it collects seven types of daily work activity data from sanitation workers. We provide two .csv files: the raw dataset, “sanitation.csv”, and a pre-processed feature dataset suitable for machine-learning-based human activity recognition methods.
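As a minimal sketch of getting started with the raw file, the snippet below loads a CSV with pandas. The column names (`acc_x`, `acc_y`, `acc_z`, `activity`) and the in-memory sample are hypothetical stand-ins; consult the released “sanitation.csv” for the actual schema.

```python
# Hedged sketch: loading the Sanitation raw CSV with pandas.
# Column names below are assumptions, not the released schema.
import io
import pandas as pd

# Stand-in for the raw file, with assumed accelerometer columns
# and an activity label (the dataset has 7 activity types).
sample_csv = io.StringIO(
    "acc_x,acc_y,acc_z,activity\n"
    "0.12,-0.98,0.05,sweeping\n"
    "0.10,-0.95,0.07,sweeping\n"
    "0.45,0.12,-0.30,walking\n"
)

df = pd.read_csv(sample_csv)  # in practice: pd.read_csv("sanitation.csv")
print(df["activity"].value_counts().to_dict())
```

The per-class counts printed at the end are a quick sanity check that the labels parsed as expected.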


Includes sentiment-specific distributed word representations trained on 10M Arabic tweets that were distantly supervised using positive and negative keywords. As described in the paper [1], we follow Tang’s [2] three neural architectures, which encode the sentiment of a word in addition to its semantic and syntactic representation.


Specifications Table

Subject area: Natural Language Processing


The dataset contains measurements taken from four air handling units (AHU) installed in a medium-to-large-sized academic building. The building is a 7-story, 9000 sqm facility commissioned in 2016 hosting the PRECIS research center. It contains multiple research laboratories, multifunction spaces, meeting rooms, and a large auditorium, as well as administrative offices. It is located at 44°26′06.0″N, 26°02′44.0″E in a temperate continental climate with hot summers and cold winters. Cooling is handled using on-site electric chillers, while heating is provided from a district heating network.


The presented dataset has been used as a basis for CAO - a system for analysis of emoticons in Japanese online communication, developed by Ptaszynski et al. (2010). Emoticons are strings of symbols widely used in text-based online communication to convey user emotions. The database contains: 1) a predetermined raw emoticon database containing over ten thousand emoticon samples extracted from the Web, 2) emoticon parts automatically divided from raw emoticons into semantic areas representing “mouths” or “eyes”.


Our goal is to determine whether a convolutional neural network (CNN) performs better than existing blind algorithms for image denoising and, if so, whether the noise statistics have an effect on the performance gap. We performed automatic identification of the noise distribution over a set of nine possible distributions, namely Gaussian, log-normal, uniform, exponential, Poisson, salt-and-pepper, Rayleigh, speckle, and Erlang. Next, for each of these noisy image sets, we compared the performance of FFDNet, a CNN-based denoising method, with Noise Clinic, a blind denoising algorithm.
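To make the nine noise families concrete, the sketch below synthesizes most of them with NumPy on a flat test image. The parameter values (standard deviations, rates, corruption fractions) are illustrative assumptions, not the settings used in the study.

```python
# Hedged sketch: generating several of the nine noise types with NumPy.
# All parameters are illustrative, not the study's actual settings.
import numpy as np

rng = np.random.default_rng(0)
img = np.full((64, 64), 0.5)          # clean grayscale image in [0, 1]

gaussian    = img + rng.normal(0.0, 0.1, img.shape)
log_normal  = img + rng.lognormal(mean=-3.0, sigma=0.5, size=img.shape)
uniform     = img + rng.uniform(-0.1, 0.1, img.shape)
exponential = img + rng.exponential(scale=0.05, size=img.shape)
poisson     = rng.poisson(img * 255) / 255.0            # signal-dependent
rayleigh    = img + rng.rayleigh(scale=0.05, size=img.shape)
speckle     = img * (1 + rng.normal(0.0, 0.1, img.shape))  # multiplicative
erlang      = img + rng.gamma(shape=2, scale=0.05, size=img.shape)  # Erlang = gamma with integer shape

# Salt-and-pepper: flip a random fraction of pixels to 0 or 1.
sp = img.copy()
mask = rng.random(img.shape)
sp[mask < 0.025] = 0.0                # pepper
sp[mask > 0.975] = 1.0                # salt
```

Note that Poisson and speckle noise are signal-dependent (they scale with the pixel value), whereas the additive variants do not; this difference is exactly the kind of noise statistic whose effect on the CNN-vs-blind gap is being tested.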


We introduced the task of acoustic question answering (AQA) in https://arxiv.org/abs/1811.10561.

This dataset aims to promote research in the acoustic reasoning area.

It comprises acoustic scenes and multiple questions/answers for each of them.

Each question is accompanied by a functional program that describes the reasoning steps needed to answer it.
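To illustrate the question/program pairing, here is a hypothetical sketch of one question entry. The JSON field names (`scene_index`, `question`, `answer`, `program`) and the function names in the program are assumptions for illustration; the actual files under /questions may use a different schema.

```python
# Hypothetical question entry; field and function names are assumed.
import json

entry = json.loads("""
{
  "scene_index": 0,
  "question": "How many sounds are played by a cello?",
  "answer": 2,
  "program": [
    {"function": "scene",             "inputs": []},
    {"function": "filter_instrument", "inputs": ["cello"]},
    {"function": "count",             "inputs": []}
  ]
}
""")

# The program reads as a pipeline: take the scene, keep cello
# sounds, then count what remains.
print([step["function"] for step in entry["program"]])
```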


The dataset is separated into 3 sets: training, validation, and test.


File Structure

    • /audio : Audio recordings of the scenes
      • /test : Test set recordings
      • /train : Training set recordings
      • /val : Validation set recordings
    • /questions : Questions with their corresponding answers (3 JSON files, one for each set)
    • /scenes : Scene definitions (3 JSON files, one for each set)
    • /arguments : A copy of all the arguments used as input at generation time (for reproducibility)
    • /logs : Logs of the generation scripts



Each scene is an assembly of 10 elementary sounds. The scenes are persisted as JSON blobs containing the following attributes:

    • scene_index : Numerical identifier of the scene
    • objects : List of elementary sounds contained in the scene (See Elementary Sounds section)
    • relationships : Define the relationships between all the objects of the scene
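As a hedged sketch, the snippet below parses one scene blob with the three attributes listed above. The inner shape of the `objects` entries and of `relationships` (here, per-object index lists for "before"/"after") is an assumption for illustration, not the documented schema.

```python
# Hedged sketch: parsing a scene JSON blob.
# The structure of "objects" and "relationships" is assumed.
import json

scene = json.loads("""
{
  "scene_index": 0,
  "objects": [
    {"id": 12, "instrument": "cello", "note": "A",  "octave": 3},
    {"id": 40, "instrument": "flute", "note": "C#", "octave": 5}
  ],
  "relationships": {
    "before": [[], [0]],
    "after":  [[1], []]
  }
}
""")

# e.g. which objects come after object 0 in this illustrative scene
print(scene["relationships"]["after"][0])
```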

Elementary Sounds

Elementary sounds are recordings of instruments playing a single note. The elementary sound bank contains 56 unique recordings separated across 5 instrument families. Each of them has the following attributes:

    • Brightness : {Bright, Dark, Null}
    • Duration : length of the sound (in ms)
    • Filename : Filename of the audio recording
    • Note : Musical note on the chromatic scale { A, A#, B, C, C#, D, D#, E, F, F#, G, G# }
    • ID : Numerical identifier of the sound
    • Instrument : Name of the instrument playing the sound { Cello, Clarinet, Flute, Trumpet, Violin }
    • Loudness : {Loud, Quiet}
    • Octave : Octave of the sound
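The attribute list above can be sketched as plain records that are easy to filter. The two sample entries and their filenames below are made up for illustration; the real bank holds 56 recordings.

```python
# Illustrative sketch: filtering an elementary-sound bank by the
# attributes listed above. Sample entries are fabricated.
sounds = [
    {"id": 0, "instrument": "Cello",   "note": "A",  "octave": 3,
     "loudness": "Loud",  "brightness": "Dark",   "duration": 750,
     "filename": "cello_A3.wav"},
    {"id": 1, "instrument": "Trumpet", "note": "C#", "octave": 5,
     "loudness": "Quiet", "brightness": "Bright", "duration": 500,
     "filename": "trumpet_Cs5.wav"},
]

# Select loud string-family sounds (Cello or Violin).
loud_strings = [s["filename"] for s in sounds
                if s["loudness"] == "Loud"
                and s["instrument"] in {"Cello", "Violin"}]
print(loud_strings)
```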



Date fruit data sets are not publicly available; previous studies have collected and used their own data sets, almost all with only a few hundred images per class. As our motive was robust date fruit classification, we did not use a camera to take images of a particular size, angle, or background. Instead, to add robustness, we built our date fruit database using the Google search engine. Hence the images have multiple backgrounds, noise, different lighting conditions, other objects, different packaging, and sometimes even partial covering.


The data set contains four classes, namely ajwa, mabroom, sagai, and sukkary.


Our Signing in the Wild dataset consists of videos harvested from YouTube containing people signing in various sign languages, in diverse settings and environments, under complex signer and camera motion, and even signing in groups. This dataset is intended to be used for sign language detection.