big data

PT7 Web is an annotated Portuguese-language corpus built from samples collected between September 2018 and March 2020 from seven Portuguese-speaking countries: Angola, Brazil, Portugal, Cape Verde, Guinea-Bissau, Macao, and Mozambique. The records were filtered from Common Crawl, a public-domain, petabyte-scale dataset of web pages in many languages, mixed together in temporal snapshots of the web that are made available monthly [1]. The Brazilian pages were labeled as the positive class and the others as the negative class (non-Brazilian Portuguese).

We obtained 6 million instances for analyzing and modelling CO2 behavior. The data-logging and sensor nodes acquire one sample every second.

The 2020 Data Fusion Contest, organized by the Image Analysis and Data Fusion Technical Committee (IADF TC) of the IEEE Geoscience and Remote Sensing Society (GRSS) and the Technical University of Munich, aims to promote research in large-scale land cover mapping based on weakly supervised learning from globally available multimodal satellite data. The task is to train a machine learning model for global land cover mapping based on weakly annotated samples.

Last Updated On: 
Mon, 01/25/2021 - 09:03

This dataset contains 18 months of raw PV data gathered at time intervals of about 200 µs (5 kHz sampling). A post-processed, day-by-day downsampled version covering 365 days, converted to 10 ms intervals (100 Hz sampling), is also included. The end result is two databases: 1. The original raw data, including both fast (short circuit, 200 µs) and slow (sweep, 2.5-3.9 s) information for 18 months. These data contain intervals of missing points, but are provided to allow potential users to reproduce any new work. 2.
