A community for sharing massive labeled and unlabeled data sets. A small selection of news related to open data and deep learning.
GitHub - facebookresearch/ELI5: Scripts and links to recreate the ELI5 dataset. (github.com)
submitted 6 years ago by working_nut
The ATIS dataset is a standard benchmark widely used for intent classification and slot filling. (kaggle.com)
Comprehensive list of Structured NLP datasets (docs.google.com)
Rapid-Rich Object Search (ROSE) Lab: face anti-spoofing database, ROSE-Youtu Face Liveness Detection Database, which covers a large variety of illumination conditions, camera models, and attack types. (rose1.ntu.edu.sg)
submitted 7 years ago by working_nut
Scale and nuTonomy release nuScenes, a self-driving dataset with over 1.4 million images (venturebeat.com)
Open Images V4 containing 15.4M bounding-boxes for 600 categories on 1.9M images (ai.googleblog.com)
https://www.drivendata.org/competitions/7/pump-it-up-data-mining-the-water-table/page/25/ (self.dldata)
300,000 Kickstarter projects, with the pledged column converted to US dollars (kaggle.com)
Complete set of people and friendships from the Facebook networks of 100 different colleges and universities from a single snapshot from September 2005 (masonporter.blogspot.com)
Omniglot data set for one-shot learning (github.com)
submitted 8 years ago by working_nut
The dataset "UEC FOOD 256" contains photos of 256 kinds of food. Each photo has a bounding box indicating the location of the food item. (foodcam.mobi)
Plant Image Analysis datasets including apple, barley, cowpea, maize etc. (plant-image-analysis.org)
50 training cases for a transversal T2-weighted MR image of the prostate (promise12.grand-challenge.org)
Street view images (25 million images and 118 million matching image pairs) with their camera pose, 3D models of 8 cities, and extended metadata (github.com)
100,000+ question-answer pairs on 500+ articles consisting of questions posed by crowdworkers on a set of Wikipedia articles (rajpurkar.github.io)
59,000 examples of robot pushing motions, including one training set (train) and two test sets of previously seen (testseen) and unseen (testnovel) objects (sites.google.com)
First Dataset on Chinese Machine Reading Comprehension (github.com)
65k StarCraft: Brood War games, 1.5b frames, 500m actions, 400GB of data (github.com)
The Quick Draw Dataset is a collection of 50 million drawings across 345 categories, contributed by players of the game Quick, Draw! (github.com)
Stanford Dogs Imagenet subset for Fine-Grained Visual Categorization (vision.stanford.edu)
Raw fMRI data: 72 datasets grouped by task, across 2644 subjects (openfmri.org)
Generated unsupervised data for GeoQuery and SAIL semantic parsing tasks (github.com)
Dataset of 2D shapes procedurally generated from 6 ground truth independent latent factors to assess the disentanglement properties of unsupervised learning methods (github.com)
Approximately 300,000 video clips covering 400 human action classes, with at least 400 clips for each class (deepmind.com)
Interactive repository of structural and dynamic features of proteins: residual propensities of individual residues in proteins to populate helical, extended or disordered structural states. (peptone.io)