Datasets

Datasets publicly and freely available for training pattern-recognition algorithms.

Images

ImageNet - more than 14 million images, hand-annotated

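CIFAR-10 - 60,000 small (32x32) color images in 10 classes (50,000 training, 10,000 test), collected by Alex Krizhevsky, Vinod Nair, and Geoffrey Hinton at the University of Toronto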

Chest X-Rays - more than 100,000 anonymized chest X-ray images released by the NIH Clinical Center, https://www.nih.gov/news-events/news-releases/nih-clinical-center-provides-one-largest-publicly-available-chest-x-ray-datasets-scientific-community

ShapeNet - 3D models of real-world objects, https://www.shapenet.org/model-querier
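
As an illustration of what working with one of these image sets involves, here is a minimal sketch of reading the "binary version" of CIFAR-10. It assumes the published record layout: one label byte followed by 3072 pixel bytes, stored as 1024 red, then 1024 green, then 1024 blue values in row-major order. The function name parse_cifar10_records is hypothetical, chosen for this example only.

```python
# Assumed layout of the CIFAR-10 binary version: 1 label byte + 3 * 32 * 32 pixel bytes.
PIXELS_PER_CHANNEL = 32 * 32
RECORD_BYTES = 1 + 3 * PIXELS_PER_CHANNEL


def parse_cifar10_records(raw: bytes):
    """Split the raw bytes of a CIFAR-10 .bin batch into (label, pixels) pairs.

    pixels is a list of 1024 (r, g, b) tuples in row-major order.
    """
    if len(raw) % RECORD_BYTES != 0:
        raise ValueError("length is not a whole number of 3073-byte records")

    records = []
    for offset in range(0, len(raw), RECORD_BYTES):
        label = raw[offset]
        body = raw[offset + 1:offset + RECORD_BYTES]
        # The three color planes are stored one after another, not interleaved.
        red = body[:PIXELS_PER_CHANNEL]
        green = body[PIXELS_PER_CHANNEL:2 * PIXELS_PER_CHANNEL]
        blue = body[2 * PIXELS_PER_CHANNEL:]
        records.append((label, list(zip(red, green, blue))))
    return records
```

To try it, read a batch file with open(path, "rb").read() and pass the bytes in; the path depends on where the archive was unpacked.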

Handwriting

MNIST - handwritten digits, 0-9. 60,000 training images and 10,000 test images.
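
MNIST is distributed in the IDX binary format, so a small parser is usually the first thing needed. The sketch below assumes the standard layout: big-endian 32-bit header fields (magic number 0x803 for image files, 0x801 for label files) followed by unsigned bytes. The function names are hypothetical, and the input is expected to be already decompressed.

```python
import struct

IMAGE_MAGIC = 0x00000803  # IDX: unsigned byte data, 3 dimensions
LABEL_MAGIC = 0x00000801  # IDX: unsigned byte data, 1 dimension


def parse_idx_images(raw: bytes):
    """Return a list of images; each image is a list of rows of pixel values (0-255)."""
    magic, count, rows, cols = struct.unpack(">IIII", raw[:16])
    if magic != IMAGE_MAGIC:
        raise ValueError(f"not an IDX image file (magic={magic:#x})")
    size = rows * cols
    if len(raw) < 16 + count * size:
        raise ValueError("file is shorter than its header claims")

    images = []
    for i in range(count):
        start = 16 + i * size
        pixels = raw[start:start + size]
        images.append([list(pixels[r * cols:(r + 1) * cols]) for r in range(rows)])
    return images


def parse_idx_labels(raw: bytes):
    """Return the labels (0-9) as a list of ints."""
    magic, count = struct.unpack(">II", raw[:8])
    if magic != LABEL_MAGIC:
        raise ValueError(f"not an IDX label file (magic={magic:#x})")
    if len(raw) < 8 + count:
        raise ValueError("file is shorter than its header claims")
    return list(raw[8:8 + count])
```

The downloads are normally gzip-compressed, so something like gzip.open(path, "rb").read() would supply the bytes; the path is whatever the files were saved as.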

Language Syntax

WordNet - a lexical database for the English language, http://wordnet.princeton.edu/

Language Audio

AISHELL-1 - Mandarin speech corpus, 400 speakers

NSA

We know the NSA collects every email, phone call, and text from every U.S. citizen. And we know this is paid for by U.S. taxpayers, and is therefore owned by U.S. citizens.

Lists

Kaggle - hosts thousands of downloadable datasets, https://www.kaggle.com/datasets

datasets.txt · Last modified: 2021/01/28 05:46 by 127.0.0.1

Except where otherwise noted, content on this wiki is licensed under the following license: Public Domain