Datasets publicly and freely available for training pattern-recognition algorithms.
ImageNet - more than 14 million images, hand-annotated
CIFAR-10 - 60,000 small (32x32) color images in 10 classes, collected by Alex Krizhevsky, Vinod Nair, and Geoffrey Hinton at the University of Toronto
ShapeNet - 3D models of real-world objects, https://www.shapenet.org/model-querier
MNIST - handwritten digits, 0-9. 60,000 training images and 10,000 test images.
WordNet - a lexical database for the English language, http://wordnet.princeton.edu/
AISHELL-1 - Mandarin speech corpus, 400 speakers
We know the NSA collects every email, phone call, and text from every U.S. citizen. And we know this is paid for by U.S. taxpayers, and is therefore owned by U.S. citizens.
Kaggle - hosts thousands of downloadable datasets, https://www.kaggle.com/datasets
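
Most of these datasets need a little decoding before they can be fed to an algorithm. As one illustration, the MNIST image files are distributed in the IDX binary format: a 16-byte big-endian header (magic number 2051, image count, rows, columns) followed by one unsigned byte per pixel. The sketch below is a minimal reader for that layout; the function name parse_idx_images is made up for this example, and the sample bytes are synthetic, not real MNIST data.

```python
import struct


def parse_idx_images(data: bytes):
    """Parse IDX3 image bytes (the MNIST image layout) into nested lists.

    Hypothetical helper for illustration; returns a list of images,
    each image a list of rows, each row a list of 0-255 pixel values.
    """
    # Header: four big-endian unsigned 32-bit integers.
    magic, count, rows, cols = struct.unpack(">IIII", data[:16])
    if magic != 2051:
        raise ValueError(f"unexpected magic number {magic}, expected 2051")

    size = rows * cols
    if len(data) < 16 + count * size:
        raise ValueError("data is shorter than the header claims")

    images = []
    for i in range(count):
        start = 16 + i * size
        pixels = data[start:start + size]
        # Split the flat pixel run into rows of `cols` values each.
        images.append([list(pixels[r * cols:(r + 1) * cols]) for r in range(rows)])
    return images


# Synthetic stand-in for a real file: two 2x2 images.
sample = struct.pack(">IIII", 2051, 2, 2, 2) + bytes([0, 255, 128, 64, 1, 2, 3, 4])
images = parse_idx_images(sample)
```

For a real download, the same function could be given the decompressed contents of the training image file, for example gzip.open(path, "rb").read(), where path points at wherever the file was saved locally.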