The MNIST database (Modified National Institute of Standards and Technology database[1]) is a large database of handwritten digits that is commonly used for training image processing systems.[2][3] It is also widely used for training and testing in machine learning.[4][5] It was created by "re-mixing" samples from NIST's original datasets.[6] The creators considered NIST's original split poorly suited to machine learning experiments, because the training dataset was collected from American Census Bureau employees while the testing dataset was collected from American high school students.[7] In addition, the black-and-white images from NIST were normalized to fit into a 28x28 pixel bounding box and anti-aliased, which introduced grayscale levels.[7]
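The way anti-aliasing turns a two-level image into a grayscale one can be illustrated with a minimal sketch. It assumes simple box averaging, which is not necessarily the normalization algorithm the MNIST creators used. The function name `box_downsample` and the 4x4 input are hypothetical and chosen only for illustration.

```python
def box_downsample(image, factor):
    """Shrink a 2-D list of pixel values by averaging factor x factor blocks.

    Assumes the image height and width are multiples of `factor`.
    """
    rows, cols = len(image), len(image[0])
    out = []
    for r in range(0, rows, factor):
        out_row = []
        for c in range(0, cols, factor):
            block = [image[r + dr][c + dc]
                     for dr in range(factor) for dc in range(factor)]
            # Integer mean of the block; mixed blocks yield intermediate grays.
            out_row.append(sum(block) // len(block))
        out.append(out_row)
    return out


# Hypothetical 4x4 bilevel image: 0 = background, 255 = ink.
bilevel = [
    [0, 255, 255, 255],
    [0,   0, 255, 255],
    [0,   0,   0, 255],
    [0,   0,   0,   0],
]

small = box_downsample(bilevel, 2)
gray_levels = sorted({value for row in small for value in row})
print(small)        # [[63, 255], [0, 63]]
print(gray_levels)  # [0, 63, 255]
```

The input contains only the values 0 and 255, while the reduced image also contains 63, an intermediate gray produced where a block straddles the edge of the stroke.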
The MNIST database contains 60,000 training images and 10,000 testing images.[8] Half of the training set and half of the test set were taken from NIST's training dataset; the remaining halves were taken from NIST's testing dataset.[9] The original creators of the database keep a list of some of the methods tested on it.[7] In their original paper, they used a support-vector machine to obtain an error rate of 0.8%.[10]
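The figures in this paragraph can be checked with a few lines of arithmetic. The sketch below is illustrative only: the helper names `split_by_source` and `errors_for_rate` are hypothetical, and it assumes the 0.8% error rate was measured on the full 10,000-image test set.

```python
TRAIN_SIZE = 60_000
TEST_SIZE = 10_000


def split_by_source(total):
    """Split a set evenly between NIST's two original datasets."""
    half = total // 2
    return {"nist_training": half, "nist_testing": total - half}


def errors_for_rate(error_rate_percent, n_images):
    """Number of misclassified images implied by an error rate in percent."""
    return round(error_rate_percent / 100 * n_images)


train_sources = split_by_source(TRAIN_SIZE)
test_sources = split_by_source(TEST_SIZE)
svm_errors = errors_for_rate(0.8, TEST_SIZE)

print(train_sources)  # {'nist_training': 30000, 'nist_testing': 30000}
print(test_sources)   # {'nist_training': 5000, 'nist_testing': 5000}
print(svm_errors)     # 80
```

Under that assumption, an error rate of 0.8% corresponds to 80 misclassified test images out of 10,000.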
Extended MNIST (EMNIST) is a newer dataset developed and released by NIST to be the (final) successor to MNIST.[11][12] MNIST includes images of handwritten digits only. EMNIST includes all the images from NIST Special Database 19, a large database of handwritten uppercase and lowercase letters as well as digits.[13][14] The images in EMNIST were converted into the same 28x28 pixel format, by the same process, as the MNIST images. Accordingly, tools that work with the older, smaller MNIST dataset will likely work unmodified with EMNIST.
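The compatibility claim rests on the shared 28x28 image format. The following sketch shows a preprocessing step that depends only on that format; the function `flatten_image` and the two placeholder images are hypothetical, and pixel values are assumed to be 8-bit integers in the range 0 to 255.

```python
IMAGE_SIDE = 28


def flatten_image(image):
    """Turn one 28x28 image (list of rows) into 784 values scaled to [0, 1]."""
    if len(image) != IMAGE_SIDE or any(len(row) != IMAGE_SIDE for row in image):
        raise ValueError("expected a 28x28 image")
    return [pixel / 255 for row in image for pixel in row]


# Placeholder images standing in for an MNIST digit and an EMNIST letter.
digit_like = [[0] * IMAGE_SIDE for _ in range(IMAGE_SIDE)]
letter_like = [[255] * IMAGE_SIDE for _ in range(IMAGE_SIDE)]

features = [flatten_image(image) for image in (digit_like, letter_like)]
print(len(features[0]), len(features[1]))  # 784 784
```

Code of this kind never inspects the labels, so it is indifferent to whether an image shows a digit or a letter. A classifier built on top of it would still need an output that matches EMNIST's larger label set.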
[1] "THE MNIST DATABASE of handwritten digits". Yann LeCun, Courant Institute, NYU; Corinna Cortes, Google Labs, New York; Christopher J.C. Burges, Microsoft Research, Redmond.
[2] "Support vector machines speed pattern recognition - Vision Systems Design". Vision Systems Design. September 2004. Retrieved 17 August 2013.
[3] Gangaputra, Sachin. "Handwritten digit database". Retrieved 17 August 2013.
[4] Qiao, Yu (2007). "THE MNIST DATABASE of handwritten digits". Retrieved 18 August 2013.
[5] Platt, John C. (1999). "Using analytic QP and sparseness to speed training of support vector machines" (PDF). Advances in Neural Information Processing Systems: 557–563. Archived from the original (PDF) on 4 March 2016. Retrieved 18 August 2013.
[6] Grother, Patrick J. "NIST Special Database 19 - Handprinted Forms and Characters Database" (PDF). National Institute of Standards and Technology.
[7] LeCun, Yann; Cortes, Corinna; Burges, Christopher J.C. "The MNIST Handwritten Digit Database". Yann LeCun's Website yann.lecun.com. Retrieved 30 April 2020.
[8] Kussul, Ernst; Baidyk, Tatiana (2004). "Improved method of handwritten digit recognition tested on MNIST database". Image and Vision Computing. 22 (12): 971–981. doi:10.1016/j.imavis.2004.03.008.
[9] Zhang, Bin; Srihari, Sargur N. (2004). "Fast k-Nearest Neighbor Classification Using Cluster-Based Trees" (PDF). IEEE Transactions on Pattern Analysis and Machine Intelligence. 26 (4): 525–528. doi:10.1109/TPAMI.2004.1265868. PMID 15382657. S2CID 6883417. Retrieved 20 April 2020.
[10] LeCun, Yann; Bottou, Léon; Bengio, Yoshua; Haffner, Patrick (1998). "Gradient-Based Learning Applied to Document Recognition" (PDF). Proceedings of the IEEE. 86 (11): 2278–2324. doi:10.1109/5.726791. S2CID 14542261. Retrieved 18 August 2013.
[11] NIST (4 April 2017). "The EMNIST Dataset". NIST. Retrieved 11 April 2022.
[12] NIST (27 August 2010). "NIST Special Database 19". NIST. Retrieved 11 April 2022.
[13] Cohen, G.; Afshar, S.; Tapson, J.; van Schaik, A. (2017). "EMNIST: an extension of MNIST to handwritten letters". arXiv:1702.05373 [cs.CV].
[14] Cohen, G.; Afshar, S.; Tapson, J.; van Schaik, A. (2017). "EMNIST: an extension of MNIST to handwritten letters". arXiv:1702.05373v1 [cs.CV].