A great aspect of deep learning (and machine learning in general) is that there are many well-established datasets and tasks against which researchers measure the performance of their approaches. This post is an attempt to gather some of them in one place. Of course, the state of the art (SOTA) on any task changes, sometimes quite quickly, so this post is bound to become obsolete soon. The numbers below are current as of April 2017. Please comment below if some information is missing and/or outdated.

Machine Vision

| Task | Dataset | Best result | Publication |
|------|---------|-------------|-------------|
| Classification | MNIST | Top-1 accuracy of 99.79% | Regularization of Neural Networks using DropConnect |
| Classification | CIFAR-10 | Top-1 accuracy of 97.14% | Shake-Shake regularization of 3-branch residual networks |
| Classification | CIFAR-100 | Top-1 accuracy of 81.7% | Wide Residual Networks |
| Classification | SVHN | Top-1 accuracy of 98.46% | Wide Residual Networks |
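To make the metric concrete: top-1 accuracy is simply the fraction of examples for which the model's single highest-scoring class matches the true label. Here is a minimal sketch; the `top1_accuracy` function and the toy logits are illustrative, not taken from any of the papers above.

```python
import numpy as np

def top1_accuracy(logits, labels):
    """Fraction of examples whose highest-scoring class equals the label."""
    predictions = np.argmax(logits, axis=1)
    return float(np.mean(predictions == labels))

# Toy example: 4 examples, 3 classes.
logits = np.array([[2.0, 0.1, 0.3],
                   [0.2, 1.5, 0.1],
                   [0.4, 0.3, 0.2],   # wrong: argmax is class 0, label is 2
                   [0.1, 0.2, 2.2]])
labels = np.array([0, 1, 2, 2])
print(top1_accuracy(logits, labels))  # 0.75
```

On benchmarks like ImageNet a top-5 variant is also common (the label only needs to appear among the five highest-scoring classes), but the datasets in this table are conventionally reported as top-1.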

Natural Language Processing

| Task | Dataset | Best result | Publication |
|------|---------|-------------|-------------|
| Language modeling | One Billion Word Benchmark | Single-model perplexity of 24.29 | Factorization tricks for LSTM networks |
| Machine translation | WMT newstest 2014 En->Fr | BLEU score of 40.56 | Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer |
| Machine translation | WMT newstest 2014 En->De | BLEU score of 26.03 | Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer |
| Speech recognition | NIST 2000 Switchboard task | Word error rate of 6.2% | The Microsoft 2016 Conversational Speech Recognition System |
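For reference, two of the metrics in this table fit in a few lines of code. Perplexity is the exponential of the average per-word negative log-probability the model assigns to the test text, and word error rate is the word-level edit distance between the recognizer's output and the reference transcript, divided by the reference length. (BLEU is more involved, combining modified n-gram precisions with a brevity penalty, so it is omitted here.) The functions below are illustrative sketches under those definitions, not the evaluation scripts used in the cited papers.

```python
import math

def perplexity(token_log_probs):
    """exp of the mean negative log-probability over the test tokens."""
    return math.exp(-sum(token_log_probs) / len(token_log_probs))

def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by reference length."""
    r, h = reference.split(), hypothesis.split()
    d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i  # deleting i reference words
    for j in range(len(h) + 1):
        d[0][j] = j  # inserting j hypothesis words
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            substitution = d[i - 1][j - 1] + (r[i - 1] != h[j - 1])
            d[i][j] = min(substitution, d[i - 1][j] + 1, d[i][j - 1] + 1)
    return d[len(r)][len(h)] / len(r)

# A model that always spreads probability uniformly over 4 words
# has perplexity 4 -- "as confused as choosing among 4 options".
print(perplexity([math.log(0.25)] * 8))                     # 4.0
# One inserted word against a 3-word reference: WER = 1/3.
print(word_error_rate("the cat sat", "the cat sat down"))   # 0.333...
```

Note that a lower perplexity and a lower word error rate are better, whereas higher BLEU and accuracy are better, so the "best result" column mixes both directions.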