Training a CNN to 99% accuracy on MNIST in less than 1 second on a laptop.

I was bored waiting around for long training runs, so I wanted to find out how fast training could be made. I started with the easiest thing I could think of, MNIST, but even that took 3 minutes to train using standard code from PyTorch. After a lot of reading and trial and error, I was eventually able to train a small CNN on MNIST to 99% accuracy in just 1 epoch, which took 0.76 seconds on average. All code is available in Jupyter notebooks at https://github.com/tuomaso/train_mnist_fast, along with a brief discussion of the process. I haven't seen much work on training networks fast, so if anyone has recommendations on where to learn more, or suggestions for cutting training time down even further, please let me know!
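Since the 0.76 second figure is an average, here is a minimal sketch of how an average training time could be measured over repeated runs. It is not taken from the notebooks: `average_runtime` is a hypothetical helper, and `train_one_epoch` is a hypothetical stand-in that only sleeps, to be swapped for a real one-epoch training function.

```python
import statistics
import time


def average_runtime(fn, n_runs=5):
    """Call fn() n_runs times; return (mean, stdev) of wall-clock seconds."""
    durations = []
    for _ in range(n_runs):
        start = time.perf_counter()  # monotonic, high-resolution clock
        fn()
        durations.append(time.perf_counter() - start)
    # stdev needs at least two samples, so report 0.0 for a single run
    spread = statistics.stdev(durations) if n_runs > 1 else 0.0
    return statistics.mean(durations), spread


def train_one_epoch():
    # Hypothetical stand-in: replace with a real one-epoch training run.
    time.sleep(0.01)


if __name__ == "__main__":
    mean_s, std_s = average_runtime(train_one_epoch, n_runs=5)
    print(f"mean {mean_s:.3f}s +/- {std_s:.3f}s over 5 runs")
```

If the training runs on a GPU, kernels launch asynchronously, so the function being timed should wait for the device to finish (for example with `torch.cuda.synchronize()`) before returning. Otherwise the measured time can come out too short.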
