[D] Genetic Algorithm (GA) vs. Stochastic Gradient Descent (SGD)

Follow the full discussion on Reddit.
For functions that have narrow grooves toward the global minimum, a simple GA implementation can be as efficient as a naive SGD method. Geoffrey Hinton, in one of his videos (Lecture 3.4), mentioned that a GA randomly perturbs one weight at a time, making it very inefficient compared to backpropagation. Here we present a simple GA implementation that mutates all the weights simultaneously and can learn reasonably efficiently. To implement such a network, all you need to do is follow these simple steps:
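
The steps themselves are not included in this excerpt, so what follows is not the author's implementation. It is only a minimal sketch of the general idea, mutating every weight at once instead of one at a time. Everything in it is an assumption made for illustration: the Rosenbrock function stands in for a loss with a narrow groove toward the global minimum, and the names `rosenbrock`, `mutate`, `evolve`, `pop_size`, and `sigma` are hypothetical. It uses mutation and elitist selection only, with no crossover, so it is closer to a simple evolution strategy than to a full GA.

```python
import random


def rosenbrock(w):
    """Test loss with a narrow curved valley; its minimum is 0 at (1, 1)."""
    x, y = w
    return (1 - x) ** 2 + 100 * (y - x * x) ** 2


def mutate(w, sigma, rng):
    """Return a copy of w with Gaussian noise added to every weight at once."""
    return [wi + rng.gauss(0.0, sigma) for wi in w]


def evolve(loss, w0, generations=500, pop_size=50, sigma=0.1, seed=0):
    """Mutation-only search with elitism (assumed settings, not from the post)."""
    rng = random.Random(seed)
    best, best_loss = list(w0), loss(w0)
    for _ in range(generations):
        # Every child perturbs all weights of the current best simultaneously.
        children = [mutate(best, sigma, rng) for _ in range(pop_size)]
        candidate = min(children, key=loss)
        candidate_loss = loss(candidate)
        # Elitism: the parent survives unless a child is strictly better,
        # so the best loss never increases from one generation to the next.
        if candidate_loss < best_loss:
            best, best_loss = candidate, candidate_loss
    return best, best_loss
```

Calling `evolve(rosenbrock, [-1.0, 1.0])` returns the best weights found and their loss. Because of the elitism step, that loss can never be worse than the starting value. How quickly it improves depends heavily on `sigma` and `pop_size`.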

