Follow the full discussion on Reddit.
Happy to share that CleanRL now has a new algorithm called Robust Policy Optimization (RPO): a 5-line code change to PPO that gets better performance in 57 out of 61 continuous-action environments (e.g., dm_control) 🚀
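For a rough idea of what a change that small could look like: as far as I understand RPO, the policy update perturbs the mean of the Gaussian action distribution with uniform noise before scoring the stored actions, while rollouts sample exactly as in plain PPO. The sketch below is a dependency-free illustration under that assumption, not CleanRL's actual code. The names `action_and_log_prob` and `gaussian_log_prob` are hypothetical, and the default `rpo_alpha=0.5` is an assumed value.

```python
import math
import random


def gaussian_log_prob(action, mean, std):
    """Log-density of a diagonal Gaussian, summed over action dimensions."""
    return sum(
        -0.5 * ((a - m) / s) ** 2 - math.log(s) - 0.5 * math.log(2 * math.pi)
        for a, m, s in zip(action, mean, std)
    )


def action_and_log_prob(mean, std, action=None, rpo_alpha=0.5, rng=random):
    """PPO-style Gaussian policy head with an RPO-style mean perturbation.

    Hypothetical helper for illustration only; rpo_alpha is an assumed default.
    """
    if action is None:
        # Rollout: sample from the unperturbed Gaussian, as in plain PPO.
        action = [rng.gauss(m, s) for m, s in zip(mean, std)]
    else:
        # Update: shift each mean by uniform noise in [-rpo_alpha, rpo_alpha]
        # before computing the log-probability of the stored action.
        mean = [m + rng.uniform(-rpo_alpha, rpo_alpha) for m in mean]
    return action, gaussian_log_prob(action, mean, std)
```

Setting `rpo_alpha=0.0` recovers the plain PPO log-probability, which makes it easy to compare the two variants side by side.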