Comments
There's unfortunately not much to read here yet...
Follow the full discussion on Reddit.
A recent paper published by Microsoft researchers proposes a new vision-language pretrained model for image-text joint embedding, ImageBERT, which achieves SOTA performance on both the MSCOCO (image retrieval task) and Flickr30k (text retrieval) datasets.
There's unfortunately not much to read here yet...
Ever having issues keeping up with everything that's going on in Machine Learning? That's where we help. We're sending out a weekly digest, highlighting the Best of Machine Learning.
Discover the best guides, books, papers and news in Machine Learning, once per week.