ChatGPT Survey: Performance on NLP datasets

Follow the full discussion on Reddit.
I've done a survey of how well ChatGPT performs on various NLP tasks as reported in arXiv papers. I have found 19 papers where they compared ChatGPT with fine-tuned models, but they are being published practically daily now. It seems that for the most of the classical NLP tasks, ChatGPT is not actually that strong and smaller fine-tuned models are often much better. According to the API page, GPT-4 is not expected to be much stronger on tasks like these. I think it is an interesting perspective that shows that for many of the tasks we need to solve, GPT models are actually not the right tool.

Comments

There's unfortunately not much to read here yet...

Discover the Best of Machine Learning.

Ever having issues keeping up with everything that's going on in Machine Learning? That's where we help. We're sending out a weekly digest, highlighting the Best of Machine Learning.

Join over 900 Machine Learning Engineers receiving our weekly digest.

Best of Machine LearningBest of Machine Learning

Discover the best guides, books, papers and news in Machine Learning, once per week.

Twitter