Does anyone else think the evaluation of meta-learning approaches for few-shot classification is not very reasonable?

Meta-learning for few-shot classification (N-way-K-shot) usually uses the same number of query examples for both training and testing. For example, in a 5-way-1-shot classification task on the miniImageNet dataset, each training episode has 1 example per class in the support set and 15 examples per class in the query set. The testing phase uses the same setup. But to be realistic, shouldn't we use more query examples for evaluation? Of course, I know the results would not look as good as the current ones. Moreover, the choice of ways and shots during the training phase does not seem rigorous either.
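To make the setup concrete, here is a minimal sketch of an episode sampler where the number of query examples per class is just a parameter, so training and evaluation episodes can use different values. The function name `sample_episode`, its arguments, and the toy label list are my own assumptions for illustration, not taken from any particular meta-learning codebase.

```python
import random
from collections import defaultdict


def sample_episode(labels, n_way=5, k_shot=1, n_query=15, rng=None):
    """Sample one N-way-K-shot episode and return (support, query) index lists.

    Hypothetical helper: `labels` is a list of class labels, one per example.
    """
    rng = rng or random.Random()

    # Group example indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # Only classes with enough examples for support + query are usable.
    eligible = [c for c, idxs in by_class.items() if len(idxs) >= k_shot + n_query]
    if len(eligible) < n_way:
        raise ValueError("not enough classes with k_shot + n_query examples")

    support, query = [], []
    for c in rng.sample(eligible, n_way):
        # Draw without replacement so support and query never overlap.
        picked = rng.sample(by_class[c], k_shot + n_query)
        support.extend(picked[:k_shot])
        query.extend(picked[k_shot:])
    return support, query


# Toy data (an assumption, not the real dataset): 20 classes, 600 examples each.
labels = [c for c in range(20) for _ in range(600)]
rng = random.Random(0)

# Training-style episode: 5-way-1-shot with 15 queries per class.
train_support, train_query = sample_episode(labels, 5, 1, 15, rng)

# Evaluation-style episode with more queries per class (100 is an arbitrary choice).
test_support, test_query = sample_episode(labels, 5, 1, 100, rng)
```

With a sampler like this, changing the evaluation protocol is a one-argument change, which makes it easy to check how much the reported accuracy depends on the query set size.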


