[#1 ImageNet-1K SSL (without extra data)] [#1 ImageNet-1K Clustering (without extra data)]
[Code] [Paper] [Models] [Codebase Demo Video] [Model Training Demo Video] [BibTeX]
MIM-Refiner refines the representation of pre-trained Masked Image Models (MIM) by attaching Instance Discrimination (ID) heads to multiple intermediate layers. This setup is then trained for a few epochs with our Nearest Neighbor Alignment (NNA) objective.
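To give a rough feel for the idea, here is a minimal, dependency-free sketch of a generic nearest-neighbor contrastive loss (in the style of NNCLR), applied once per head. It is not the NNA objective from the paper or the code in this repository. The function names, the queue handling, the temperature value, and summing the per-head losses are all assumptions made for illustration.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def nearest_neighbor(query, queue):
    """Return the queue entry with the highest cosine similarity to query."""
    q = l2_normalize(query)
    return max(queue, key=lambda k: dot(q, l2_normalize(k)))


def nn_contrastive_loss(anchors, positives, queue, temperature=0.2):
    """InfoNCE-style loss where each positive is swapped for its nearest
    neighbor from a queue of past embeddings (hypothetical helper)."""
    anchors = [l2_normalize(a) for a in anchors]
    # Swap every second-view embedding for its nearest neighbor in the queue.
    targets = [l2_normalize(nearest_neighbor(p, queue)) for p in positives]
    losses = []
    for i, a in enumerate(anchors):
        logits = [dot(a, t) / temperature for t in targets]
        # Numerically stable log-sum-exp over all targets in the batch.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
        # Cross-entropy with the matching target (index i) as the label.
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)


def multi_head_loss(per_head_views, per_head_queues, temperature=0.2):
    """Sum the loss over heads attached to different intermediate layers.
    Summation is an assumption; each head gets its own queue here."""
    return sum(
        nn_contrastive_loss(anchors, positives, queue, temperature)
        for (anchors, positives), queue in zip(per_head_views, per_head_queues)
    )
```

In this sketch, `anchors` and `positives` would be the head outputs for two augmented views of the same batch, and `queue` would hold embeddings from earlier batches. A real implementation would operate on batched tensors on the GPU rather than Python lists.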
MIM-Refiner drastically advances the state of the art in ImageNet-1K linear probing, improving on the previous best result by +2.5%. For comparison, the state of the art improved by +2.6% over the last 4 years.
MIM-Refiner efficiently combines the advantages of MIM and ID models and surpasses previous state-of-the-art methods while being easy to scale up to extremely large models.