Recently, Professor Yang Jian, Professor Li Jun, and Associate Professor Chen Shuo from the School of Intelligence Science and Technology, Nanjing University, in collaboration with Nanjing University of Science and Technology, have proposed OSESSL (One-Shot Example-Guided Self-Supervised Learning Framework), the first framework of its kind. This work has been accepted for publication at ICLR 2026.
This framework addresses the lack of ground-truth class structure in self-supervised learning, which stems from the absence of human annotations. It builds a representation learning method with extremely low labeling cost (only a single labeled sample per class) and targets two key bottlenecks of existing clustering-based self-supervised methods: the difficulty of aligning representations with true semantics, and the limited effectiveness of the learned representations on downstream tasks.
The research team proposes a new self-supervised learning paradigm centered on one-shot exemplars, organizing the feature learning process into an efficient knowledge transfer pipeline that moves from extremely sparse supervision to massive unlabeled data. The overall framework employs a dynamic memory bank and a dual-branch network to construct a closed loop of semantic anchoring–feature alignment. Specifically, the model first uses a minimal number of labeled samples as anchors to explore the unlabeled data space. It then treats the mined semantic information as targets to guide the clustering direction of massive unlabeled views. Finally, a boundary smoothing strategy is applied to fill uncertain regions in the feature space, achieving a significant leap in representation capability at low cost.
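The first step of this pipeline, using one labeled exemplar per class as an anchor to explore the unlabeled feature space, can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `build_prototypes`, the neighbor count `k`, the mixing weight `alpha`, and the plain top-k cosine neighbor rule are all assumptions made for illustration.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length, guarding against zero norm."""
    norm = np.linalg.norm(x, axis=axis, keepdims=True)
    return x / np.maximum(norm, eps)

def build_prototypes(exemplars, unlabeled, k=5, alpha=0.5):
    """Hypothetical sketch: one prototype per class from one exemplar.

    exemplars: (C, D) features, one labeled sample per class.
    unlabeled: (N, D) features of unlabeled samples.
    k, alpha:  illustrative hyperparameters (assumptions, not from the source).
    Returns (C, D) unit-norm prototypes and the (C, k) neighbor indices.
    """
    e = l2_normalize(exemplars)
    u = l2_normalize(unlabeled)
    sim = e @ u.T                              # (C, N) cosine similarities
    idx = np.argsort(-sim, axis=1)[:, :k]      # k most similar unlabeled samples per class
    neighbor_mean = u[idx].mean(axis=1)        # (C, D) mean of the mined neighbors
    # Blend the labeled anchor with its neighbors so the prototype stays
    # tied to the exemplar while reflecting the local data distribution.
    protos = alpha * e + (1.0 - alpha) * neighbor_mean
    return l2_normalize(protos), idx
```

In this sketch the resulting prototypes would then serve as clustering targets for augmented views of the unlabeled data; how the actual framework selects "discriminative" neighbors is not specified in this article.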

Figure 1: Schematic diagram of the OSESSL framework
In terms of specific mechanism design, OSESSL serves as a bridge connecting supervised and self-supervised learning, incorporating three core innovations:
First, Exemplar-Guided Prototype Construction combines a single exemplar with discriminative unlabeled neighbors to ensure that the generated class centers are both anchored in ground-truth semantics and representative of the data distribution.
Second, Exemplar-Guided Prototype Learning is enhanced with a dispersion regularization strategy. While aligning cluster assignments across different augmented views, this strategy forces prototypes of different classes to remain mutually repulsive, effectively preventing multi-class feature collapse.
Third, Exemplar-Guided Interpolation Consistency extends the guiding force to the feature mixing space. By imposing constraints on mixed samples, it smooths the decision boundaries in ambiguous regions, thereby endowing the model with stronger generalization capability.
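To make the second and third mechanisms more concrete, the sketch below computes one possible dispersion penalty over prototypes and one possible interpolation consistency loss over mixed features. The exact loss forms used by OSESSL are not given in this article, so the log-mean-exp repulsion term, the mixup-style convex combination, the temperatures `tau`, and all function names are assumptions. The code evaluates the losses only; an actual training loop would use an autodiff framework.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Row-wise unit normalization with a guard against zero norm."""
    return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)

def softmax(z):
    """Numerically stable row-wise softmax."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def assign(features, prototypes, tau=0.1):
    """Soft cluster assignment of each feature to the class prototypes."""
    return softmax(l2_normalize(features) @ l2_normalize(prototypes).T / tau)

def dispersion_loss(prototypes, tau=0.5):
    """Assumed repulsion term: larger when prototypes of different
    classes have high cosine similarity, smaller when they spread apart."""
    p = l2_normalize(prototypes)
    sim = p @ p.T
    off_diag = sim[~np.eye(len(p), dtype=bool)]   # similarities between distinct classes
    return float(np.log(np.mean(np.exp(off_diag / tau))))

def interpolation_consistency_loss(f_a, f_b, prototypes, lam, tau=0.1):
    """Assumed mixup-style term: the assignment of a mixed feature should
    match the same convex mix of the two individual assignments."""
    target = lam * assign(f_a, prototypes, tau) + (1.0 - lam) * assign(f_b, prototypes, tau)
    pred = assign(lam * f_a + (1.0 - lam) * f_b, prototypes, tau)
    # Cross-entropy between the mixed target and the prediction on the mixed feature.
    return float(-(target * np.log(pred + 1e-12)).sum(axis=1).mean())
```

Under these assumptions, orthogonal prototypes give a dispersion loss of zero while nearly collinear prototypes give a positive value, which is the repulsive behavior described above.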

Figure 2: t-SNE visualization of features on the CIFAR-10 dataset
Experimental results demonstrate that OSESSL significantly improves linear and k-NN classification accuracy on multiple standard benchmarks, including CIFAR and ImageNet, while maintaining consistent advantages in dense prediction transfer tasks such as object detection on COCO. Compared with existing state-of-the-art self-supervised and semi-supervised methods, this framework achieves more robust representation quality and cross-scenario generalization under extremely low labeling costs, providing support for efficient visual foundation model pre-training.
