CLAMP-ViT: Contrastive Data-Free Learning for Adaptive Post-Training Quantization of ViTs

Akshat Ramachandran1, Souvik Kundu2, Tushar Krishna1
1Georgia Institute of Technology, 2Intel Labs
Correspondence: akshat.r@gatech.edu

News

Overview

We present CLAMP-ViT, a data-free post-training quantization method for vision transformers (ViTs). We identify the limitations of recent techniques, notably their inability to leverage meaningful inter-patch relationships, leading to the generation of simplistic and semantically vague data, impacting quantization accuracy. CLAMP-ViT employs a two-stage approach, cyclically adapting between data generation and model quantization. Specifically, we incorporate a patch-level contrastive learning scheme to generate richer, semantically meaningful data. Furthermore, we leverage contrastive learning in layer-wise evolutionary search for fixed- and mixed-precision quantization to identify optimal quantization parameters while mitigating the effects of a non-smooth loss landscape. Extensive evaluations across various vision tasks demonstrate the superiority of CLAMP-ViT, with performance improvements of up to 3% in top-1 accuracy for classification, 0.6 mAP for object detection, and 1.5 mIoU for segmentation at similar or better compression ratio over existing alternatives.
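To make the patch-level contrastive idea concrete, below is a minimal sketch of an InfoNCE-style objective over ViT patch tokens. The function name `patch_contrastive_loss`, the `temperature` and `k_pos` values, and the top-k positive selection are illustrative assumptions for exposition, not the paper's reference implementation.

```python
# Hypothetical sketch of a patch-level contrastive objective.
# For each anchor patch, its k_pos most similar patches in the same image act
# as positives and the remaining patches act as negatives.
import torch
import torch.nn.functional as F

def patch_contrastive_loss(patch_embeds: torch.Tensor,
                           temperature: float = 0.1,
                           k_pos: int = 4) -> torch.Tensor:
    """patch_embeds: (B, N, D) patch tokens from an intermediate ViT layer."""
    z = F.normalize(patch_embeds, dim=-1)                    # (B, N, D)
    sim = torch.matmul(z, z.transpose(1, 2)) / temperature   # (B, N, N)
    B, N, _ = sim.shape
    eye = torch.eye(N, dtype=torch.bool, device=sim.device)
    sim = sim.masked_fill(eye.unsqueeze(0), float('-inf'))   # drop self-similarity

    # Positives: the k_pos most similar patches per anchor.
    pos_idx = sim.topk(k_pos, dim=-1).indices                # (B, N, k_pos)
    log_prob = F.log_softmax(sim, dim=-1)                    # (B, N, N)
    pos_log_prob = log_prob.gather(-1, pos_idx)              # (B, N, k_pos)
    return -pos_log_prob.mean()
```

In a data-free setting, a loss of this form would be backpropagated to the synthetic images themselves (with the pre-trained model frozen), encouraging generated data that preserves inter-patch semantics rather than isolated, context-free objects.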

Method

CLAMP-ViT employs a two-stage cyclic process that alternates between generating semantically rich synthetic data using patch-level contrastive learning and optimizing quantization parameters through layer-wise evolutionary search.
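The sketch below illustrates the second stage with a simple layer-wise evolutionary search over a quantization scale. The uniform fake-quantizer, population size, mutation scheme, and the abstract `fitness` callable (e.g., agreement between quantized and full-precision outputs, or a contrastive score on the generated data) are assumptions for illustration; in the full method this stage alternates cyclically with data generation.

```python
# Minimal sketch of a layer-wise evolutionary search over quantization scales.
# Hyperparameters and the fitness definition are illustrative assumptions.
import random
import torch

def fake_quantize(w: torch.Tensor, scale: float, bits: int) -> torch.Tensor:
    """Uniform symmetric fake-quantization of a weight tensor."""
    qmax = 2 ** (bits - 1) - 1
    q = torch.clamp(torch.round(w / scale), -qmax - 1, qmax)
    return q * scale

def evolve_layer_scale(w: torch.Tensor,
                       fitness,              # callable: quantized weight -> score (higher is better)
                       bits: int = 4,
                       pop_size: int = 16,
                       generations: int = 20) -> float:
    """Search for a per-layer scale that maximizes `fitness` on calibration data."""
    base = w.abs().max().item() / (2 ** (bits - 1) - 1)
    population = [base * random.uniform(0.5, 1.5) for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population,
                        key=lambda s: fitness(fake_quantize(w, s, bits)),
                        reverse=True)
        parents = ranked[: pop_size // 4]                 # keep the fittest quarter
        children = [p * random.uniform(0.9, 1.1)          # mutate survivors
                    for p in parents for _ in range(3)]
        population = parents + children
    return max(population, key=lambda s: fitness(fake_quantize(w, s, bits)))
```

Running this search independently per layer (with per-layer bit-widths in the mixed-precision case) yields the quantization parameters used in the next data-generation cycle.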

Results

Comparison of synthetic data generated by (a) PSAQ-ViT v1, (b) PSAQ-ViT v2, and (c) CLAMP-ViT (Ours). CLAMP-ViT generates detailed objects within contextually suitable backgrounds, boosting realism and informativeness.

Fixed-precision quantization accuracy comparison with SoTA on image classification with the ImageNet-1k test set. 'R' and 'S' signify real and synthetic calibration data, and W/A indicates weight/activation bit-width. Values in bold and underline signify the best performance overall and the best with synthetic data, respectively.

Citation

@article{ramachandran2024clamp,
  title={CLAMP-ViT: Contrastive Data-Free Learning for Adaptive Post-Training Quantization of ViTs},
  author={Ramachandran, Akshat and Kundu, Souvik and Krishna, Tushar},
  journal={arXiv preprint arXiv:2407.05266},
  year={2024}
}

Acknowledgement

This work was supported in part by CoCoSys, one of the seven centers in JUMP 2.0, a Semiconductor Research Corporation (SRC) program sponsored by DARPA.
