LLM Memory & RAG relevance: 7/10

CSRv2: Unlocking Ultra-Sparse Embeddings

Lixuan Guo, Yifei Wang, Tiansheng Wen, Yifan Wang, Aosong Feng, Bo Chen, Stefanie Jegelka, Chenyu You
arXiv: 2602.05735v1 Published: 2026-02-05 Updated: 2026-02-05

AI Summary

CSRv2 improves the training procedure for ultra-sparse embeddings so that they retain accuracy while delivering substantial gains in compute and storage efficiency.

Key Contributions

  • Progressive k-annealing to stabilize sparsity learning
  • Supervised contrastive objectives to enhance representation quality
  • Full backbone finetuning for end-to-end adaptability
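The first contribution, progressive k-annealing, amounts to gradually shrinking the number of active latents as training proceeds, so the model is never forced into extreme sparsity from the start. A minimal NumPy sketch, assuming a geometric schedule from `k_start=64` down to `k_target=2` and magnitude-based top-k selection; both the schedule shape and the constants are illustrative assumptions, not the paper's exact recipe:

```python
import numpy as np

def k_schedule(step, total_steps, k_start=64, k_target=2):
    # Geometrically anneal the sparsity level k from k_start to k_target.
    # (Assumed schedule; CSRv2's exact annealing curve is not given here.)
    frac = min(step / total_steps, 1.0)
    k = k_start * (k_target / k_start) ** frac
    return max(int(round(k)), k_target)

def top_k_sparsify(z, k):
    # Keep only the k largest-magnitude entries of each embedding row,
    # zeroing the rest: the k-sparse code used at the current step.
    idx = np.argsort(-np.abs(z), axis=-1)[:, :k]
    out = np.zeros_like(z)
    np.put_along_axis(out, idx, np.take_along_axis(z, idx, axis=-1), axis=-1)
    return out

# A batch of 4 dense latent codes in a 16-dim space.
z = np.random.randn(4, 16)
for step in (0, 500, 1000):
    k = k_schedule(step, total_steps=1000)
    z_sparse = top_k_sparsify(z, k)
```

Starting with a loose budget and tightening it is one way to avoid the dead-neuron collapse the paper reports for training directly at very small k.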

Methodology

CSRv2 refines CSR's training procedure through progressive k-annealing, a supervised contrastive objective, and full backbone finetuning, making ultra-sparse embeddings usable without sacrificing quality.
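The supervised contrastive objective pulls same-label embeddings together and pushes different-label embeddings apart. A minimal sketch in the style of the SupCon loss (Khosla et al.), assuming L2-normalized embeddings and a temperature of 0.1; this is an illustrative stand-in, not CSRv2's exact objective:

```python
import numpy as np

def supcon_loss(embeddings, labels, temperature=0.1):
    # Supervised contrastive loss: for each anchor, maximize the
    # softmax probability of its same-label positives among all others.
    z = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = z @ z.T / temperature
    n = len(labels)
    self_mask = np.eye(n, dtype=bool)
    sim = np.where(self_mask, -np.inf, sim)      # exclude self-similarity
    # Numerically stable log-softmax over each row.
    sim_max = sim.max(axis=1, keepdims=True)
    log_prob = sim - (sim_max + np.log(np.exp(sim - sim_max).sum(axis=1, keepdims=True)))
    pos = (labels[:, None] == labels[None, :]) & ~self_mask
    # Mean negative log-probability of positives per anchor.
    per_anchor = -np.where(pos, log_prob, 0.0).sum(1) / np.maximum(pos.sum(1), 1)
    return per_anchor[pos.sum(1) > 0].mean()
```

With two tight clusters (identical embeddings sharing a label), the loss is near zero, which is the sanity check one would expect from a contrastive objective.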

Original Abstract

In the era of large foundation models, the quality of embeddings has become a central determinant of downstream task performance and overall system capability. Yet widely used dense embeddings are often extremely high-dimensional, incurring substantial costs in storage, memory, and inference latency. To address these, Contrastive Sparse Representation (CSR) is recently proposed as a promising direction, mapping dense embeddings into high-dimensional but k-sparse vectors, in contrast to compact dense embeddings such as Matryoshka Representation Learning (MRL). Despite its promise, CSR suffers severe degradation in the ultra-sparse regime, where over 80% of neurons remain inactive, leaving much of its efficiency potential unrealized. In this paper, we introduce CSRv2, a principled training approach designed to make ultra-sparse embeddings viable. CSRv2 stabilizes sparsity learning through progressive k-annealing, enhances representational quality via supervised contrastive objectives, and ensures end-to-end adaptability with full backbone finetuning. CSRv2 reduces dead neurons from 80% to 20% and delivers a 14% accuracy gain at k=2, bringing ultra-sparse embeddings on par with CSR at k=8 and MRL at 32 dimensions, all with only two active features. While maintaining comparable performance, CSRv2 delivers a 7x speedup over MRL, and yields up to 300x improvements in compute and memory efficiency relative to dense embeddings in text representation. Extensive experiments across text and vision demonstrate that CSRv2 makes ultra-sparse embeddings practical without compromising performance, where CSRv2 achieves 7%/4% improvement over CSR when k=4 and further increases this gap to 14%/6% when k=2 in text/vision representation. By making extreme sparsity viable, CSRv2 broadens the design space for real-time and edge-deployable AI systems where both embedding quality and efficiency are critical.
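The compute and memory gains quoted above follow from storing each embedding as only its k nonzero (index, value) pairs, so similarity scoring touches at most k entries instead of all d dimensions. A hypothetical sketch of that scoring path, with a dict-based sparse encoding and made-up index values for illustration:

```python
def sparse_dot(a, b):
    # Dot product of two k-sparse vectors stored as {index: value} dicts.
    # Cost is O(min(k_a, k_b)) lookups instead of O(d) multiply-adds.
    if len(b) < len(a):
        a, b = b, a  # iterate over the smaller vector
    return sum(v * b.get(i, 0.0) for i, v in a.items())

q = {3: 0.9, 17: 0.4}    # a k=2 query embedding (hypothetical indices)
doc = {3: 0.5, 42: 0.8}  # a k=2 document embedding
score = sparse_dot(q, doc)  # only index 3 overlaps: 0.9 * 0.5
```

At k=2 each comparison is at most two lookups, which is the regime where the paper's claimed speedups over dense and MRL embeddings come from.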

Tags

sparse embeddings, contrastive learning, model optimization, efficient computation

arXiv Categories

cs.LG cs.AI cs.IR cs.IT