Google Research Releases Compression Algorithm TurboQuant to Reduce AI Model Memory Usage
According to foreign media, Google Research on Tuesday (24th) released TurboQuant, a training-free compression algorithm that can compress the key-value (KV) cache of large language models (LLMs) to 3 bits without affecting model accuracy.

In benchmark tests on Nvidia (NVDA.US) H100 GPUs, 4-bit TurboQuant computed attention logits up to 8x faster than unquantized 32-bit keys, while reducing KV cache memory by at least 6x.
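To illustrate the general idea behind low-bit KV cache quantization, the sketch below shows a simple symmetric 4-bit scheme in NumPy. This is a hypothetical, minimal illustration, not TurboQuant's actual algorithm: the tensor shape, the per-row scaling, and the `quantize_4bit` helper are all assumptions for demonstration.

```python
import numpy as np

np.random.seed(0)

def quantize_4bit(x, axis=-1):
    # Symmetric per-row 4-bit quantization: map each value to an
    # integer code in [-7, 7] using the row's max absolute value.
    scale = np.max(np.abs(x), axis=axis, keepdims=True) / 7.0
    scale = np.where(scale == 0, 1.0, scale)  # avoid divide-by-zero
    q = np.clip(np.round(x / scale), -7, 7).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    # Recover approximate float values from codes and scales.
    return q.astype(np.float32) * scale

# Toy KV cache: 8 attention heads, 128 cached tokens, head dim 64.
kv = np.random.randn(8, 128, 64).astype(np.float32)
q, scale = quantize_4bit(kv)
err = np.abs(dequantize(q, scale) - kv).max()

# Storing 4-bit codes (two per byte) instead of 32-bit floats is an
# 8x raw reduction, before accounting for the small per-row scales.
print("max reconstruction error:", err)
```

Real systems like TurboQuant use more sophisticated transforms to keep accuracy at 3-4 bits, but the memory arithmetic is the same: fewer bits per cached key/value means a proportionally smaller KV cache.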

Memory stocks Sandisk (SDNK.US) and Micron Technology (MU.US) fell 3.5% and 3.4% respectively overnight (25th).
AASTOCKS Financial News
Website: www.aastocks.com