8.5
"Novel Method Increases Inference Efficiency of Large Language Models by 26x, Reduces Memory Consumption Through Layer-Condensed KV Cache, Code Available for Integration"
arxiv.org