Paper: A Survey on Efficient Inference for Large Language Models
Tsinghua University, Jul 2024
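*Caption: Illustration of the efficiency bottlenecks in large language model inference. Figure 1: Challenges in deploying large models*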