


"Falcon: Faster and Parallel Inference of Large Language Models through ..."
Xiangxiang Gao et al. (2024)
- Xiangxiang Gao, Weisheng Xie, Yiwei Xiang, Feng Ji: Falcon: Faster and Parallel Inference of Large Language Models through Enhanced Semi-Autoregressive Drafting and Custom-Designed Decoding Tree. CoRR abs/2412.12639 (2024)















