
Commit 84a98b3 ("upd")

1 parent 5d139bb


_posts/2025-03-10-sampling.md (4 additions, 1 deletion)
```diff
@@ -169,7 +169,10 @@ Our evaluation demonstrates that FlashInfer's sampling kernel delivers substanti
 </p>
 
 ## Community Adoption and Other Applications
-The FlashInfer sampling kernel has been widely adopted by several prominent frameworks, including [sglang](https://github.com/sgl-project/sglang) and [vLLM](https://github.com/vllm-project/vllm/pull/7137). We are grateful for the community's valuable feedback and bug reports that have helped improve the implementation. Beyond sampling, the core ideas behind our approach have broader applications, particularly in speculative decoding verification. This includes techniques like [chain speculative sampling](https://arxiv.org/pdf/2302.01318) and [tree speculative verification](https://arxiv.org/pdf/2305.09781).
+
+The FlashInfer sampling kernel has gained widespread adoption across major LLM frameworks, including [MLC-LLM](https://github.com/mlc-ai/mlc-llm), [sglang](https://github.com/sgl-project/sglang), and [vLLM](https://github.com/vllm-project/vllm/pull/7137). The community's active engagement through feedback and bug reports has been instrumental in refining and improving our implementation.
+
+Beyond token sampling, our approach's core principles have proven valuable in other areas of LLM inference optimization. For instance, our techniques have been particularly impactful in speculative decoding verification, as demonstrated in methods like [chain speculative sampling](https://arxiv.org/pdf/2302.01318) and [tree speculative verification](https://arxiv.org/pdf/2305.09781). Building on these foundations, recent innovations like [Twilight](https://github.com/tsinghua-ideal/Twilight) have further advanced the field by successfully combining top-p sampling with sparse attention in a unified approach.
 
 ## Implementation Details
 
```
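For readers unfamiliar with the chain speculative sampling referenced in the added paragraph, the sketch below illustrates its accept/reject rule (Chen et al., 2023) in plain NumPy. This is an illustrative reconstruction under our own assumptions, not FlashInfer's kernel; the function name `chain_speculative_verify` and the array layout are hypothetical.

```python
import numpy as np

def chain_speculative_verify(draft_tokens, q_probs, p_probs, rng):
    """Illustrative accept/reject loop for chain speculative sampling
    (Chen et al., 2023); not FlashInfer's actual implementation.

    draft_tokens: (n,) token ids proposed by the draft model
    q_probs:      (n, vocab) draft-model distribution at each draft position
    p_probs:      (n + 1, vocab) target-model distribution at each position,
                  with one extra row for the bonus token
    Returns the accepted prefix plus exactly one freshly sampled token.
    """
    out = []
    for i, tok in enumerate(draft_tokens):
        # Accept with probability min(1, p(tok) / q(tok)); writing the test
        # as u * q < p avoids dividing by a possibly tiny q.
        if rng.random() * q_probs[i, tok] < p_probs[i, tok]:
            out.append(int(tok))
            continue
        # Rejected: resample from the residual max(0, p - q), renormalized.
        # This correction makes the output exactly target-distributed.
        residual = np.maximum(p_probs[i] - q_probs[i], 0.0)
        residual /= residual.sum()
        out.append(int(rng.choice(residual.size, p=residual)))
        return out
    # All drafts accepted: sample one bonus token from the extra target row.
    out.append(int(rng.choice(p_probs[-1].size, p=p_probs[-1])))
    return out

# Hypothetical usage with random distributions, for shape illustration only.
rng = np.random.default_rng(0)
vocab, n = 32, 4
q = rng.dirichlet(np.ones(vocab), size=n)
p = rng.dirichlet(np.ones(vocab), size=n + 1)
drafts = np.array([rng.choice(vocab, p=q[i]) for i in range(n)])
print(chain_speculative_verify(drafts, q, p, rng))
```

The comparison `rng.random() * q < p` accepts with probability min(1, p/q) without an explicit division, and resampling from the renormalized residual max(0, p - q) is what makes the combined procedure match sampling from the target model exactly.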
