ipex-llm/python/llm/example/GPU/Speculative-Decoding/README.md
2024-05-29 13:15:27 -07:00

# Speculative-Decoding Examples on Intel GPU
This folder contains examples of running speculative decoding with IPEX-LLM on Intel GPUs:
- [Self-Speculation](Self-Speculation): running BF16 inference for a Hugging Face Transformers model using ***self-speculative decoding*** with IPEX-LLM on Intel GPUs
- [EAGLE](EAGLE): running speculative sampling using ***EAGLE*** (Extrapolation Algorithm for Greater Language-model Efficiency) with IPEX-LLM on Intel GPUs
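
For readers new to the technique, the draft-and-verify loop behind speculative decoding can be sketched in plain Python. This is a toy illustration under stated assumptions, not the IPEX-LLM API and not the code used in the examples above: `speculative_decode`, `toy_target`, and `toy_draft` are hypothetical names, the "models" are simple functions over integer tokens, and only greedy decoding is covered. Real implementations verify all drafted tokens in one batched forward pass of the target model, and usually append one extra target token when every drafted token is accepted; both are omitted here for brevity.

```python
from typing import Callable, List

# A "model" here is just a function mapping a token context to the next token.
NextToken = Callable[[List[int]], int]


def speculative_decode(
    target: NextToken,
    draft: NextToken,
    prompt: List[int],
    max_new_tokens: int,
    k: int = 4,
) -> List[int]:
    """Greedy speculative decoding sketch (hypothetical helper)."""
    tokens = list(prompt)
    limit = len(prompt) + max_new_tokens
    while len(tokens) < limit:
        # 1. The cheap draft model proposes up to k tokens autoregressively.
        proposal: List[int] = []
        for _ in range(min(k, limit - len(tokens))):
            proposal.append(draft(tokens + proposal))

        # 2. The target model checks each proposed position in order.
        accepted: List[int] = []
        for tok in proposal:
            expected = target(tokens + accepted)
            if tok == expected:
                accepted.append(tok)  # draft agreed with the target
            else:
                accepted.append(expected)  # keep the target's token instead
                break  # discard the rest of the proposal
        tokens.extend(accepted)
    return tokens


# Toy models (assumptions for illustration): the target counts upward modulo 10,
# and the draft makes a mistake whenever the last token is congruent to 2 mod 3.
def toy_target(context: List[int]) -> int:
    return (context[-1] + 1) % 10


def toy_draft(context: List[int]) -> int:
    return 0 if context[-1] % 3 == 2 else (context[-1] + 1) % 10


result = speculative_decode(toy_target, toy_draft, prompt=[0], max_new_tokens=12)
print(result)
```

Because every emitted token is either confirmed or supplied by the target, the output in this greedy setting matches what the target model would generate on its own; any speedup comes from how many drafted tokens are accepted per verification step.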