Speculative-Decoding Examples on Intel GPU

This folder contains examples of running speculative decoding with IPEX-LLM on Intel GPUs:

Self-Speculation: running BF16 inference for Hugging Face Transformers models using self-speculative decoding with IPEX-LLM on Intel GPUs
EAGLE: running speculative sampling using EAGLE (Extrapolation Algorithm for Greater Language-model Efficiency) with IPEX-LLM on Intel GPUs
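
For readers new to the technique, the draft-and-verify loop behind speculative decoding can be sketched in plain Python. This is a toy illustration only, not the IPEX-LLM API: `toy_target`, `toy_draft`, `greedy_decode`, and `speculative_decode` are hypothetical names, the "models" are simple functions over integer tokens, and greedy acceptance is assumed rather than the sampling-based acceptance some methods use. See the subfolders for the actual IPEX-LLM usage.

```python
from typing import Callable, List

# A toy "model" maps the tokens seen so far to the next token (greedy choice).
NextToken = Callable[[List[int]], int]


def toy_target(tokens: List[int]) -> int:
    """Hypothetical stand-in for the large, accurate model."""
    return (tokens[-1] * 3 + 1) % 10


def toy_draft(tokens: List[int]) -> int:
    """Hypothetical stand-in for a cheaper model: agrees with the target
    except when the context length is a multiple of 4."""
    guess = toy_target(tokens)
    return guess if len(tokens) % 4 else (guess + 1) % 10


def greedy_decode(model: NextToken, prompt: List[int], max_new_tokens: int) -> List[int]:
    """Baseline: one model call per generated token."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        tokens.append(model(tokens))
    return tokens


def speculative_decode(target: NextToken, draft: NextToken, prompt: List[int],
                       max_new_tokens: int, k: int = 4) -> List[int]:
    """Draft k tokens, keep the prefix the target agrees with, then take
    the target's own token at the first disagreement."""
    tokens = list(prompt)
    limit = len(prompt) + max_new_tokens
    while len(tokens) < limit:
        # 1. The draft model proposes k tokens autoregressively.
        proposal: List[int] = []
        for _ in range(k):
            proposal.append(draft(tokens + proposal))

        # 2. The target model checks each proposed token in order.
        #    A real implementation would score all k positions in one
        #    batched forward pass; this toy calls the function per position.
        for guess in proposal:
            expected = target(tokens)
            if expected != guess:
                tokens.append(expected)  # reject: use the target's token
                break
            tokens.append(guess)         # accept the drafted token
            if len(tokens) >= limit:
                break
        else:
            # Every drafted token was accepted, so the target's next
            # token comes along as a bonus.
            tokens.append(target(tokens))
    return tokens[:limit]


if __name__ == "__main__":
    prompt = [1, 2, 3]
    baseline = greedy_decode(toy_target, prompt, 12)
    fast = speculative_decode(toy_target, toy_draft, prompt, 12, k=4)
    print(baseline == fast)
```

With greedy acceptance the output matches what the target model would produce on its own; any speedup comes from verifying several drafted tokens per target pass instead of generating one token at a time.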