diff --git a/python/llm/example/CPU/Speculative-Decoding/qwen/README.md b/python/llm/example/CPU/Speculative-Decoding/qwen/README.md
index 85548b55..c2b7dfc7 100644
--- a/python/llm/example/CPU/Speculative-Decoding/qwen/README.md
+++ b/python/llm/example/CPU/Speculative-Decoding/qwen/README.md
@@ -92,7 +92,7 @@ First token latency x.xxxxs
 ```

 ### 4. Accelerate with BIGDL_OPT_IPEX
-
+`Qwen/Qwen-72B-Chat` is currently not supported with IPEX.
 To accelerate speculative decoding on CPU, you can install our validated version of [IPEX 2.3.0+git004cd72d](https://github.com/intel/intel-extension-for-pytorch/tree/004cd72db60e87bb0712d42e3120bac9854bd77e) by following steps: (Other versions of IPEX may have some conflicts and can not accelerate speculative decoding correctly.)

 #### 4.1 Download IPEX installation script
diff --git a/python/llm/src/bigdl/llm/transformers/speculative.py b/python/llm/src/bigdl/llm/transformers/speculative.py
index be77fdf8..a8633fca 100644
--- a/python/llm/src/bigdl/llm/transformers/speculative.py
+++ b/python/llm/src/bigdl/llm/transformers/speculative.py
@@ -475,7 +475,7 @@ def speculative_generate(self,
                 ("qwen" in self.config.model_type) or
                 ("chatglm" in self.config.model_type)):
             invalidInputError(False, "BigDL Speculative Decoding with IPEX BF16 only supports \
-                              Llama, Baichuan2, Mistral and ChatGLM and Qwen models currently.")
+                              Llama, Baichuan2, Mistral, ChatGLM and Qwen models currently.")
         if "chatglm" in self.config.model_type:
             global query_group_size
             query_group_size = draft_model.config.num_attention_heads // \
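
For context, the `speculative.py` hunk above only rewords the message raised by the model-type guard in `speculative_generate`. The standalone sketch below is hypothetical (simplified names, a plain `ValueError` in place of BigDL's `invalidInputError`) and only illustrates the shape of that check, assuming `model_type` strings such as `"qwen"` or `"chatglm"` taken from the Hugging Face model config.

```python
# Hypothetical sketch of the IPEX BF16 model-family guard; not the BigDL source.

# Model families validated for IPEX BF16 speculative decoding (per the error message above).
SUPPORTED_IPEX_BF16_TYPES = ("llama", "baichuan", "mistral", "qwen", "chatglm")


def check_ipex_bf16_support(model_type: str) -> None:
    """Raise if the model family is not validated for IPEX BF16 speculative decoding."""
    if not any(name in model_type for name in SUPPORTED_IPEX_BF16_TYPES):
        raise ValueError(
            "BigDL Speculative Decoding with IPEX BF16 only supports "
            "Llama, Baichuan2, Mistral, ChatGLM and Qwen models currently."
        )


if __name__ == "__main__":
    check_ipex_bf16_support("qwen")       # passes silently: Qwen is in the validated list
    try:
        check_ipex_bf16_support("gptj")   # not a validated family
    except ValueError as err:
        print(err)                        # prints the "only supports ..." message
```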