[LLM]update ipex part in mistral example readme (#10239)
* update ipex part in mistral example readme
parent 7c236e4c6d
commit ea23afc8ec
1 changed file with 21 additions and 1 deletion
@@ -84,4 +84,24 @@ First token latency xx.xxxxs
### 4. Accelerate with BIGDL_OPT_IPEX

BIGDL_OPT_IPEX can accelerate speculative decoding on Mistral; please refer to the [llama2 example](https://github.com/intel-analytics/BigDL/blob/main/python/llm/example/CPU/Speculative-Decoding/llama2#4-accelerate-with-bigdl_opt_ipex) to give it a try.

To accelerate speculative decoding on CPU, install our validated version of [IPEX 2.2.0+cpu](https://github.com/intel/intel-extension-for-pytorch/tree/v2.2.0%2Bcpu) by referring to [IPEX's installation guide](https://intel.github.io/intel-extension-for-pytorch/index.html#installation?platform=cpu&version=v2.2.0%2Bcpu), or with the following commands. (Other versions of IPEX may have conflicts and cannot accelerate speculative decoding correctly.)
```bash
# Install IPEX 2.2.0+cpu
python -m pip install torch==2.2.0 torchvision==0.17.0 torchaudio==2.2.0 --index-url https://download.pytorch.org/whl/cpu
python -m pip install intel-extension-for-pytorch==2.2.0
python -m pip install oneccl_bind_pt==2.2.0 --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/cpu/us/
# If oneccl_bind_pt fails to install, you can find a suitable index URL for your environment at "https://pytorch-extension.intel.com/release-whl/stable/cpu/cn/" or "https://developer.intel.com/ipex-whl-stable-cpu".

# Update transformers
pip install transformers==4.35.2
```
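To verify the installation, you can print the installed versions (a quick sanity check, not part of the original example; it assumes the packages above were installed into the active Python environment):

```bash
# Both commands should report the validated 2.2.0 (+cpu) builds
python -c "import torch; print(torch.__version__)"
python -c "import intel_extension_for_pytorch as ipex; print(ipex.__version__)"
```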
After installing IPEX, set `BIGDL_OPT_IPEX=true` to enable target model acceleration. Currently `Mistral-7B-Instruct-v0.1` and `Mistral-7B-v0.1` are supported.
```bash
source bigdl-llm-init -t
export BIGDL_OPT_IPEX=true
export OMP_NUM_THREADS=48 # change 48 to the number of cores in one processor socket
numactl -C 0-47 -m 0 python ./speculative.py --repo-id-or-model-path REPO_ID_OR_MODEL_PATH --prompt PROMPT --n-predict N_PREDICT
```
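If you are not sure how many cores one socket has, you can query it with `lscpu` (a suggested helper, not part of the original example) and size `OMP_NUM_THREADS` and the `numactl -C` core range accordingly:

```bash
# "Core(s) per socket" gives the value to use for OMP_NUM_THREADS;
# bind "numactl -C" to cores 0..(N-1) on socket 0
lscpu | grep -E "^(Socket\(s\)|Core\(s\) per socket)"
```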