change npu document (#11446)
This commit is contained in:

parent 508c364a79
commit cf0f5c4322

1 changed file with 2 additions and 7 deletions
````diff
@@ -20,12 +20,7 @@ conda activate llm
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
 
 # below command will install intel_npu_acceleration_library
-conda install cmake
-git clone https://github.com/intel/intel-npu-acceleration-library npu-library
-cd npu-library
-git checkout bcb1315
-python setup.py bdist_wheel
-pip install dist\intel_npu_acceleration_library-1.2.0-cp310-cp310-win_amd64.whl
+pip install intel-npu-acceleration-library==1.3
 ```
 
 ### 2. Runtime Configurations
````
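This hunk replaces the source build (clone, checkout, `bdist_wheel`, local wheel install) with a single pinned PyPI install. A minimal, hedged sanity check after installation; the distribution name is taken from the new `pip install` command in the diff, and this helper is illustrative rather than part of the documented example:

```python
from importlib import metadata
from typing import Optional

def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None

# Expect "1.3" after the pinned install; None means the install step failed.
print(installed_version("intel-npu-acceleration-library"))
```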
````diff
@@ -48,7 +43,7 @@ Arguments info:
 - `--repo-id-or-model-path REPO_ID_OR_MODEL_PATH`: argument defining the huggingface repo id for the Llama2 model (e.g. `meta-llama/Llama-2-7b-chat-hf` and `meta-llama/Llama-2-13b-chat-hf`) to be downloaded, or the path to the huggingface checkpoint folder. It is default to be `'meta-llama/Llama-2-7b-chat-hf'`.
 - `--prompt PROMPT`: argument defining the prompt to be infered (with integrated prompt format for chat). It is default to be `'Once upon a time, there existed a little girl who liked to have adventures. She wanted to go to places and meet new people, and have fun'`.
 - `--n-predict N_PREDICT`: argument defining the max number of tokens to predict. It is default to be `32`.
-- `--load_in_low_bit`: argument defining the load_in_low_bit format used. It is default to be `sym_int8`, `sym_int4` can also be used.
+- `--load_in_low_bit`: argument defining the `load_in_low_bit` format used. It is default to be `sym_int8`, `sym_int4` can also be used.
 
 #### Sample Output
 #### [meta-llama/Llama-2-7b-chat-hf](https://huggingface.co/meta-llama/Llama-2-7b-chat-hf)
````
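The four arguments described in this hunk map onto a standard argparse definition. A minimal sketch, assuming a typical example-script layout; only the flag names and defaults come from the documented descriptions, while the help strings and parser wiring are illustrative:

```python
import argparse

# Sketch of the argument parser implied by the documented options.
parser = argparse.ArgumentParser(description="Llama2 generation example (sketch)")
parser.add_argument("--repo-id-or-model-path", type=str,
                    default="meta-llama/Llama-2-7b-chat-hf",
                    help="Hugging Face repo id or local checkpoint folder")
parser.add_argument("--prompt", type=str,
                    default=("Once upon a time, there existed a little girl who "
                             "liked to have adventures. She wanted to go to places "
                             "and meet new people, and have fun"),
                    help="Prompt to run inference on (chat format applied)")
parser.add_argument("--n-predict", type=int, default=32,
                    help="Max number of tokens to predict")
parser.add_argument("--load_in_low_bit", type=str, default="sym_int8",
                    choices=["sym_int8", "sym_int4"],
                    help="Low-bit quantization format")

args = parser.parse_args([])  # empty argv: show the defaults
print(args.repo_id_or_model_path, args.n_predict, args.load_in_low_bit)
```

Note that argparse converts the hyphenated `--repo-id-or-model-path` flag into the attribute `args.repo_id_or_model_path`, and `choices` enforces the two documented `load_in_low_bit` values at parse time.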