Update doc for running npu generate example with ipex-llm[npu] (#11876)
* update doc for running npu generate example with ipex-llm[npu]
* switch max_prompt_len to 512 to fix compile error on mtl
parent 209d42ab79
commit 8c5c7f32dd
2 changed files with 4 additions and 9 deletions
@@ -32,13 +32,8 @@ We suggest using conda to manage environment:
 conda create -n llm python=3.10
 conda activate llm
 
-# install ipex-llm with 'all' option
-pip install --pre --upgrade ipex-llm[all]
-
-# below command will install intel_npu_acceleration_library
-pip install intel-npu-acceleration-library==1.3
-
-pip install transformers==4.40
+# install ipex-llm with 'npu' option
+pip install --pre --upgrade ipex-llm[npu]
 ```
 
 ### 2. Runtime Configurations
@@ -124,7 +119,7 @@ Arguments info:
 - `--prompt PROMPT`: argument defining the prompt to be infered (with integrated prompt format for chat). It is default to be `What is AI?`.
 - `--n-predict N_PREDICT`: argument defining the max number of tokens to predict. It is default to be `32`.
 - `--max-output-len MAX_OUTPUT_LEN`: Defines the maximum sequence length for both input and output tokens. It is default to be `1024`.
-- `--max-prompt-len MAX_PROMPT_LEN`: Defines the maximum number of tokens that the input prompt can contain. It is default to be `768`.
+- `--max-prompt-len MAX_PROMPT_LEN`: Defines the maximum number of tokens that the input prompt can contain. It is default to be `512`.
 
 
 #### Sample Output

@@ -54,7 +54,7 @@ if __name__ == "__main__":
 help='Prompt to infer')
 parser.add_argument("--n-predict", type=int, default=32, help="Max tokens to predict")
 parser.add_argument("--max-output-len", type=int, default=1024)
-parser.add_argument("--max-prompt-len", type=int, default=768)
+parser.add_argument("--max-prompt-len", type=int, default=512)
 parser.add_argument("--disable-transpose-value-cache", action="store_true", default=False)
 parser.add_argument("--intra-pp", type=int, default=2)
 parser.add_argument("--inter-pp", type=int, default=2)
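
Note: the length-related defaults after this change can be exercised in isolation. The following is a minimal sketch, not the example script itself: it copies only the three length arguments from the hunk above, and the `build_parser` and `check_lengths` helpers are hypothetical names introduced here. The check assumes, based on the README wording that `--max-output-len` covers both input and output tokens, that the prompt budget should be smaller than the total budget.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Mirrors only the length-related arguments from the diff (new defaults).
    parser = argparse.ArgumentParser(description="Sketch of the length arguments")
    parser.add_argument("--n-predict", type=int, default=32, help="Max tokens to predict")
    parser.add_argument("--max-output-len", type=int, default=1024)
    parser.add_argument("--max-prompt-len", type=int, default=512)
    return parser


def check_lengths(args: argparse.Namespace) -> None:
    # Hypothetical sanity check (not part of the original script):
    # the prompt budget must leave room for generated tokens.
    if args.max_prompt_len >= args.max_output_len:
        raise ValueError(
            f"--max-prompt-len ({args.max_prompt_len}) must be smaller than "
            f"--max-output-len ({args.max_output_len})"
        )


# Parse with no CLI arguments so the defaults are used.
args = build_parser().parse_args([])
check_lengths(args)
print(args.max_prompt_len, args.max_output_len, args.n_predict)
```

With the defaults from this commit the check passes; passing a prompt budget at or above `--max-output-len` raises `ValueError` under this assumed rule.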