Update README.md (#11964)
parent 14b2c8dc32
commit 431affd0a0

1 changed file with 5 additions and 5 deletions

@@ -1,5 +1,5 @@
-# Run Large Language Model on Intel NPU
+# Run HuggingFace `transformers` Models on Intel NPU

-In this directory, you will find examples of how you can apply IPEX-LLM INT4 or INT8 optimizations on LLM models on [Intel NPUs](../../../README.md). See the table below for verified models.
+In this directory, you will find examples of how to run HuggingFace `transformers` models directly on Intel NPUs (leveraging the *Intel NPU Acceleration Library*). See the table below for verified models.

## Verified Models
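The verified models are loaded through IPEX-LLM's low-bit `from_pretrained` path. Below is a minimal sketch of that flow; the `ipex_llm.transformers.npu_model` module path and the `load_in_low_bit` string values ("sym_int4" / "sym_int8") are assumptions based on IPEX-LLM's usual conventions, so treat [generate.py](./generate.py) as the authoritative reference.

```python
# Minimal sketch: load a verified model with IPEX-LLM low-bit optimization on an
# Intel NPU. Module path and load_in_low_bit values are assumptions; see generate.py.
from transformers import AutoTokenizer

from ipex_llm.transformers.npu_model import AutoModelForCausalLM

model_path = "meta-llama/Llama-2-7b-chat-hf"  # any model from the verified table

model = AutoModelForCausalLM.from_pretrained(
    model_path,
    load_in_low_bit="sym_int4",  # "sym_int8" would select INT8 instead of INT4
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
```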
@@ -52,7 +52,7 @@ For optimal performance, it is recommended to set several environment variables.
set BIGDL_USE_NPU=1
```

-## 3. Run models
+## 3. Run Models
In the example [generate.py](./generate.py), we show a basic use case for a Llama2 model to predict the next N tokens using the `generate()` API, with IPEX-LLM INT4 optimizations on Intel NPUs.

```
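Putting the environment switch and the `generate()` call together, a rough end-to-end sketch might look as follows. The tokenizer and `generate()` calls are standard `transformers` API; the `npu_model` import and the assumption that `BIGDL_USE_NPU` must be set before the library loads are not verified here, so defer to [generate.py](./generate.py).

```python
# Rough end-to-end sketch: set the NPU switch, load with INT4 optimization, then
# predict the next N tokens. Assumes the same npu_model API as the sketch above.
import os

os.environ.setdefault("BIGDL_USE_NPU", "1")  # mirrors `set BIGDL_USE_NPU=1`; assumed to matter before import

import torch
from transformers import AutoTokenizer

from ipex_llm.transformers.npu_model import AutoModelForCausalLM  # assumed module path

model_path = "meta-llama/Llama-2-7b-chat-hf"
model = AutoModelForCausalLM.from_pretrained(model_path, load_in_low_bit="sym_int4", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

prompt = "What is AI?"
n_predict = 32  # N, the number of new tokens to generate

input_ids = tokenizer.encode(prompt, return_tensors="pt")
with torch.inference_mode():
    output = model.generate(input_ids, max_new_tokens=n_predict)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```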
@@ -77,7 +77,7 @@ done
```

## 4. Run Optimized Models (Experimental)
-The example below shows how to run the **_optimized model implementations_** on Intel NPU, including
+The examples below show how to run the **_optimized HuggingFace model implementations_** on Intel NPU, including
- [Llama2-7B](./llama.py)
- [Llama3-8B](./llama.py)
- [Qwen2-1.5B](./qwen2.py)
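Based on the loading pattern above, these optimized implementations plausibly differ only in an extra switch at load time. The sketch below is hypothetical: the `optimize_model=True` flag is an assumption, not something this README states, and [llama.py](./llama.py) / [qwen2.py](./qwen2.py) remain the real entry points.

```python
# Hypothetical sketch of the optimized load path; optimize_model=True is an assumed
# flag, not confirmed by this README. See llama.py / qwen2.py for the actual scripts.
from ipex_llm.transformers.npu_model import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-chat-hf",
    load_in_low_bit="sym_int4",
    optimize_model=True,  # assumed switch selecting the optimized implementation
    trust_remote_code=True,
)
```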
@@ -92,7 +92,7 @@ Supported models: Llama2-7B, Qwen2-1.5B, Qwen2-7B, MiniCPM-1B, Baichuan2-7B
#### 32.0.101.2715
Supported models: Llama3-8B, MiniCPM-2B

-### Run Models
+### Run
```bash
# to run Llama-2-7b-chat-hf
python llama.py
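Because model support varies with the NPU driver version, checking the installed driver first can save a failed run. Here is a Windows-only sketch via WMI; the `*NPU*` device-name filter is an assumption about how the driver registers itself.

```python
# Windows-only sketch: read the Intel NPU driver version through WMI via PowerShell,
# then compare it against the supported-model lists above (e.g. 32.0.101.2715).
import subprocess

ps_cmd = (
    "Get-WmiObject Win32_PnPSignedDriver | "
    "Where-Object { $_.DeviceName -like '*NPU*' } | "  # assumes the device name contains 'NPU'
    "Select-Object -ExpandProperty DriverVersion"
)
result = subprocess.run(["powershell", "-Command", ps_cmd], capture_output=True, text=True)
print(result.stdout.strip())
```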