From e246f1e258402d890f9747ffe75f4fe76f9b2670 Mon Sep 17 00:00:00 2001
From: Yina Chen <33650826+cyita@users.noreply.github.com>
Date: Tue, 27 Aug 2024 08:03:18 +0300
Subject: [PATCH] update llama3 npu example (#11933)

---
 .../NPU/HF-Transformers-AutoModels/LLM/README.md | 15 +++++++++++----
 .../LLM/{llama2.py => llama.py}                  |  0
 2 files changed, 11 insertions(+), 4 deletions(-)
 rename python/llm/example/NPU/HF-Transformers-AutoModels/LLM/{llama2.py => llama.py} (100%)

diff --git a/python/llm/example/NPU/HF-Transformers-AutoModels/LLM/README.md b/python/llm/example/NPU/HF-Transformers-AutoModels/LLM/README.md
index 52d71ed4..80c82880 100644
--- a/python/llm/example/NPU/HF-Transformers-AutoModels/LLM/README.md
+++ b/python/llm/example/NPU/HF-Transformers-AutoModels/LLM/README.md
@@ -78,12 +78,16 @@ done
 ## 4. Run Optimized Models (Experimental)
 The example below shows how to run the **_optimized model implementations_** on Intel NPU, including
-- [Llama2-7B](./llama2.py)
+- [Llama2-7B](./llama.py)
+- [Llama3-8B](./llama.py)
 - [Qwen2-1.5B](./qwen2.py)
 
-```
+```bash
 # to run Llama-2-7b-chat-hf
-python llama2.py
+python llama.py
+
+# to run Meta-Llama-3-8B-Instruct
+python llama.py --repo-id-or-model-path meta-llama/Meta-Llama-3-8B-Instruct
 
 # to run Qwen2-1.5B-Instruct
 python qwen2.py
 
@@ -102,7 +106,10 @@ Arguments info:
 If you encounter output problem, please try to disable the optimization of transposing value cache with following command:
 ```bash
 # to run Llama-2-7b-chat-hf
-python llama2.py --disable-transpose-value-cache
+python llama.py --disable-transpose-value-cache
+
+# to run Meta-Llama-3-8B-Instruct
+python llama.py --repo-id-or-model-path meta-llama/Meta-Llama-3-8B-Instruct --disable-transpose-value-cache
 
 # to run Qwen2-1.5B-Instruct
 python qwen2.py --disable-transpose-value-cache
diff --git a/python/llm/example/NPU/HF-Transformers-AutoModels/LLM/llama2.py b/python/llm/example/NPU/HF-Transformers-AutoModels/LLM/llama.py
similarity index 100%
rename from python/llm/example/NPU/HF-Transformers-AutoModels/LLM/llama2.py
rename to python/llm/example/NPU/HF-Transformers-AutoModels/LLM/llama.py