# Running LLM Finetuning using IPEX-LLM on Intel GPU

This folder contains examples of running different training modes with IPEX-LLM on Intel GPU:
- LoRA: examples of running LoRA finetuning
- QLoRA: examples of running QLoRA finetuning
- QA-LoRA: examples of running QA-LoRA finetuning
- ReLora: examples of running ReLora finetuning
- DPO: examples of running DPO finetuning
- common: common templates and utility classes used by the finetuning examples
- HF-PEFT: run finetuning on Intel GPU using Hugging Face PEFT code without modification
- axolotl: LLM finetuning on Intel GPU using axolotl, without writing code
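Several of the modes above (LoRA, QLoRA, QA-LoRA, ReLora) are built on the same low-rank-adaptation idea: the frozen pretrained weight `W` is augmented with a trainable low-rank product `B @ A`. The following is a minimal NumPy sketch of that idea only, for illustration; it is not the IPEX-LLM implementation, and the sizes are made up:

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 64, 4  # hidden size and LoRA rank; r << d

W = rng.standard_normal((d, d))          # frozen pretrained weight
A = rng.standard_normal((r, d)) * 0.01   # trainable down-projection
B = np.zeros((d, r))                     # trainable up-projection, initialized to zero

# LoRA forward pass: y = x @ (W + B @ A).T
# During finetuning only A and B receive gradient updates.
x = rng.standard_normal((1, d))
y = x @ (W + B @ A).T

# With B initialized to zero, the adapted layer starts out identical
# to the base layer, so training begins from the pretrained behavior.
print(np.allclose(y, x @ W.T))  # True
```

Note the parameter saving: the adapter trains `2 * d * r` values here instead of `d * d`, which is what makes these modes practical on a single GPU.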
## Verified Models
| Model | Finetune mode | Frameworks Support |
|---|---|---|
| LLaMA 2/3 | LoRA, QLoRA, QA-LoRA, ReLora | HF-PEFT, axolotl |
| Mistral | LoRA, QLoRA | DPO |
| ChatGLM 3 | QLoRA | HF-PEFT |
| Qwen-1.5 | QLoRA | HF-PEFT |
| Baichuan2 | QLoRA | HF-PEFT |
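Unlike the supervised modes in the table, DPO finetunes on preference pairs (a chosen and a rejected answer) rather than labels. A minimal NumPy sketch of the standard DPO objective, for illustration only (the example log-probabilities below are invented, not taken from any model):

```python
import numpy as np

def dpo_loss(logp_chosen, logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """DPO loss for one (chosen, rejected) preference pair.

    The policy is rewarded for increasing its log-probability margin
    on the chosen answer relative to a frozen reference model.
    """
    margin = (logp_chosen - ref_logp_chosen) - (logp_rejected - ref_logp_rejected)
    # -log(sigmoid(beta * margin))
    return -np.log(1.0 / (1.0 + np.exp(-beta * margin)))

# When the policy prefers the chosen answer more than the reference does,
# the margin is positive and the loss drops below log(2).
loss = dpo_loss(-10.0, -14.0, -11.0, -13.0)
print(loss < np.log(2))  # True
```

At a zero margin the loss equals `log(2)`, so values below that indicate the policy has moved toward the preferred responses.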
## Troubleshooting
- If you fail to finetune on multiple cards because of the following error message:

  ```
  RuntimeError: oneCCL: comm_selector.cpp:57 create_comm_impl: EXCEPTION: ze_data was not initialized
  ```

  please try `sudo apt install level-zero-dev` to fix it.
- Please raise the system open file limit using `ulimit -n 1048576`. Otherwise, the error `Too many open files` may occur.
- If the application raises `wandb.errors.UsageError: api_key not configured (no-tty)`, please log in to wandb or disable wandb login with this command:

  ```bash
  export WANDB_MODE=offline
  ```
- If the application raises Hugging Face related errors, e.g., `NewConnectionError` or `Failed to download`, please download the models and datasets in advance, set the model and data paths, then set `HF_HUB_OFFLINE` with this command:

  ```bash
  export HF_HUB_OFFLINE=1
  ```