Running LLM Finetuning using IPEX-LLM on Intel GPU

This folder contains examples of running different training modes with IPEX-LLM on Intel GPU (a minimal QLoRA sketch follows the list):

  • LoRA: examples of running LoRA finetuning
  • QLoRA: examples of running QLoRA finetuning
  • QA-LoRA: examples of running QA-LoRA finetuning
  • ReLora: examples of running ReLora finetuning
  • DPO: examples of running DPO finetuning
  • common: common templates and utility classes in finetuning examples
  • HF-PEFT: run finetuning on Intel GPU using Hugging Face PEFT code without modification
  • axolotl: LLM finetuning on Intel GPU using axolotl without writing code
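
The sketch below shows the general shape of a QLoRA finetuning run with IPEX-LLM on an Intel GPU (XPU), loosely following the QLoRA example scripts in this folder. The base model ID, LoRA hyperparameters, and exact import paths are illustrative assumptions; consult the example scripts for the verified end-to-end code.

```python
# Minimal QLoRA finetuning sketch with IPEX-LLM on an Intel GPU (XPU).
# The model ID and LoRA hyperparameters are illustrative assumptions;
# see the QLoRA/ example scripts for the verified end-to-end code.
import torch
from ipex_llm.transformers import AutoModelForCausalLM
from ipex_llm.transformers.qlora import (
    get_peft_model,
    prepare_model_for_kbit_training,
    LoraConfig,
)

base_model = "meta-llama/Llama-2-7b-hf"  # assumption: any supported causal LM

# Load the base model with 4-bit NF4 weights and move it to the Intel GPU.
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    load_in_low_bit="nf4",
    optimize_model=False,
    torch_dtype=torch.bfloat16,
    modules_to_not_convert=["lm_head"],
)
model = model.to("xpu")
model = prepare_model_for_kbit_training(model)

# Attach LoRA adapters; only these small matrices are trained.
lora_config = LoraConfig(
    r=8,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj"],
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

# Train with your own tokenized dataset via transformers.Trainer (omitted here).
```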

Verified Models

| Model     | Finetune mode                | Frameworks Support |
|-----------|------------------------------|--------------------|
| LLaMA 2/3 | LoRA, QLoRA, QA-LoRA, ReLora | HF-PEFT, axolotl   |
| Mistral   | LoRA, QLoRA                  | DPO                |
| ChatGLM 3 | QLoRA                        | HF-PEFT            |
| Qwen-1.5  | QLoRA                        | HF-PEFT            |
| Baichuan2 | QLoRA                        | HF-PEFT            |

Troubleshooting

  • If finetuning on multiple cards fails with the following error message:

    RuntimeError: oneCCL: comm_selector.cpp:57 create_comm_impl: EXCEPTION: ze_data was not initialized
    

    Please try sudo apt install level-zero-dev to fix it.

  • Please raise the system open-file limit with ulimit -n 1048576. Otherwise, you may hit the error Too many open files.

  • If the application raises wandb.errors.UsageError: api_key not configured (no-tty), please log in to wandb or disable wandb logging with the following command:

export WANDB_MODE=offline
  • If the application raises Hugging Face related errors such as NewConnectionError or Failed to download, please download the models and datasets in advance, point the scripts at the local model and data paths, then set HF_HUB_OFFLINE with the following command (a pre-download sketch follows this list):
export HF_HUB_OFFLINE=1
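
For the offline workflow above, one way to pre-fetch a model and dataset while you still have network access is sketched below. The repo IDs and local directories are illustrative assumptions; any equivalent download method works.

```python
# Pre-download a model and dataset once (with network access), then run
# finetuning offline with HF_HUB_OFFLINE=1 pointing at the local copies.
# The repo IDs and target directories below are illustrative assumptions.
from huggingface_hub import snapshot_download
from datasets import load_dataset

# Download the base model weights to a local directory.
model_dir = snapshot_download(
    repo_id="meta-llama/Llama-2-7b-hf",   # replace with your base model
    local_dir="./models/Llama-2-7b-hf",
)

# Download the finetuning dataset and save it to disk.
dataset = load_dataset("yahma/alpaca-cleaned")  # replace with your dataset
dataset.save_to_disk("./data/alpaca-cleaned")

print("model saved at:", model_dir)
# Later, pass the local model path to the example scripts, load the dataset
# with datasets.load_from_disk, and export HF_HUB_OFFLINE=1 before training.
```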