48 commits

Author SHA1 Message Date
Shaojun Liu
72b4efaad4
Enhanced XPU Dockerfiles: Optimized Environment Variables and Documentation (#11506)
* Added SYCL_CACHE_PERSISTENT=1 to xpu Dockerfile

* Update the document to add explanations for environment variables.

* update quickstart
2024-07-04 20:18:38 +08:00
Shaojun Liu
5aa3e427a9
Fix docker images (#11362)
* Fix docker images

* add-apt-repository requires gnupg, gpg-agent, software-properties-common

* update

* avoid importing ipex again
2024-06-20 15:44:55 +08:00
Qiyuan Gong
de4bb97b4f
Remove accelerate 0.23.0 install command in readme and docker (#11333)
* ipex-llm's accelerate has been upgraded to 0.23.0. Remove the accelerate 0.23.0 install command from the README and docker.
2024-06-17 17:52:12 +08:00
Shaojun Liu
9760ffc256
Fix SDLe CT222 Vulnerabilities (#11237)
* fix ct222 vuln

* update

* fix

* update ENTRYPOINT

* revert ENTRYPOINT

* Fix CT222 Vulns

* fix

* revert changes

* fix

* revert

* add sudo permission to ipex-llm user

* do not use ipex-llm user
2024-06-13 15:31:22 +08:00
Shaojun Liu
744042d1b2
remove software-properties-common from Dockerfile (#11203) 2024-06-04 17:37:42 +08:00
Qiyuan Gong
21a1a973c1
Remove axolotl and python3-blinker (#11127)
* Remove axolotl from image to reduce image size.
* Remove python3-blinker to avoid axolotl lib conflict.
2024-05-24 13:54:19 +08:00
Qiyuan Gong
1e00bd7bbe
Re-org XPU finetune images (#10971)
* Rename xpu finetune image from `ipex-llm-finetune-qlora-xpu` to `ipex-llm-finetune-xpu`.
* Add axolotl to xpu finetune image.
* Upgrade peft to 0.10.0, transformers to 4.36.0.
* Add accelerate default config to home.
2024-05-15 09:42:43 +08:00
Qiyuan Gong
c11170b96f
Upgrade Peft to 0.10.0 in finetune examples and docker (#10930)
* Upgrade Peft to 0.10.0 in finetune examples.
* Upgrade Peft to 0.10.0 in docker.
2024-05-07 15:12:26 +08:00
Qiyuan Gong
41ffe1526c
Modify CPU finetune docker for bz2 error (#10919)
* Avoid bz2 error
* change to cpu torch
2024-05-06 10:41:50 +08:00
Heyang Sun
751f6d11d8
fix typos in qlora README (#10893) 2024-04-26 14:03:06 +08:00
Shaojun Liu
7297036c03
upgrade python (#10769) 2024-04-16 09:28:10 +08:00
Shaojun Liu
3590e1be83
revert python to 3.9 for finetune image (#10758) 2024-04-15 10:37:10 +08:00
Shaojun Liu
29bf28bd6f
Upgrade python to 3.11 in Docker Image (#10718)
* install python 3.11 for cpu-inference docker image

* update xpu-inference dockerfile

* update cpu-serving image

* update qlora image

* update lora image

* update document
2024-04-10 14:41:27 +08:00
Heyang Sun
4f6df37805
fix wrong cpu core num seen by docker (#10645) 2024-04-03 15:52:25 +08:00
Heyang Sun
b8b923ed04
move chown step to behind add script in qlora Dockerfile 2024-04-02 23:04:51 +08:00
Shaojun Liu
a10f5a1b8d
add python style check (#10620)
* add python style check

* fix style checks

* update runner

* add ipex-llm-finetune-qlora-cpu-k8s to manually_build workflow

* update tag to 2.1.0-SNAPSHOT
2024-04-02 16:17:56 +08:00
Shaojun Liu
59058bb206
replace 2.5.0-SNAPSHOT with 2.1.0-SNAPSHOT for llm docker images (#10603) 2024-04-01 09:58:51 +08:00
ZehuaCao
52a2135d83
Replace ipex with ipex-llm (#10554)
* fix ipex with ipex_llm

* fix ipex with ipex_llm

* update

* update

* update

* update

* update

* update

* update

* update
2024-03-28 13:54:40 +08:00
Cheen Hau, 俊豪
1c5eb14128
Update pip install to use --extra-index-url for ipex package (#10557)
* Change to 'pip install .. --extra-index-url' for readthedocs

* Change to 'pip install .. --extra-index-url' for examples

* Change to 'pip install .. --extra-index-url' for remaining files

* Fix URL for ipex

* Add links for ipex US and CN servers

* Update ipex cpu url

* remove readme

* Update for github actions

* Update for dockerfiles
2024-03-28 09:56:23 +08:00
Wang, Jian4
e2d25de17d
Update_docker by heyang (#29) 2024-03-25 10:05:46 +08:00
Heyang Sun
c672e97239 Fix CPU finetuning docker (#10494)
* Fix CPU finetuning docker

* Update README.md
2024-03-21 11:53:30 +08:00
Wang, Jian4
1de13ea578 LLM: remove CPU english_quotes dataset and update docker example (#10399)
* update dataset

* update readme

* update docker cpu

* update xpu docker
2024-03-18 10:45:14 +08:00
ZehuaCao
146b77f113 fix qlora-finetune Dockerfile (#10379) 2024-03-12 13:20:06 +08:00
Shaojun Liu
079f2011ea Update bigdl-llm-finetune-qlora-xpu Docker Image (#10194)
* Bump oneapi version to 2024.0

* pip install bitsandbytes scipy

* Pin level-zero-gpu version

* Pin accelerate version 0.23.0
2024-02-21 15:18:27 +08:00
binbin Deng
171fb2d185 LLM: reorganize GPU finetuning examples (#9952) 2024-01-25 19:02:38 +08:00
ZehuaCao
d204125e88 [LLM] Use to build a more slim docker for k8s (#9608)
* Create Dockerfile.k8s

* Update Dockerfile

More slim standalone image

* Update Dockerfile

* Update Dockerfile.k8s

* Update bigdl-qlora-finetuing-entrypoint.sh

* Update qlora_finetuning_cpu.py

* Update alpaca_qlora_finetuning_cpu.py

Refer to this [pr](https://github.com/intel-analytics/BigDL/pull/9551/files#diff-2025188afa54672d21236e6955c7c7f7686bec9239532e41c7983858cc9aaa89), update the LoraConfig

* update

* update

* update

* update

* update

* update

* update

* update transformer version

* update Dockerfile

* update Docker image name

* fix error
2023-12-08 10:25:36 +08:00
Heyang Sun
4e70e33934 [LLM] code and document for distributed qlora (#9585)
* [LLM] code and document for distributed qlora

* doc

* refine for gradient checkpoint

* refine

* Update alpaca_qlora_finetuning_cpu.py

* Update alpaca_qlora_finetuning_cpu.py

* Update alpaca_qlora_finetuning_cpu.py

* add link in doc
2023-12-06 09:23:17 +08:00
Heyang Sun
74fd7077a2 [LLM] Multi-process and distributed QLoRA on CPU platform (#9491)
* [LLM] Multi-process and distributed QLoRA on CPU platform

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* enable llm-init and bind to socket

* refine

* Update Dockerfile

* add all files of qlora cpu example to /bigdl

* fix

* fix k8s

* Update bigdl-qlora-finetuing-entrypoint.sh

* Update bigdl-qlora-finetuing-entrypoint.sh

* Update bigdl-qlora-finetuning-job.yaml

* fix train sync and performance issues

* add node affinity

* disable user to tune cpu per pod

* Update bigdl-qlora-finetuning-job.yaml
2023-12-01 13:47:19 +08:00
Wang, Jian4
0f78ebe35e LLM : Add qlora cpu finetune docker image (#9271)
* init qlora cpu docker image

* update

* remove ipex and update

* update

* update readme

* update example and readme
2023-11-14 10:36:53 +08:00
Shaojun Liu
0e5ab5ebfc update docker tag to 2.5.0-SNAPSHOT (#9443) 2023-11-13 16:53:40 +08:00
Ziteng Zhang
4df66f5cbc Update llm-finetune-lora-cpu dockerfile and readme
* Update README.md

* Update Dockerfile
2023-11-02 16:26:24 +08:00
Ziteng Zhang
ca2965fb9f hosted k8s.png on readthedocs (#9258) 2023-10-24 15:07:16 +08:00
Shaojun Liu
9dc76f19c0 fix hadolint error (#9223) 2023-10-19 16:22:32 +08:00
Ziteng Zhang
0d62bd4adb Added Docker installation guide and modified link in Dockerfile (#9224)
* changed '/ppml' into '/bigdl' and modified llama-7b

* Added the contents of finetuning in README

* Modified link of qlora_finetuning.py in Dockerfile
2023-10-19 15:28:05 +08:00
Ziteng Zhang
2f14f53b1c changed '/ppml' into '/bigdl' and modified llama-7b (#9209) 2023-10-18 10:25:12 +08:00
Ziteng Zhang
4a0a3c376a Add stand-alone mode on cpu for finetuning (#9127)
* Added steps for finetune on CPU in stand-alone mode

* Add stand-alone mode to bigdl-lora-finetuing-entrypoint.sh

* delete redundant docker commands

* Update README.md

Switch to intelanalytics/bigdl-llm-finetune-cpu:2.4.0-SNAPSHOT and append example outputs so users can check that the run is working

* Update bigdl-lora-finetuing-entrypoint.sh

Add some tunable parameters

* Add parameters --cpus and -e WORKER_COUNT_DOCKER

* Modified the cpu number range parameters

* Set -ppn to CCL_WORKER_COUNT

* Add related configuration suggestions in README.md
2023-10-11 15:01:21 +08:00
Lilac09
1e78b0ac40 Optimize LoRA Docker by Shrinking Image Size (#9110)
* modify dockerfile

* modify dockerfile
2023-10-10 15:53:17 +08:00
Heyang Sun
2c0c9fecd0 refine LLM containers (#9109) 2023-10-09 15:45:30 +08:00
Heyang Sun
2756f9c20d XPU QLoRA Container (#9082)
* XPU QLoRA Container

* fix apt issue

* refine
2023-10-08 11:04:20 +08:00
Heyang Sun
0b40ef8261 separate trusted and native llm cpu finetune from lora (#9050)
* separate trusted-llm and bigdl from lora finetuning

* add k8s for trusted llm finetune

* refine

* refine

* rename cpu to tdx in trusted llm

* solve conflict

* fix typo

* resolving conflict

* Delete docker/llm/finetune/lora/README.md

* fix

---------

Co-authored-by: Uxito-Ada <seusunheyang@foxmail.com>
Co-authored-by: leonardozcm <leonardo1997zcm@gmail.com>
2023-10-07 15:26:59 +08:00
Ziteng Zhang
a717352c59 Replace Llama 7b to Llama2-7b in README.md (#9055)
* Replace Llama 7b with Llama2-7b in README.md

Need to replace the base model with Llama2-7b as we are operating on Llama2 here.

* Replace Llama 7b to Llama2-7b in README.md

A Llama 7b in the 1st line was missed

* Update architecture graph

---------

Co-authored-by: Heyang Sun <60865256+Uxito-Ada@users.noreply.github.com>
2023-09-26 09:56:46 +08:00
Heyang Sun
4b843d1dbf change lora-model output behavior on k8s (#9038)
Co-authored-by: leonardozcm <leonardo1997zcm@gmail.com>
2023-09-25 09:28:44 +08:00
Xiangyu Tian
52878d3e5f [PPML] Enable TLS in Attestation API Serving for LLM finetuning (#8945)
Add enableTLS flag to enable TLS in Attestation API Serving for LLM finetuning.
2023-09-18 09:32:25 +08:00
Heyang Sun
aeef73a182 Tell User How to Find Fine-tuned Model in README (#8985)
* Tell User How to Find Fine-tuned Model in README

* Update README.md
2023-09-15 13:45:40 +08:00
Xiangyu Tian
4dce238867 Fix incorrect usage in docs of Finetuning to enable TDX (#8932) 2023-09-08 16:03:14 +08:00
Xiangyu Tian
ea6d4148e9 [PPML] Add attestation for LLM Finetuning (#8908)
Add TDX attestation for LLM Finetuning in TDX CoCo

---------

Co-authored-by: Heyang Sun <60865256+Uxito-Ada@users.noreply.github.com>
2023-09-08 10:24:04 +08:00
Heyang Sun
2d97827ec5 fix typo in lora entrypoint (#8862) 2023-09-06 13:52:25 +08:00
Heyang Sun
b1ac8dc1bc BF16 Lora Finetuning on K8S with OneCCL and Intel MPI (#8775)
* BF16 Lora Finetuning on K8S with OneCCL and Intel MPI

* Update README.md

* format

* refine

* Update README.md

* refine

* Update README.md

* increase nfs volume size to improve IO performance

* fix bugs

* Update README.md

* Update README.md

* fix permission

* move output destination

* Update README.md

* fix wrong base model name in doc

* fix output path in entrypoint

* add a permission-precreated output dir

* format

* move output logs to a persistent storage
2023-08-31 14:56:23 +08:00