Commit graph

127 commits

Author SHA1 Message Date
Wang, Jian4
e2d25de17d
Update_docker by heyang (#29) 2024-03-25 10:05:46 +08:00
Wang, Jian4
9df70d95eb
Refactor bigdl.llm to ipex_llm (#24)
* Rename bigdl/llm to ipex_llm

* rm python/llm/src/bigdl

* from bigdl.llm to from ipex_llm
2024-03-22 15:41:21 +08:00
Heyang Sun
c672e97239 Fix CPU finetuning docker (#10494)
* Fix CPU finetuning docker

* Update README.md
2024-03-21 11:53:30 +08:00
Shaojun Liu
0e388f4b91 Fix Trivy Docker Image Vulnerabilities for BigDL Release 2.5.0 (#10447)
* Update pypi version to fix trivy issues

* refine
2024-03-19 14:52:15 +08:00
Wang, Jian4
1de13ea578 LLM: remove CPU english_quotes dataset and update docker example (#10399)
* update dataset

* update readme

* update docker cpu

* update xpu docker
2024-03-18 10:45:14 +08:00
ZehuaCao
146b77f113 fix qlora-finetune Dockerfile (#10379) 2024-03-12 13:20:06 +08:00
ZehuaCao
267de7abc3 fix fschat DEP version error (#10325) 2024-03-06 16:15:27 +08:00
Lilac09
a2ed4d714e Fix vllm service error (#10279) 2024-02-29 15:45:04 +08:00
Ziteng Zhang
e08c74f1d1 Fix build error of bigdl-llm-cpu (#10228) 2024-02-23 16:30:21 +08:00
Ziteng Zhang
f7e2591f15 [LLM] change IPEX230 to IPEX220 in dockerfile (#10222)
* change IPEX230 to IPEX220 in dockerfile
2024-02-23 15:02:08 +08:00
Shaojun Liu
079f2011ea Update bigdl-llm-finetune-qlora-xpu Docker Image (#10194)
* Bump oneapi version to 2024.0

* pip install bitsandbytes scipy

* Pin level-zero-gpu version

* Pin accelerate version 0.23.0
2024-02-21 15:18:27 +08:00
Lilac09
eca69a6022 Fix build error of bigdl-llm-cpu (#10176)
* fix build error

* fix build error

* fix build error

* fix build error
2024-02-20 14:50:12 +08:00
Lilac09
f8dcaff7f4 use default python (#10070) 2024-02-05 09:06:59 +08:00
Lilac09
72e67eedbb Add speculative support in docker (#10058)
* add speculative environment

* add speculative environment

* add speculative environment
2024-02-01 09:53:53 +08:00
binbin Deng
171fb2d185 LLM: reorganize GPU finetuning examples (#9952) 2024-01-25 19:02:38 +08:00
ZehuaCao
51aa8b62b2 add gradio_web_ui to llm-serving image (#9918) 2024-01-25 11:11:39 +08:00
Lilac09
de27ddd81a Update Dockerfile (#9981) 2024-01-24 11:10:06 +08:00
Lilac09
a2718038f7 Fix qwen model adapter in docker (#9969)
* fix qwen in docker

* add patch for model_adapter.py in fastchat

* add patch for model_adapter.py in fastchat
2024-01-24 11:01:29 +08:00
Lilac09
052962dfa5 Using original fastchat and add bigdl worker in docker image (#9967)
* add vllm worker

* add options in entrypoint
2024-01-23 14:17:05 +08:00
Shaojun Liu
32c56ffc71 pip install deps (#9916) 2024-01-17 11:03:57 +08:00
ZehuaCao
05ea0ecd70 add pv for llm-serving k8s deployment (#9906) 2024-01-16 11:32:54 +08:00
Guancheng Fu
0396fafed1 Update BigDL-LLM-inference image (#9805)
* upgrade to oneapi 2024

* Pin level-zero-gpu version

* add flag
2024-01-03 14:00:09 +08:00
Lilac09
a5c481fedd add chat.py dependency in Dockerfile (#9699) 2023-12-18 09:00:22 +08:00
Lilac09
3afed99216 fix path issue (#9696) 2023-12-15 11:21:49 +08:00
ZehuaCao
d204125e88 [LLM] Use to build a more slim docker for k8s (#9608)
* Create Dockerfile.k8s

* Update Dockerfile

More slim standalone image

* Update Dockerfile

* Update Dockerfile.k8s

* Update bigdl-qlora-finetuing-entrypoint.sh

* Update qlora_finetuning_cpu.py

* Update alpaca_qlora_finetuning_cpu.py

Refer to this [pr](https://github.com/intel-analytics/BigDL/pull/9551/files#diff-2025188afa54672d21236e6955c7c7f7686bec9239532e41c7983858cc9aaa89), update the LoraConfig

* update

* update

* update

* update

* update

* update

* update

* update transformer version

* update Dockerfile

* update Docker image name

* fix error
2023-12-08 10:25:36 +08:00
Heyang Sun
4e70e33934 [LLM] code and document for distributed qlora (#9585)
* [LLM] code and document for distributed qlora

* doc

* refine for gradient checkpoint

* refine

* Update alpaca_qlora_finetuning_cpu.py

* Update alpaca_qlora_finetuning_cpu.py

* Update alpaca_qlora_finetuning_cpu.py

* add link in doc
2023-12-06 09:23:17 +08:00
Guancheng Fu
8b00653039 fix doc (#9599) 2023-12-05 13:49:31 +08:00
Heyang Sun
74fd7077a2 [LLM] Multi-process and distributed QLoRA on CPU platform (#9491)
* [LLM] Multi-process and distributed QLoRA on CPU platform

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* enable llm-init and bind to socket

* refine

* Update Dockerfile

* add all files of qlora cpu example to /bigdl

* fix

* fix k8s

* Update bigdl-qlora-finetuing-entrypoint.sh

* Update bigdl-qlora-finetuing-entrypoint.sh

* Update bigdl-qlora-finetuning-job.yaml

* fix train sync and performance issues

* add node affinity

* disable user to tune cpu per pod

* Update bigdl-qlora-finetuning-job.yaml
2023-12-01 13:47:19 +08:00
Lilac09
b785376f5c Add vllm-example to docker inference image (#9570)
* add vllm-serving to cpu image

* add vllm-serving to cpu image

* add vllm-serving
2023-11-30 17:04:53 +08:00
Lilac09
2554ba0913 Add usage of vllm (#9564)
* add usage of vllm

* add usage of vllm

* add usage of vllm

* add usage of vllm

* add usage of vllm

* add usage of vllm
2023-11-30 14:19:23 +08:00
Lilac09
557bb6bbdb add judgement for running serve (#9555) 2023-11-29 16:57:00 +08:00
Guancheng Fu
2b200bf2f2 Add vllm_worker related arguments in docker serving image's entrypoint (#9500)
* fix entrypoint

* fix missing long mode argument
2023-11-21 14:41:06 +08:00
Lilac09
566ec85113 add stream interval option to entrypoint (#9498) 2023-11-21 09:47:32 +08:00
Lilac09
13f6eb77b4 Add exec bash to entrypoint.sh to keep container running after being booted. (#9471)
* add bigdl-llm-init

* boot bash
2023-11-15 16:09:16 +08:00
Lilac09
24146d108f add bigdl-llm-init (#9468) 2023-11-15 14:55:33 +08:00
Lilac09
b2b085550b Remove bigdl-nano and add ipex into inference-cpu image (#9452)
* remove bigdl-nano and add ipex into inference-cpu image

* remove bigdl-nano in docker

* remove bigdl-nano in docker
2023-11-14 10:50:52 +08:00
Wang, Jian4
0f78ebe35e LLM : Add qlora cpu finetune docker image (#9271)
* init qlora cpu docker image

* update

* remove ipex and update

* update

* update readme

* update example and readme
2023-11-14 10:36:53 +08:00
Shaojun Liu
0e5ab5ebfc update docker tag to 2.5.0-SNAPSHOT (#9443) 2023-11-13 16:53:40 +08:00
Lilac09
5d4ec44488 Add all-in-one benchmark into inference-cpu docker image (#9433)
* add all-in-one into inference-cpu image

* manually_build

* revise files
2023-11-13 13:07:56 +08:00
Lilac09
74a8ad32dc Add entry point to llm-serving-xpu (#9339)
* add entry point to llm-serving-xpu

* manually build

* manually build

* add entry point to llm-serving-xpu

* manually build

* add entry point to llm-serving-xpu

* add entry point to llm-serving-xpu

* add entry point to llm-serving-xpu
2023-11-02 16:31:07 +08:00
Ziteng Zhang
4df66f5cbc Update llm-finetune-lora-cpu dockerfile and readme
* Update README.md

* Update Dockerfile
2023-11-02 16:26:24 +08:00
Lilac09
2c2bc959ad add tools into previously built images (#9317)
* modify Dockerfile

* manually build

* modify Dockerfile

* add chat.py into inference-xpu

* add benchmark into inference-cpu

* manually build

* add benchmark into inference-cpu

* add benchmark into inference-cpu

* add benchmark into inference-cpu

* add chat.py into inference-xpu

* add chat.py into inference-xpu

* change ADD to COPY in dockerfile

* fix dependency issue

* temporarily remove run-spr in llm-cpu

* temporarily remove run-spr in llm-cpu
2023-10-31 16:35:18 +08:00
Lilac09
030edeecac Ubuntu upgrade: fix installation error (#9309)
* upgrade ubuntu version in llm-inference cpu image

* fix installation issue

* fix installation issue

* fix installation issue
2023-10-31 09:55:15 +08:00
Lilac09
5842f7530e upgrade ubuntu version in llm-inference cpu image (#9307) 2023-10-30 16:51:38 +08:00
Ziteng Zhang
ca2965fb9f hosted k8s.png on readthedocs (#9258) 2023-10-24 15:07:16 +08:00
Guancheng Fu
7f66bc5c14 Fix bigdl-llm-serving-cpu Dockerfile (#9247) 2023-10-23 16:51:30 +08:00
Shaojun Liu
9dc76f19c0 fix hadolint error (#9223) 2023-10-19 16:22:32 +08:00
Ziteng Zhang
0d62bd4adb Added Docker installation guide and modified link in Dockerfile (#9224)
* changed '/ppml' into '/bigdl' and modified llama-7b

* Added the contents of finetuning in README

* Modified link of qlora_finetuning.py in Dockerfile
2023-10-19 15:28:05 +08:00
Lilac09
160c543a26 README for BigDL-LLM on docker (#9197)
* add instruction for MacOS/Linux

* modify html label of gif images

* organize structure of README

* change title name

* add inference-xpu, serving-cpu and serving-xpu parts

* revise README

* revise README

* revise README
2023-10-19 13:48:06 +08:00
Ziteng Zhang
2f14f53b1c changed '/ppml' into '/bigdl' and modified llama-7b (#9209) 2023-10-18 10:25:12 +08:00
Lilac09
326ef7f491 add README for llm-inference-cpu (#9147)
* add README for llm-inference-cpu

* modify README

* add README for llm-inference-cpu on Windows
2023-10-16 10:27:44 +08:00
Lilac09
e02fbb40cc add bigdl-llm-tutorial into llm-inference-cpu image (#9139)
* add bigdl-llm-tutorial into llm-inference-cpu image

* modify Dockerfile

* modify Dockerfile
2023-10-11 16:41:04 +08:00
Ziteng Zhang
4a0a3c376a Add stand-alone mode on cpu for finetuning (#9127)
* Added steps for finetune on CPU in stand-alone mode

* Add stand-alone mode to bigdl-lora-finetuing-entrypoint.sh

* delete redundant docker commands

* Update README.md

Switch to intelanalytics/bigdl-llm-finetune-cpu:2.4.0-SNAPSHOT and append example outputs so users can check that the job is running

* Update bigdl-lora-finetuing-entrypoint.sh

Add some tunable parameters

* Add parameters --cpus and -e WORKER_COUNT_DOCKER

* Modified the cpu number range parameters

* Set -ppn to CCL_WORKER_COUNT

* Add related configuration suggestions in README.md
2023-10-11 15:01:21 +08:00
Lilac09
30e3c196f3 Merge pull request #9108 from Zhengjin-Wang/main
Add instruction for chat.py in bigdl-llm-cpu
2023-10-10 16:40:52 +08:00
Lilac09
1e78b0ac40 Optimize LoRA Docker by Shrinking Image Size (#9110)
* modify dockerfile

* modify dockerfile
2023-10-10 15:53:17 +08:00
Heyang Sun
2c0c9fecd0 refine LLM containers (#9109) 2023-10-09 15:45:30 +08:00
Wang
a1aefdb8f4 modify README 2023-10-09 13:36:29 +08:00
Wang
3814abf95a add instruction for chat.py 2023-10-09 12:57:28 +08:00
Guancheng Fu
df8df751c4 Modify readme for bigdl-llm-serving-cpu (#9105) 2023-10-09 09:56:09 +08:00
Heyang Sun
2756f9c20d XPU QLoRA Container (#9082)
* XPU QLoRA Container

* fix apt issue

* refine
2023-10-08 11:04:20 +08:00
Heyang Sun
0b40ef8261 separate trusted and native llm cpu finetune from lora (#9050)
* separate trusted-llm and bigdl from lora finetuning

* add k8s for trusted llm finetune

* refine

* refine

* rename cpu to tdx in trusted llm

* solve conflict

* fix typo

* resolving conflict

* Delete docker/llm/finetune/lora/README.md

* fix

---------

Co-authored-by: Uxito-Ada <seusunheyang@foxmail.com>
Co-authored-by: leonardozcm <leonardo1997zcm@gmail.com>
2023-10-07 15:26:59 +08:00
ZehuaCao
b773d67dd4 Add Kubernetes support for BigDL-LLM-serving CPU. (#9071) 2023-10-07 09:37:48 +08:00
Lilac09
ecee02b34d Add bigdl llm xpu image build (#9062)
* modify Dockerfile

* add README.md

* add README.md

* Modify Dockerfile

* Add bigdl inference cpu image build

* Add bigdl llm cpu image build

* Add bigdl llm cpu image build

* Add bigdl llm cpu image build

* Modify Dockerfile

* Add bigdl inference cpu image build

* Add bigdl inference cpu image build

* Add bigdl llm xpu image build
2023-09-26 14:29:03 +08:00
Lilac09
9ac950fa52 Add bigdl llm cpu image build (#9047)
* modify Dockerfile

* add README.md

* add README.md

* Modify Dockerfile

* Add bigdl inference cpu image build

* Add bigdl llm cpu image build

* Add bigdl llm cpu image build

* Add bigdl llm cpu image build
2023-09-26 13:22:11 +08:00
Ziteng Zhang
a717352c59 Replace Llama 7b to Llama2-7b in README.md (#9055)
* Replace Llama 7b with Llama2-7b in README.md

Need to replace the base model with Llama2-7b as we are operating on Llama2 here.

* Replace Llama 7b to Llama2-7b in README.md

A "Llama 7b" in the 1st line was missed

* Update architecture graph

---------

Co-authored-by: Heyang Sun <60865256+Uxito-Ada@users.noreply.github.com>
2023-09-26 09:56:46 +08:00
Guancheng Fu
cc84ed70b3 Create serving images (#9048)
* Finished & Tested

* Install latest pip from base images

* Add blank line

* Delete unused comment

* fix typos
2023-09-25 15:51:45 +08:00
Heyang Sun
4b843d1dbf change lora-model output behavior on k8s (#9038)
Co-authored-by: leonardozcm <leonardo1997zcm@gmail.com>
2023-09-25 09:28:44 +08:00
Lilac09
9126abdf9b add README.md for bigdl-llm-cpu image (#9026)
* modify Dockerfile

* add README.md

* add README.md
2023-09-22 09:03:57 +08:00
Guancheng Fu
3913ba4577 add README.md (#9004) 2023-09-21 10:32:56 +08:00
Guancheng Fu
b6c9198d47 Add xpu image for bigdl-llm (#9003)
* Add xpu image

* fix

* fix

* fix format
2023-09-19 16:56:22 +08:00
Guancheng Fu
7353882732 add Dockerfile (#8993) 2023-09-18 13:25:37 +08:00
Xiangyu Tian
52878d3e5f [PPML] Enable TLS in Attestation API Serving for LLM finetuning (#8945)
Add enableTLS flag to enable TLS in Attestation API Serving for LLM finetuning.
2023-09-18 09:32:25 +08:00
Heyang Sun
aeef73a182 Tell User How to Find Fine-tuned Model in README (#8985)
* Tell User How to Find Fine-tuned Model in README

* Update README.md
2023-09-15 13:45:40 +08:00
Xiangyu Tian
4dce238867 Fix incorrect usage in docs of Finetuning to enable TDX (#8932) 2023-09-08 16:03:14 +08:00
Xiangyu Tian
ea6d4148e9 [PPML] Add attestation for LLM Finetuning (#8908)
Add TDX attestation for LLM Finetuning in TDX CoCo

---------

Co-authored-by: Heyang Sun <60865256+Uxito-Ada@users.noreply.github.com>
2023-09-08 10:24:04 +08:00
Heyang Sun
2d97827ec5 fix typo in lora entrypoint (#8862) 2023-09-06 13:52:25 +08:00
Heyang Sun
b1ac8dc1bc BF16 Lora Finetuning on K8S with OneCCL and Intel MPI (#8775)
* BF16 Lora Finetuning on K8S with OneCCL and Intel MPI

* Update README.md

* format

* refine

* Update README.md

* refine

* Update README.md

* increase nfs volume size to improve IO performance

* fix bugs

* Update README.md

* Update README.md

* fix permission

* move output destination

* Update README.md

* fix wrong base model name in doc

* fix output path in entrypoint

* add a permission-precreated output dir

* format

* move output logs to a persistent storage
2023-08-31 14:56:23 +08:00