5 commits

Author SHA1 Message Date
Wang, Jian4
e2d25de17d Update_docker by heyang (#29) 2024-03-25 10:05:46 +08:00
ZehuaCao
146b77f113 fix qlora-finetune Dockerfile (#10379) 2024-03-12 13:20:06 +08:00
ZehuaCao
d204125e88 [LLM] Use to build a more slim docker for k8s (#9608)
* Create Dockerfile.k8s

* Update Dockerfile

More slim standalone image

* Update Dockerfile

* Update Dockerfile.k8s

* Update bigdl-qlora-finetuing-entrypoint.sh

* Update qlora_finetuning_cpu.py

* Update alpaca_qlora_finetuning_cpu.py

Refer to this [pr](https://github.com/intel-analytics/BigDL/pull/9551/files#diff-2025188afa54672d21236e6955c7c7f7686bec9239532e41c7983858cc9aaa89), update the LoraConfig

* update

* update

* update

* update

* update

* update

* update

* update transformer version

* update Dockerfile

* update Docker image name

* fix error
2023-12-08 10:25:36 +08:00
Heyang Sun
4e70e33934 [LLM] code and document for distributed qlora (#9585)
* [LLM] code and document for distributed qlora

* doc

* refine for gradient checkpoint

* refine

* Update alpaca_qlora_finetuning_cpu.py

* Update alpaca_qlora_finetuning_cpu.py

* Update alpaca_qlora_finetuning_cpu.py

* add link in doc
2023-12-06 09:23:17 +08:00
Heyang Sun
74fd7077a2 [LLM] Multi-process and distributed QLoRA on CPU platform (#9491)
* [LLM] Multi-process and distributed QLoRA on CPU platform

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* enable llm-init and bind to socket

* refine

* Update Dockerfile

* add all files of qlora cpu example to /bigdl

* fix

* fix k8s

* Update bigdl-qlora-finetuing-entrypoint.sh

* Update bigdl-qlora-finetuing-entrypoint.sh

* Update bigdl-qlora-finetuning-job.yaml

* fix train sync and performance issues

* add node affinity

* disable user to tune cpu per pod

* Update bigdl-qlora-finetuning-job.yaml
2023-12-01 13:47:19 +08:00