Wang, Jian4 | b119825152 | 2024-07-30 16:37:44 +08:00
Remove tgi parameter validation (#11688)
* remove validation
* add min warm up
* remove no need source

Guancheng Fu | 86fc0492f4 | 2024-07-26 09:38:39 +08:00
Update oneccl used (#11647)
* Add internal oneccl
* fix
* fix
* add oneccl

Wang, Jian4 | 1eed0635f2 | 2024-07-19 13:15:56 +08:00
Add lightweight serving and support tgi parameter (#11600)
* init tgi request
* update openai api
* update for pp
* update and add readme
* add to docker
* add start bash
* update
* update
* update

Wang, Jian4 | 9c15abf825 | 2024-07-17 11:12:43 +08:00
Refactor fastapi-serving and add one card serving (#11581)
* init fastapi-serving one card
* mv api code to source
* update worker
* update for style-check
* add worker
* update bash
* update
* update worker name and add readme
* rename update
* rename to fastapi

Xiangyu Tian | 7f5111a998 | 2024-07-11 15:45:27 +08:00
LLM: Refine start script for Pipeline Parallel Serving (#11557)
Refine start script and readme for Pipeline Parallel Serving

binbin Deng | 66f6ffe4b2 | 2024-07-08 17:58:06 +08:00
Update GPU HF-Transformers example structure (#11526)

Shaojun Liu | 72b4efaad4 | 2024-07-04 20:18:38 +08:00
Enhanced XPU Dockerfiles: Optimized Environment Variables and Documentation (#11506)
* Added SYCL_CACHE_PERSISTENT=1 to xpu Dockerfile
* Update the document to add explanations for environment variables.
* update quickstart

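For reference, SYCL_CACHE_PERSISTENT=1 enables the SYCL runtime's persistent kernel cache, so JIT-compiled GPU kernels are stored on disk and reused across runs instead of being recompiled. A minimal illustrative sketch of the setting (an assumption for illustration, not the Dockerfile change itself):

```python
# Illustrative sketch only, not the actual Dockerfile change: the image exports
# SYCL_CACHE_PERSISTENT=1 so the SYCL runtime keeps its JIT kernel cache on disk
# and later runs can skip recompilation.
import os

# Must be set before the XPU/SYCL runtime initializes (e.g. before importing
# torch and intel_extension_for_pytorch in a Python workload).
os.environ.setdefault("SYCL_CACHE_PERSISTENT", "1")
```
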
Guancheng Fu | 4fbb0d33ae | 2024-07-01 21:41:02 +08:00
Pin compute runtime version for xpu images (#11479)
* pin compute runtime version
* fix done

Wang, Jian4 | e000ac90c4 | 2024-06-28 16:45:25 +08:00
Add pp_serving example to serving image (#11433)
* init pp
* update
* update
* no clone ipex-llm again

Wang, Jian4 | b7bc1023fb | 2024-06-28 14:59:06 +08:00
Add vllm_online_benchmark.py (#11458)
* init
* update and add
* update

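An online benchmark of this kind drives a running serving endpoint with real HTTP requests and reports end-to-end latency. A rough, self-contained sketch of the idea (not the actual vllm_online_benchmark.py; the endpoint URL and model name are placeholders):

```python
# Rough sketch of an online serving benchmark, not the actual
# vllm_online_benchmark.py: send a few prompts to an OpenAI-compatible
# endpoint and report average end-to-end latency.
import time

import requests

URL = "http://localhost:8000/v1/completions"  # assumed endpoint
PAYLOAD = {"model": "my-model", "prompt": "Hello, world!", "max_tokens": 32}

latencies = []
for _ in range(8):
    start = time.perf_counter()
    requests.post(URL, json=PAYLOAD, timeout=120).raise_for_status()
    latencies.append(time.perf_counter() - start)

print(f"avg end-to-end latency: {sum(latencies) / len(latencies):.3f} s")
```
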
Shaojun Liu | 5aa3e427a9 | 2024-06-20 15:44:55 +08:00
Fix docker images (#11362)
* Fix docker images
* add-apt-repository requires gnupg, gpg-agent, software-properties-common
* update
* avoid importing ipex again

Xiangyu Tian | ef9f740801 | 2024-06-18 16:27:51 +08:00
Docs: Fix CPU Serving Docker README (#11351)
Fix CPU Serving Docker README

Guancheng Fu | c9b4cadd81 | 2024-06-18 16:23:53 +08:00
fix vLLM/docker issues (#11348)
* fix
* fix
* fix

Qiyuan Gong | de4bb97b4f | 2024-06-17 17:52:12 +08:00
Remove accelerate 0.23.0 install command in readme and docker (#11333)
* ipex-llm's accelerate has been upgraded to 0.23.0. Remove the accelerate 0.23.0 install command from the README and Docker.

Shaojun Liu | 77809be946 | 2024-06-14 15:26:01 +08:00
Install packages for ipex-llm-serving-cpu docker image (#11321)
* apt-get install patch
* Update Dockerfile
* Update Dockerfile
* revert

Shaojun Liu | 9760ffc256 | 2024-06-13 15:31:22 +08:00
Fix SDLe CT222 Vulnerabilities (#11237)
* fix ct222 vuln
* update
* fix
* update ENTRYPOINT
* revert ENTRYPOINT
* Fix CT222 Vulns
* fix
* revert changes
* fix
* revert
* add sudo permission to ipex-llm user
* do not use ipex-llm user

Shaojun Liu | 84f04087fb | 2024-06-13 14:29:14 +08:00
Add intelanalytics/ipex-llm:sources image for OSPDT (#11296)
* Add intelanalytics/ipex-llm:sources image
* apt-get source

Guancheng Fu | 2e75bbccf9 | 2024-06-12 17:43:06 +08:00
Add more control arguments for benchmark_vllm_throughput (#11291)

Guancheng Fu | eeffeeb2e2 | 2024-06-06 17:44:19 +08:00
fix benchmark script (#11243)

Shaojun Liu | 1f2057b16a | 2024-06-05 11:13:17 +08:00
Fix ipex-llm-cpu docker image (#11213)
* fix
* fix ipex-llm-cpu image

Xiangyu Tian | ac3d53ff5d | 2024-06-04 19:10:23 +08:00
LLM: Fix vLLM CPU version error (#11206)
Fix vLLM CPU version error

Guancheng Fu | 3ef4aa98d1 | 2024-06-04 18:46:27 +08:00
Refine vllm_quickstart doc (#11199)
* refine doc
* refine

Shaojun Liu | 744042d1b2 | 2024-06-04 17:37:42 +08:00
remove software-properties-common from Dockerfile (#11203)

Guancheng Fu | daf7b1cd56 | 2024-05-27 16:20:13 +08:00
[Docker] Fix image using two cards error (#11144)
* fix all
* done

Qiyuan Gong | 21a1a973c1 | 2024-05-24 13:54:19 +08:00
Remove axolotl and python3-blinker (#11127)
* Remove axolotl from image to reduce image size.
* Remove python3-blinker to avoid axolotl lib conflict.

Wang, Jian4 | 1443b802cc | 2024-05-24 09:49:44 +08:00
Docker: Fix building cpp_docker and remove unimportant dependencies (#11114)
* test build
* update

Xiangyu Tian | b3f6faa038 | 2024-05-24 09:16:59 +08:00
LLM: Add CPU vLLM entrypoint (#11083)
Add CPU vLLM entrypoint and update CPU vLLM serving example.

Shaojun Liu | e0f401d97d | 2024-05-23 16:15:45 +08:00
FIX: APT Repository not working (signatures invalid) (#11112)
* chmod 644 gpg key
* chmod 644 gpg key

binbin Deng | ecb16dcf14 | 2024-05-21 14:49:54 +08:00
Add deepspeed autotp support for xpu docker (#11077)

Wang, Jian4 | 00d4410746 | 2024-05-16 14:55:13 +08:00
Update cpp docker quickstart (#11040)
* add sample output
* update link
* update
* update header
* update

Guancheng Fu | 7e29928865 | 2024-05-16 09:30:36 +08:00
refactor serving docker image (#11028)

Wang, Jian4 | 86cec80b51 | 2024-05-15 11:10:22 +08:00
LLM: Add llm inference_cpp_xpu_docker (#10933)
* test_cpp_docker
* update
* update
* update
* update
* add sudo
* update nodejs version
* no need npm
* remove blinker
* new cpp docker
* restore
* add line
* add manually_build
* update and add mtl
* update for workdir llm
* add benchmark part
* update readme
* update 1024-128
* update readme
* update
* fix
* update
* update
* update readme too
* update readme
* no change
* update dir_name
* update readme

Qiyuan Gong | 1e00bd7bbe | 2024-05-15 09:42:43 +08:00
Re-org XPU finetune images (#10971)
* Rename xpu finetune image from `ipex-llm-finetune-qlora-xpu` to `ipex-llm-finetune-xpu`.
* Add axolotl to xpu finetune image.
* Upgrade peft to 0.10.0, transformers to 4.36.0.
* Add accelerate default config to home.

Shengsheng Huang | 0b7e78b592 | 2024-05-14 18:43:41 +08:00
revise the benchmark part in python inference docker (#11020)

Shengsheng Huang | 586a151f9c | 2024-05-14 17:56:11 +08:00
update the README and reorganize the docker guides structure (#11016)
* update the README and reorganize the docker guides structure.
* modified docker install guide into overview

Shaojun Liu | 7f8c5b410b | 2024-05-14 12:58:31 +08:00
Quickstart: Run PyTorch Inference on Intel GPU using Docker (on Linux or WSL) (#10970)
* add entrypoint.sh
* add quickstart
* remove entrypoint
* update
* Install related library of benchmarking
* update
* print out results
* update docs
* minor update
* update
* update quickstart
* update
* update
* update
* update
* update
* update
* add chat & example section
* add more details
* minor update
* rename quickstart
* update
* minor update
* update
* update config.yaml
* update readme
* use --gpu
* add tips
* minor update
* update

Zephyr1101 | 7e7d969dcb | 2024-05-08 17:12:50 +08:00
An experimental fix for workflow abuse, step 1: fix a typo (#10965)
* Update llm_unit_tests.yml
* Update README.md
* Update llm_unit_tests.yml
* Update llm_unit_tests.yml

Qiyuan Gong | c11170b96f | 2024-05-07 15:12:26 +08:00
Upgrade Peft to 0.10.0 in finetune examples and docker (#10930)
* Upgrade Peft to 0.10.0 in finetune examples.
* Upgrade Peft to 0.10.0 in docker.

Qiyuan Gong | 41ffe1526c | 2024-05-06 10:41:50 +08:00
Modify CPU finetune docker for bz2 error (#10919)
* Avoid bz2 error
* change to cpu torch

Guancheng Fu | 2c64754eb0 | 2024-04-29 17:25:42 +08:00
Add vLLM to ipex-llm serving image (#10807)
* add vllm
* done
* doc work
* fix done
* temp
* add docs
* format
* add start-fastchat-service.sh
* fix

Heyang Sun | 751f6d11d8 | 2024-04-26 14:03:06 +08:00
fix typos in qlora README (#10893)

Guancheng Fu | 3b82834aaf | 2024-04-22 14:18:51 +08:00
Update README.md (#10838)

Shaojun Liu | 7297036c03 | 2024-04-16 09:28:10 +08:00
upgrade python (#10769)

Shaojun Liu | 3590e1be83 | 2024-04-15 10:37:10 +08:00
revert python to 3.9 for finetune image (#10758)

Shaojun Liu | 29bf28bd6f | 2024-04-10 14:41:27 +08:00
Upgrade python to 3.11 in Docker Image (#10718)
* install python 3.11 for cpu-inference docker image
* update xpu-inference dockerfile
* update cpu-serving image
* update qlora image
* update lora image
* update document

Heyang Sun | 4f6df37805 | 2024-04-03 15:52:25 +08:00
fix wrong cpu core num seen by docker (#10645)

Shaojun Liu | 1aef3bc0ab | 2024-04-03 11:33:13 +08:00
verify and refine ipex-llm-finetune-qlora-xpu docker document (#10638)
* verify and refine finetune-xpu document
* update export_merged_model.py link
* update link

Heyang Sun | b8b923ed04 | 2024-04-02 23:04:51 +08:00
move chown step to after the add script step in qlora Dockerfile

Shaojun Liu | a10f5a1b8d | 2024-04-02 16:17:56 +08:00
add python style check (#10620)
* add python style check
* fix style checks
* update runner
* add ipex-llm-finetune-qlora-cpu-k8s to manually_build workflow
* update tag to 2.1.0-SNAPSHOT

Shaojun Liu | 20a5e72da0 | 2024-04-02 11:45:45 +08:00
refine and verify ipex-llm-serving-xpu docker document (#10615)
* refine serving on cpu/xpu
* minor fix
* replace localhost with 0.0.0.0 so that the service can be accessed through the host's IP address (see the sketch below)

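Binding to 0.0.0.0 rather than localhost is what makes a containerized service reachable from other machines via the host's IP address. A minimal sketch of the difference, assuming a generic FastAPI/uvicorn-style server (an illustration, not the repository's actual serving entrypoint):

```python
# Minimal sketch, not the repository's actual serving entrypoint: it only
# illustrates why the start script binds to 0.0.0.0 instead of localhost.
import uvicorn
from fastapi import FastAPI

app = FastAPI()

@app.get("/health")
def health():
    # Simple liveness probe for the serving container.
    return {"status": "ok"}

if __name__ == "__main__":
    # host="127.0.0.1" accepts only connections from inside the container;
    # host="0.0.0.0" listens on all interfaces, so the service is reachable
    # through the host's IP address once the container port is published.
    uvicorn.run(app, host="0.0.0.0", port=8000)
```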