Xiangyu Tian | b3f6faa038 | 2024-05-24 09:16:59 +08:00
LLM: Add CPU vLLM entrypoint (#11083)
Add CPU vLLM entrypoint and update CPU vLLM serving example.

Shengsheng Huang | 7ed270a4d8 | 2024-05-24 00:18:20 +08:00
update readme docker section, fix quickstart title, remove chs figure (#11044)
* update readme and fix quickstart title, remove chs figure
* update readme according to comment
* reorganize the docker guide structure

Yishuo Wang | 797dbc48b8 | 2024-05-23 17:37:37 +08:00
fix phi-2 and phi-3 convert (#11116)

Yishuo Wang | 37b98a531f | 2024-05-23 17:26:24 +08:00
support running internlm xcomposer2 on gpu and add sdp optimization (#11115)

Zhao Changmin | c5e8b90c8d | 2024-05-23 17:17:45 +08:00
Add Qwen register attention implemention (#11110)
* qwen_register

Yishuo Wang | 0e53f20edb | 2024-05-23 16:36:09 +08:00
support running internlm-xcomposer2 on cpu (#11111)

Shaojun Liu | e0f401d97d | 2024-05-23 16:15:45 +08:00
FIX: APT Repository not working (signatures invalid) (#11112)
* chmod 644 gpg key
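Background for the APT fix above: apt downloads run as the unprivileged `_apt` user, so a repository signing key installed without world-read permission cannot be opened during verification and apt reports the signatures as invalid. A minimal sketch with a stand-in key file (the filename and path are hypothetical, not the actual ipex-llm keyring):

```python
import os
import stat
import tempfile

# Stand-in key file; the real fix applies the same 644 permissions to the
# repository's gpg key so any user (including _apt) can read it.
path = os.path.join(tempfile.gettempdir(), "example-archive-keyring.gpg")
open(path, "w").close()

os.chmod(path, 0o600)  # too restrictive: only the owner can read the key
os.chmod(path, 0o644)  # the fix: owner read/write, world-readable

print(oct(stat.S_IMODE(os.stat(path).st_mode)))  # -> 0o644
```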
Yuwen Hu | d36b41d59e | 2024-05-22 18:20:30 +08:00
Add setuptools limitation for ipex-llm[xpu] (#11102)
* Add setuptool limitation for ipex-llm[xpu]
* llamaindex option update

Yishuo Wang | cd4dff09ee | 2024-05-22 17:43:50 +08:00
support phi-3 vision (#11101)

Zhao Changmin | 15d906a97b | 2024-05-22 17:18:07 +08:00
Update linux igpu run script (#11098)
* update run script

Kai Huang | f63172ef63 | 2024-05-22 16:43:11 +08:00
Align ppl with llama.cpp (#11055)
* update script
* remove
* add header
* update readme
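As background for the alignment above: llama.cpp reports perplexity as the exponential of the mean negative log-likelihood of the evaluated tokens, so matching its numbers is mainly a matter of computing the same quantity over the same token windows. A minimal sketch of that formula (a hypothetical helper for illustration, not the benchmark script itself):

```python
import math

def perplexity(token_logprobs):
    """Perplexity = exp(-(1/N) * sum of the N tokens' log-probabilities)."""
    n = len(token_logprobs)
    return math.exp(-sum(token_logprobs) / n)

# A model that assigns probability 0.5 to every token has perplexity ~2.
print(perplexity([math.log(0.5)] * 8))
```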
Qiyuan Gong | f6c9ffe4dc | 2024-05-22 15:20:53 +08:00
Add WANDB_MODE and HF_HUB_OFFLINE to XPU finetune README (#11097)
* Add WANDB_MODE=offline to avoid multi-GPUs finetune errors.
* Add HF_HUB_OFFLINE=1 to avoid Hugging Face related errors.
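The two switches added to the README above are standard environment variables of wandb and huggingface_hub rather than ipex-llm-specific options; they can be exported in the shell or set in-process before the finetune stack is imported, as in this sketch:

```python
import os

# wandb records runs locally instead of syncing to the server, avoiding
# network-related failures during multi-GPU finetuning.
os.environ["WANDB_MODE"] = "offline"

# huggingface_hub serves models/tokenizers from the local cache only,
# skipping all requests to the Hub.
os.environ["HF_HUB_OFFLINE"] = "1"
```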
Yuwen Hu | 1c5ed9b6cf | 2024-05-22 14:13:13 +08:00
Fix arc ut (#11096)

Guancheng Fu | 4fd1df9cf6 | 2024-05-22 11:23:22 +08:00
Add toc for docker quickstarts (#11095)
* fix
* fix

Shaojun Liu | 584439e498 | 2024-05-22 11:10:44 +08:00
update homepage url for ipex-llm (#11094)
* update homepage url
* Update python version to 3.11
* Update long description

Zhao Changmin | bf0f904e66 | 2024-05-22 11:01:56 +08:00
Update level_zero on MTL linux (#11085)
* Update level_zero on MTL
Co-authored-by: Yuwen Hu <yuwen.hu@intel.com>

Shaojun Liu | 8fdc8fb197 | 2024-05-22 09:29:42 +08:00
Quickstart: Run/Develop PyTorch in VSCode with Docker on Intel GPU (#11070)
* add quickstart: Run/Develop PyTorch in VSCode with Docker on Intel GPU
* add gif
* update index.rst
* update link
* update GIFs

Xin Qiu | 71bcd18f44 | 2024-05-21 18:40:29 +08:00
fix qwen vl (#11090)

Guancheng Fu | f654f7e08c | 2024-05-21 17:00:58 +08:00
Add serving docker quickstart (#11072)
* add temp file
* add initial docker readme
* temp
* done
* add fastchat service
* fix
* fix
* fix
* fix
* remove stale file

Yishuo Wang | f00625f9a4 | 2024-05-21 16:53:42 +08:00
refactor qwen2 (#11087)

Qiyuan Gong | 492ed3fd41 | 2024-05-21 15:49:15 +08:00
Add verified models to GPU finetune README (#11088)
* Add verified models to GPU finetune README

Qiyuan Gong | 1210491748 | 2024-05-21 15:29:43 +08:00
ChatGLM3, Baichuan2 and Qwen1.5 QLoRA example (#11078)
* Add chatglm3, qwen15-7b and baichuan-7b QLoRA alpaca example
* Remove unnecessary tokenization setting.

binbin Deng | ecb16dcf14 | 2024-05-21 14:49:54 +08:00
Add deepspeed autotp support for xpu docker (#11077)

ZehuaCao | 842d6dfc2d | 2024-05-21 13:55:47 +08:00
Further Modify CPU example (#11081)
* modify CPU example
* update

Yishuo Wang | d830a63bb7 | 2024-05-20 18:08:37 +08:00
refactor qwen (#11074)

Wang, Jian4 | 74950a152a | 2024-05-20 16:48:40 +08:00
Fix tgi_api_server error file name (#11075)

Yishuo Wang | 4e97047d70 | 2024-05-20 11:21:20 +08:00
fix baichuan2 13b fp16 (#11071)

binbin Deng | 7170dd9192 | 2024-05-20 10:53:17 +08:00
Update guide for running qwen with AutoTP (#11065)

Wang, Jian4 | a2e1578fd9 | 2024-05-20 09:15:03 +08:00
Merge tgi_api_server to main (#11036)
* init
* fix style
* speculative can not use benchmark
* add tgi server readme

Yuwen Hu | f60565adc7 | 2024-05-17 17:12:48 +08:00
Fix toc for vllm serving quickstart (#11068)

Guancheng Fu | dfac168d5f | 2024-05-17 16:52:17 +08:00
fix format/typo (#11067)

Yishuo Wang | 31ce3e0c13 | 2024-05-17 16:25:30 +08:00
refactor baichuan2-13b (#11064)

Guancheng Fu | 67db925112 | 2024-05-17 16:16:42 +08:00
Add vllm quickstart (#10978)
* temp
* add doc
* finish
* done
* fix
* add initial docker readme
* temp
* done fixing vllm_quickstart
* done
* remove not used file
* add
* fix

ZehuaCao | 56cb992497 | 2024-05-17 15:52:20 +08:00
LLM: Modify CPU Installation Command for most examples (#11049)
* init
* refine
* refine
* refine
* modify hf-agent example
* modify all CPU model example
* remove readthedoc modify
* replace powershell with cmd
* fix repo
* fix repo
* update
* remove comment on windows code block
* update
* update
* update
* update
Co-authored-by: xiangyuT <xiangyu.tian@intel.com>

Ruonan Wang | f1156e6b20 | 2024-05-17 14:30:09 +08:00
support gguf_q4k_m / gguf_q4k_s (#10887)
* initial commit
* UPDATE
* fix style
* fix style
* add gguf_q4k_s
* update comment
* fix

Yishuo Wang | 981d668be6 | 2024-05-17 13:01:34 +08:00
refactor baichuan2-7b (#11062)

Shaojun Liu | 84239d0bd3 | 2024-05-17 11:06:11 +08:00
Update docker image tags in Docker Quickstart (#11061)
* update docker image tag to latest
* add note
* simplify note
* add link in reStructuredText
* minor fix
* update tag

Yuwen Hu | b3027e2d60 | 2024-05-17 10:33:43 +08:00
Update for cpu install option in performance tests (#11060)

Xiangyu Tian | d963e95363 | 2024-05-17 10:14:00 +08:00
LLM: Modify CPU Installation Command for documentation (#11042)
* init
* refine
* refine
* refine
* refine comments

Yuwen Hu | fff067d240 | 2024-05-17 10:11:01 +08:00
Make install ut for cpu exactly the same as what we want for users (#11051)

Ruonan Wang | 3a72e5df8c | 2024-05-17 10:10:16 +08:00
disable mlp fusion of fp6 on mtl (#11059)

SONG Ge | 192ae35012 | 2024-05-16 22:23:39 +08:00
Add support for llama2 quantize_kv with transformers 4.38.0 (#11054)
* add support for llama2 quantize_kv with transformers 4.38.0
* fix code style
* fix code style

SONG Ge | 16b2a418be | 2024-05-16 17:15:37 +08:00
hotfix native_sdp ut (#11046)
* hotfix native_sdp
* update

Xin Qiu | 6be70283b7 | 2024-05-16 15:39:18 +08:00
fix chatglm run error (#11045)
* fix chatglm
* update
* fix style

Yishuo Wang | 8cae897643 | 2024-05-16 15:12:35 +08:00
use new rope in phi3 (#11047)

Wang, Jian4 | 00d4410746 | 2024-05-16 14:55:13 +08:00
Update cpp docker quickstart (#11040)
* add sample output
* update link
* update
* update header
* update

Shaojun Liu | c62e828281 | 2024-05-16 11:10:10 +08:00
Create release-ipex-llm.yaml (#11039)

Qiyuan Gong | 4638682140 | 2024-05-16 10:48:02 +08:00
Fix xpu finetune image path in action (#11037)
* Fix xpu finetune image path in action

Jin Qiao | 9a96af4232 | 2024-05-16 10:46:29 +08:00
Remove oneAPI pip install command in related examples (#11030)
* Remove pip install command in windows installation guide
* fix chatglm3 installation guide
* Fix gemma cpu example
* Apply on other examples
* fix

Xiangyu Tian | 612a365479 | 2024-05-16 10:39:55 +08:00
LLM: Install CPU version torch with extras [all] (#10868)
Modify setup.py to install CPU version torch with extras [all]