Qiyuan Gong
120a0035ac
Fix type mismatch in eval for Baichuan2 QLora example ( #11117 )
...
* During the evaluation stage, Baichuan2 raises a type mismatch error when training with bfloat16. Fix this by modifying modeling_baichuan.py, and add documentation on how to modify this file.
2024-05-24 14:14:30 +08:00
Qiyuan Gong
21a1a973c1
Remove axolotl and python3-blinker ( #11127 )
...
* Remove axolotl from image to reduce image size.
* Remove python3-blinker to avoid axolotl lib conflict.
2024-05-24 13:54:19 +08:00
Yishuo Wang
1db9d9a63b
optimize internlm2 xcomposer again ( #11124 )
2024-05-24 13:44:52 +08:00
Yishuo Wang
9372ce87ce
fix internlm xcomposer2 fp16 ( #11123 )
2024-05-24 11:03:31 +08:00
Cengguang Zhang
011b9faa5c
LLM: unify baichuan2-13b alibi mask dtype with model dtype. ( #11107 )
...
* LLM: unify alibi mask dtype.
* fix comments.
2024-05-24 10:27:53 +08:00
Jiao Wang
0a06a6e1d4
Update tests for transformers 4.36 ( #10858 )
...
* update unit test
* update
* update
* update
* update
* update
* fix gpu attention test
* update
* update
* update
* update
* update
* update
* update example test
* replace replit code
* update
* update
* update
* update
* set safe_serialization false
* perf test
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* delete
* update
* update
* update
* update
* update
* update
* revert
* update
2024-05-24 10:26:38 +08:00
Xiangyu Tian
1291165720
LLM: Add quickstart for vLLM cpu ( #11122 )
...
Add quickstart for vLLM cpu.
2024-05-24 10:21:21 +08:00
Wang, Jian4
1443b802cc
Docker: Fix building cpp_docker and remove unimportant dependencies ( #11114 )
...
* test build
* update
2024-05-24 09:49:44 +08:00
Xiangyu Tian
b3f6faa038
LLM: Add CPU vLLM entrypoint ( #11083 )
...
Add CPU vLLM entrypoint and update CPU vLLM serving example.
2024-05-24 09:16:59 +08:00
Shengsheng Huang
7ed270a4d8
update readme docker section, fix quickstart title, remove chs figure ( #11044 )
...
* update readme and fix quickstart title, remove chs figure
* update readme according to comment
* reorganize the docker guide structure
2024-05-24 00:18:20 +08:00
Yishuo Wang
797dbc48b8
fix phi-2 and phi-3 convert ( #11116 )
2024-05-23 17:37:37 +08:00
Yishuo Wang
37b98a531f
support running internlm xcomposer2 on gpu and add sdp optimization ( #11115 )
2024-05-23 17:26:24 +08:00
Zhao Changmin
c5e8b90c8d
Add Qwen register attention implementation ( #11110 )
...
* qwen_register
2024-05-23 17:17:45 +08:00
Yishuo Wang
0e53f20edb
support running internlm-xcomposer2 on cpu ( #11111 )
2024-05-23 16:36:09 +08:00
Shaojun Liu
e0f401d97d
FIX: APT Repository not working (signatures invalid) ( #11112 )
...
* chmod 644 gpg key
* chmod 644 gpg key
2024-05-23 16:15:45 +08:00
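The `chmod 644 gpg key` step from the commit above can be sketched in Python. This is only an illustration of the permission change, assuming the signature failure came from the key file not being readable by the user that APT verifies signatures as. The helper name `make_world_readable` and the temporary file standing in for the key are hypothetical and do not come from the commit.

```python
import os
import stat
import tempfile


def make_world_readable(path):
    """Set mode 644 (rw-r--r--) on path and return the resulting permission bits."""
    os.chmod(path, 0o644)
    return stat.S_IMODE(os.stat(path).st_mode)


# Hypothetical stand-in for a downloaded GPG key file.
with tempfile.NamedTemporaryFile(delete=False) as f:
    key_path = f.name

# Simulate a key that was written with owner-only permissions.
os.chmod(key_path, 0o600)

# Apply the same permission change the commit describes.
mode = make_world_readable(key_path)
print(oct(mode))

os.remove(key_path)
```

In a Dockerfile or shell script, the equivalent is a plain `chmod 644` on the key file after it is downloaded.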
Yuwen Hu
d36b41d59e
Add setuptools limitation for ipex-llm[xpu] ( #11102 )
...
* Add setuptools limitation for ipex-llm[xpu]
* llamaindex option update
2024-05-22 18:20:30 +08:00
Yishuo Wang
cd4dff09ee
support phi-3 vision ( #11101 )
2024-05-22 17:43:50 +08:00
Zhao Changmin
15d906a97b
Update linux igpu run script ( #11098 )
...
* update run script
2024-05-22 17:18:07 +08:00
Kai Huang
f63172ef63
Align ppl with llama.cpp ( #11055 )
...
* update script
* remove
* add header
* update readme
2024-05-22 16:43:11 +08:00
Qiyuan Gong
f6c9ffe4dc
Add WANDB_MODE and HF_HUB_OFFLINE to XPU finetune README ( #11097 )
...
* Add WANDB_MODE=offline to avoid multi-GPUs finetune errors.
* Add HF_HUB_OFFLINE=1 to avoid Hugging Face related errors.
2024-05-22 15:20:53 +08:00
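The two environment variables named in the commit above can also be set from Python before the training libraries are imported. This is a minimal sketch, not the README's own wording: the helper name `set_offline_env` is hypothetical, and the README itself may set these variables in the shell instead.

```python
import os


def set_offline_env(env=None):
    """Set the two variables from the commit on env (defaults to os.environ)."""
    env = os.environ if env is None else env
    # Keep Weights & Biases from contacting its server during the run.
    env["WANDB_MODE"] = "offline"
    # Make the Hugging Face Hub client use only locally cached files.
    env["HF_HUB_OFFLINE"] = "1"
    return env


# Call this before importing wandb or transformers so that the
# libraries see the values when they first read the environment.
set_offline_env()
print(os.environ["WANDB_MODE"], os.environ["HF_HUB_OFFLINE"])
```

Setting the variables at the top of the script keeps the configuration next to the code, while exporting them in the shell applies them to every process launched for a multi-GPU run.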
Yuwen Hu
1c5ed9b6cf
Fix arc ut ( #11096 )
2024-05-22 14:13:13 +08:00
Guancheng Fu
4fd1df9cf6
Add toc for docker quickstarts ( #11095 )
...
* fix
* fix
2024-05-22 11:23:22 +08:00
Shaojun Liu
584439e498
update homepage url for ipex-llm ( #11094 )
...
* update homepage url
* Update python version to 3.11
* Update long description
2024-05-22 11:10:44 +08:00
Zhao Changmin
bf0f904e66
Update level_zero on MTL linux ( #11085 )
...
* Update level_zero on MTL
---------
Co-authored-by: Yuwen Hu <yuwen.hu@intel.com>
2024-05-22 11:01:56 +08:00
Shaojun Liu
8fdc8fb197
Quickstart: Run/Develop PyTorch in VSCode with Docker on Intel GPU ( #11070 )
...
* add quickstart: Run/Develop PyTorch in VSCode with Docker on Intel GPU
* add gif
* update index.rst
* update link
* update GIFs
2024-05-22 09:29:42 +08:00
Xin Qiu
71bcd18f44
fix qwen vl ( #11090 )
2024-05-21 18:40:29 +08:00
Guancheng Fu
f654f7e08c
Add serving docker quickstart ( #11072 )
...
* add temp file
* add initial docker readme
* temp
* done
* add fastchat service
* fix
* fix
* fix
* fix
* remove stale file
2024-05-21 17:00:58 +08:00
Yishuo Wang
f00625f9a4
refactor qwen2 ( #11087 )
2024-05-21 16:53:42 +08:00
Qiyuan Gong
492ed3fd41
Add verified models to GPU finetune README ( #11088 )
...
* Add verified models to GPU finetune README
2024-05-21 15:49:15 +08:00
Qiyuan Gong
1210491748
ChatGLM3, Baichuan2 and Qwen1.5 QLoRA example ( #11078 )
...
* Add chatglm3, qwen15-7b and baichuan-7b QLoRA alpaca example
* Remove unnecessary tokenization setting.
2024-05-21 15:29:43 +08:00
binbin Deng
ecb16dcf14
Add deepspeed autotp support for xpu docker ( #11077 )
2024-05-21 14:49:54 +08:00
ZehuaCao
842d6dfc2d
Further Modify CPU example ( #11081 )
...
* modify CPU example
* update
2024-05-21 13:55:47 +08:00
Yishuo Wang
d830a63bb7
refactor qwen ( #11074 )
2024-05-20 18:08:37 +08:00
Wang, Jian4
74950a152a
Fix tgi_api_server error file name ( #11075 )
2024-05-20 16:48:40 +08:00
Yishuo Wang
4e97047d70
fix baichuan2 13b fp16 ( #11071 )
2024-05-20 11:21:20 +08:00
binbin Deng
7170dd9192
Update guide for running qwen with AutoTP ( #11065 )
2024-05-20 10:53:17 +08:00
Wang, Jian4
a2e1578fd9
Merge tgi_api_server to main ( #11036 )
...
* init
* fix style
* speculative cannot use benchmark
* add tgi server readme
2024-05-20 09:15:03 +08:00
Yuwen Hu
f60565adc7
Fix toc for vllm serving quickstart ( #11068 )
2024-05-17 17:12:48 +08:00
Guancheng Fu
dfac168d5f
fix format/typo ( #11067 )
2024-05-17 16:52:17 +08:00
Yishuo Wang
31ce3e0c13
refactor baichuan2-13b ( #11064 )
2024-05-17 16:25:30 +08:00
Guancheng Fu
67db925112
Add vllm quickstart ( #10978 )
...
* temp
* add doc
* finish
* done
* fix
* add initial docker readme
* temp
* done fixing vllm_quickstart
* done
* remove not used file
* add
* fix
2024-05-17 16:16:42 +08:00
ZehuaCao
56cb992497
LLM: Modify CPU Installation Command for most examples ( #11049 )
...
* init
* refine
* refine
* refine
* modify hf-agent example
* modify all CPU model example
* remove readthedoc modify
* replace powershell with cmd
* fix repo
* fix repo
* update
* remove comment on windows code block
* update
* update
* update
* update
---------
Co-authored-by: xiangyuT <xiangyu.tian@intel.com>
2024-05-17 15:52:20 +08:00
Ruonan Wang
f1156e6b20
support gguf_q4k_m / gguf_q4k_s ( #10887 )
...
* initial commit
* UPDATE
* fix style
* fix style
* add gguf_q4k_s
* update comment
* fix
2024-05-17 14:30:09 +08:00
Yishuo Wang
981d668be6
refactor baichuan2-7b ( #11062 )
2024-05-17 13:01:34 +08:00
Shaojun Liu
84239d0bd3
Update docker image tags in Docker Quickstart ( #11061 )
...
* update docker image tag to latest
* add note
* simplify note
* add link in reStructuredText
* minor fix
* update tag
2024-05-17 11:06:11 +08:00
Yuwen Hu
b3027e2d60
Update for cpu install option in performance tests ( #11060 )
2024-05-17 10:33:43 +08:00
Xiangyu Tian
d963e95363
LLM: Modify CPU Installation Command for documentation ( #11042 )
...
* init
* refine
* refine
* refine
* refine comments
2024-05-17 10:14:00 +08:00
Yuwen Hu
fff067d240
Make install ut for cpu exactly the same as what we want for users ( #11051 )
2024-05-17 10:11:01 +08:00
Ruonan Wang
3a72e5df8c
disable mlp fusion of fp6 on mtl ( #11059 )
2024-05-17 10:10:16 +08:00
SONG Ge
192ae35012
Add support for llama2 quantize_kv with transformers 4.38.0 ( #11054 )
...
* add support for llama2 quantize_kv with transformers 4.38.0
* fix code style
* fix code style
2024-05-16 22:23:39 +08:00