ipex-llm

Author	SHA1	Message	Date
Yishuo Wang	a47989c860	optimize yuan 2.0 performance (#10244 )	2024-02-26 17:20:10 +08:00
Wang, Jian4	6c74b99a28	LLM: Update qwen readme (#10245 )	2024-02-26 17:03:09 +08:00
hxsz1997	15ad2fd72e	Merge pull request #10226 from zhentaocc/fix_harness Fix harness	2024-02-26 16:49:27 +08:00
Wang, Jian4	f9b75f900b	LLM: Enable qwen target_model ipex (#10232 ) * change order * enable qwen ipex * update qwen example * update * fix style * update	2024-02-26 16:41:12 +08:00
Jin Qiao	3e6d188553	LLM: add baichuan2-13b to mtl perf (#10238 )	2024-02-26 15:55:56 +08:00
Yuwen Hu	e38e29511c	[LLM] Yuan2 MLP and Rotary optimization (#10231 ) * Add optimization for rotary embedding * Add mlp fused optimizatgion * Python style fix * Fix rotary embedding due to logits difference * Small fix	2024-02-26 15:10:08 +08:00
Chen, Zhentao	5ad752bae8	Separate llmcpp build of linux and windows (#10136 ) * separate linux window llmcpp build * harness run on linux only * fix platform * skip error * change to linux only build * add judgement of platform * add download args * remove ||true	2024-02-26 15:04:29 +08:00
Ziteng Zhang	ea23afc8ec	[LLM]update ipex part in mistral example readme (#10239 ) * update ipex part in mistral example readme	2024-02-26 14:35:20 +08:00
Chen, Zhentao	62350a36f0	fix if in update html	2024-02-26 13:39:59 +08:00
Zhicun	7c236e4c6d	quick start for windows with gpu (#10221 ) * quick start for windows igpu * Update install_windows_gpu.md * Update install_windows_gpu.md * Update install_windows_gpu.md * Update install_windows_gpu.md * Update install_windows_gpu.md * Update install_windows_gpu.md * update the demo.py * Update install_windows_gpu.md * Update install_windows_gpu.md * fix image position typo * Update install_windows_gpu.md * update pip install command --------- Co-authored-by: Shengsheng Huang <shannie.huang@gmail.com>	2024-02-26 12:19:36 +08:00
SONG Ge	df2f3885ba	[LLM] Enable kv_cache and forward_qkv optimizations for yuan2 (#10225 ) * add init kv_cache support for yuan2 * add forward qkv in yuan	2024-02-26 11:29:48 +08:00
Xiangyu Tian	85a99e13e8	LLM: Fix ChatGLM3 Speculative Example (#10236 ) Fix ChatGLM3 Speculative Example.	2024-02-26 10:57:28 +08:00
Yuxuan Xia	0c6aef0f47	Add einops dependency for C-Eval (#10234 ) * Add c-eval workflow and modify running files * Modify the chatglm evaluator file * Modify the ceval workflow for triggering test * Modify the ceval workflow file * Modify the ceval workflow file * Modify ceval workflow * Adjust the ceval dataset download * Add ceval workflow dependencies * Modify ceval workflow dataset download * Add ceval test dependencies * Add ceval test dependencies * Correct the result print * Fix the nightly test trigger time * Fix ChatGLM loading issue * Add einops dependency	2024-02-26 10:13:10 +08:00
Chen, Zhentao	213ef06691	fix readme	2024-02-24 00:38:08 +08:00
Chen, Zhentao	85d13c65de	run one job only if triggered by pr	2024-02-24 00:33:33 +08:00
Chen, Zhentao	a55cc91e1f	fix make_csv.py	2024-02-23 20:25:46 +08:00
Ruonan Wang	28513f3978	LLM: support fp16 embedding & add mlp fusion for iq2_xxs (#10219 ) * add fp16 embed * small fixes * fix style * fix style * fix comment	2024-02-23 17:26:24 +08:00
Yuwen Hu	eeecd9fc08	Python style fix (#10230 )	2024-02-23 17:21:23 +08:00
Chen, Zhentao	a204337cad	Rename results	2024-02-23 17:12:37 +08:00
Chen, Zhentao	4fdf96dc8b	fix ACC_FOLDER	2024-02-23 17:11:03 +08:00
Yuwen Hu	e511bbd8f1	[LLM] Add basic optimization framework for Yuan2 (#10227 ) * Add basic optimization framework for Yuan2 * Small fix * Python style fix * Small fix * Small fix	2024-02-23 17:05:00 +08:00
Xin Qiu	8ef5482da2	update Gemma readme (#10229 ) * Update README.md * Update README.md * Update README.md * Update README.md	2024-02-23 16:57:08 +08:00
Chen, Zhentao	e838ec9e14	remove dependency	2024-02-23 16:33:40 +08:00
Chen, Zhentao	88f7f56980	rewrite html visualization	2024-02-23 16:33:39 +08:00
Chen, Zhentao	6fe5344fa6	separate make_csv from the file	2024-02-23 16:33:38 +08:00
Chen, Zhentao	bfa98666a6	fall back to make_table.py	2024-02-23 16:33:38 +08:00
Chen, Zhentao	02cb96e7f6	fix Run Harness job	2024-02-23 16:33:37 +08:00
Chen, Zhentao	e1fcf54a0c	reformat	2024-02-23 16:33:36 +08:00
Chen, Zhentao	5399343adc	fix harness installation	2024-02-23 16:33:35 +08:00
Chen, Zhentao	9c8e349196	remove harness job output	2024-02-23 16:33:34 +08:00
Chen, Zhentao	8472de90e8	use stable lm to test pr	2024-02-23 16:33:34 +08:00
Ziteng Zhang	e08c74f1d1	Fix build error of bigdl-llm-cpu (#10228 )	2024-02-23 16:30:21 +08:00
Ruonan Wang	19260492c7	LLM: fix action/installation error of mpmath (#10223 ) * fix * test * fix * update	2024-02-23 16:14:53 +08:00
Xin Qiu	aabfc06977	add gemma example (#10224 ) * add gemma gpu example * Update README.md * add cpu example * Update README.md * Update README.md * Update generate.py * Update generate.py	2024-02-23 15:20:57 +08:00
Ziteng Zhang	f7e2591f15	[LLM] change IPEX230 to IPEX220 in dockerfile (#10222 ) * change IPEX230 to IPEX220 in dockerfile	2024-02-23 15:02:08 +08:00
yb-peng	a2c1675546	Add CPU and GPU examples for Yuan2-2B-hf (#9946 ) * Add a new CPU example of Yuan2-2B-hf * Add a new CPU generate.py of Yuan2-2B-hf example * Add a new GPU example of Yuan2-2B-hf * Add Yuan2 to README table * In CPU example:1.Use English as default prompt; 2.Provide modified files in yuan2-2B-instruct * In GPU example:1.Use English as default prompt;2.Provide modified files * GPU example:update README * update Yuan2-2B-hf in README table * Add CPU example for Yuan2-2B in Pytorch-Models * Add GPU example for Yuan2-2B in Pytorch-Models * Add license in generate.py; Modify README * In GPU Add license in generate.py; Modify README * In CPU yuan2 modify README * In GPU yuan2 modify README * In CPU yuan2 modify README * In GPU example, updated the readme for Windows GPU supports * In GPU torch example, updated the readme for Windows GPU supports * GPU hf example README modified * GPU example README modified	2024-02-23 14:09:30 +08:00
yb-peng	f1f4094a09	Add CPU and GPU examples of phi-2 (#10014 ) * Add CPU and GPU examples of phi-2 * In GPU hf example, updated the readme for Windows GPU supports * In GPU torch example, updated the readme for Windows GPU supports * update the table in BigDL/README.md * update the table in BigDL/python/llm/README.md	2024-02-23 14:05:53 +08:00
Jason Dai	40584dec6d	Update readme (#10214 )	2024-02-23 11:42:16 +08:00
Chen, Zhentao	f315c7f93a	Move harness nightly related files to llm/test folder (#10209 ) * move harness nightly files to test folder * change workflow file path accordingly * use arc01 when pr * fix path * fix fp16 csv path	2024-02-23 11:12:36 +08:00
Xin Qiu	30795bdfbc	Gemma optimization: rms_norm, kv_cache, fused_rope, fused_rope+qkv (#10212 ) * gemma optimization * update * update * fix style * meet code review	2024-02-23 10:07:24 +08:00
Guoqiong Song	63681af97e	falcon for transformers 4.36 (#9960 ) * falcon for transformers 4.36	2024-02-22 17:04:40 -08:00
Jason Dai	84d5f40936	Update README.md (#10213 )	2024-02-22 17:22:59 +08:00
Yina Chen	ce5840a8b7	GPT-J rope optimization on xpu (#10182 ) * optimize * update * fix style & move use_fuse_rope * add ipex version check * fix style * update * fix style * meet comments * address comments * fix style	2024-02-22 16:25:12 +08:00
Xiangyu Tian	f445217d02	LLM: Update IPEX to 2.2.0+cpu and Refactor for _ipex_optimize (#10189 ) Update IPEX to 2.2.0+cpu and refactor for _ipex_optimize.	2024-02-22 16:01:11 +08:00
Heyang Sun	c876d9b5ca	Support for MPT rotary embedding (#10208 )	2024-02-22 15:16:31 +08:00
Ruonan Wang	5e1fee5e05	LLM: add GGUF-IQ2 examples (#10207 ) * add iq2 examples * small fix * meet code review * fix * meet review * small fix	2024-02-22 14:18:45 +08:00
Yuwen Hu	21de2613ce	[LLM] Add model loading time record for all-in-one benchmark (#10201 ) * Add model loading time record in csv for all-in-one benchmark * Small fix * Small fix to number after .	2024-02-22 13:57:18 +08:00
Ovo233	60e11b6739	LLM: Add mlp layer unit tests (#10200 ) * add mlp layer unit tests * add download baichuan-13b * exclude llama for now * install additional packages * rename bash file * switch to Baichuan2 * delete attention related code * fix name errors in yml file	2024-02-22 13:44:45 +08:00
SONG Ge	ca1166a0e5	[LLM] Add quantize kv_cache for Baichuan2-13B (#10203 ) * add quantize kv_cache for baichuan2-13b * style fix	2024-02-22 13:43:35 +08:00
Ruonan Wang	34ee1aa91f	LLM: add esimd sdp support for chatglm3 (#10205 ) * add esimd sdp support * fix style	2024-02-22 13:37:16 +08:00


2453 commits