ipex-llm

Author	SHA1	Message	Date
Jiao Wang	d1eaea509f	update chatglm readme (#10659 )	2024-04-09 14:24:46 -07:00
Jiao Wang	878a97077b	Fix llava example to support transformerds 4.36 (#10614 ) * fix llava example * update	2024-04-09 13:47:07 -07:00
Jiao Wang	1e817926ba	Fix low memory generation example issue in transformers 4.36 (#10702 ) * update cache in low memory generate * update	2024-04-09 09:56:52 -07:00
Shengsheng Huang	6e7da0d92c	small fix in document	2024-04-09 23:04:26 +08:00
Shengsheng Huang	8924dbc3f9	revise open webui quickstart and some indexes (#10715 ) * update readme * update openwebui readme and update index	2024-04-09 22:44:03 +08:00
Yuwen Hu	a0244527aa	Small updates to langchain-chatchat quickstart readme (#10714 )	2024-04-09 19:37:41 +08:00
Yuwen Hu	fde6ab50d0	Further fix to python 3.11 document (#10712 )	2024-04-09 19:13:01 +08:00
yb-peng	447f48499a	Init commit of open-webui quickstart (#10682 ) * init commit of open-webui quickstart * add links into open-webui quickstart * Update open_webui_with_ollama_quickstart.md	2024-04-09 18:21:42 +08:00
Yuwen Hu	97db2492c8	Update setup.py for `bigdl-core-xe-esimd-21` on Windows (#10705 ) * Support bigdl-core-xe-esimd-21 for windows in setup.py * Update setup-llm-env accordingly	2024-04-09 18:21:21 +08:00
Zhicun	b4147a97bb	Fix dtype mismatch error (#10609 ) * fix llama * fix * fix code style * add torch type in model.py --------- Co-authored-by: arda <arda@arda-arc19.sh.intel.com>	2024-04-09 17:50:33 +08:00
Shaojun Liu	f37a1f2a81	Upgrade to python 3.11 (#10711 ) * create conda env with python 3.11 * recommend to use Python 3.11 * update	2024-04-09 17:41:17 +08:00
Yishuo Wang	8f45e22072	fix llama2 (#10710 )	2024-04-09 17:28:37 +08:00
Shaojun Liu	e10040b7f1	upgrade to python 3.11 (#10695 )	2024-04-09 17:04:42 +08:00
Yishuo Wang	e438f941f2	disable rwkv5 fp16 (#10699 )	2024-04-09 16:42:11 +08:00
Cengguang Zhang	6a32216269	LLM: add llama2 8k input example. (#10696 ) * LLM: add llama2-32K example. * refactor name. * fix comments. * add IPEX_LLM_LOW_MEM notes and update sample output.	2024-04-09 16:02:37 +08:00
Wenjing Margaret Mao	289cc99cd6	Update README.md (#10700 ) Edit "summarize the results"	2024-04-09 16:01:12 +08:00
Jason Dai	3e4fbee87c	Update readme & quickstart (#10685 )	2024-04-09 15:59:17 +08:00
Ikko Eltociear Ashimine	39ff586454	docs: update README.md (#10662 ) inital -> initial	2024-04-09 15:55:57 +08:00
Wenjing Margaret Mao	d3116de0db	Update README.md (#10701 ) edit "summarize the results"	2024-04-09 15:50:25 +08:00
Chen, Zhentao	d59e0cce5c	Migrate harness to ipexllm (#10703 ) * migrate to ipexlm * fix workflow * fix run_multi * fix precision map * rename ipexlm to ipexllm * rename bigdl to ipex in comments	2024-04-09 15:48:53 +08:00
yb-peng	8cf26d8d08	Update ollama_quickstart.md (#10708 )	2024-04-09 15:47:41 +08:00
Keyan (Kyrie) Zhang	1e27e08322	Modify example from fp32 to fp16 (#10528 ) * Modify example from fp32 to fp16 * Remove Falcon from fp16 example for now * Remove MPT from fp16 example	2024-04-09 15:45:49 +08:00
binbin Deng	44922bb5c2	LLM: support baichuan2-13b using AutoTP (#10691 )	2024-04-09 14:06:01 +08:00
Yina Chen	c7422712fc	mistral 4.36 use fp16 sdp (#10704 )	2024-04-09 13:50:33 +08:00
Ovo233	dcb2038aad	Enable optimization for sentence_transformers (#10679 ) * enable optimization for sentence_transformers * fix python style check failure	2024-04-09 12:33:46 +08:00
Zhicun	f03c029914	pydantic version>=2.0.0 for llamaindex (#10694 ) * pydantic version * pydantic version * upgrade version	2024-04-09 09:48:42 +08:00
Yang Wang	5a1f446d3c	support fp8 in xetla (#10555 ) * support fp8 in xetla * change name * adjust model file * support convert back to cpu * factor * fix bug * fix style	2024-04-08 13:22:09 -07:00
Cengguang Zhang	7c43ac0164	LLM: optimize llama natvie sdp for split qkv tensor (#10693 ) * LLM: optimize llama natvie sdp for split qkv tensor. * fix block real size. * fix comment. * fix style. * refactor.	2024-04-08 17:48:11 +08:00
Xin Qiu	1274cba79b	stablelm fp8 kv cache (#10672 ) * stablelm fp8 kvcache * update * fix * change to fp8 matmul * fix style * fix * fix * meet code review * add comment	2024-04-08 15:16:46 +08:00
Yishuo Wang	65127622aa	fix UT threshold (#10689 )	2024-04-08 14:58:20 +08:00
Cengguang Zhang	c0cd238e40	LLM: support llama2 8k input with w4a16. (#10677 ) * LLM: support llama2 8k input with w4a16. * fix comment and style. * fix style. * fix comments and split tensor to quantized attention forward. * fix style. * refactor name. * fix style. * fix style. * fix style. * refactor checker name. * refactor native sdp split qkv tensor name. * fix style. * fix comment rename variables. * fix co-exist of intermedia results.	2024-04-08 11:43:15 +08:00
Shaojun Liu	db7c5cb78f	update model path for spr perf test (#10687 ) * update model path for spr perf test * revert	2024-04-08 10:21:56 +08:00
Zhicun	321bc69307	Fix llamaindex ut (#10673 ) * fix llamaindex ut * add GPU ut	2024-04-08 09:47:51 +08:00
Keyan (Kyrie) Zhang	a11b708135	Modify the .md link in chatchat readthedoc (#10681 )	2024-04-07 16:33:32 +08:00
yb-peng	2d88bb9b4b	add test api transformer_int4_fp16_gpu (#10627 ) * add test api transformer_int4_fp16_gpu * update config.yaml and README.md in all-in-one * modify run.py in all-in-one * re-order test-api * re-order test-api in config * modify README.md in all-in-one * modify README.md in all-in-one * modify config.yaml --------- Co-authored-by: pengyb2001 <arda@arda-arc21.sh.intel.com> Co-authored-by: ivy-lv11 <zhicunlv@gmail.com>	2024-04-07 15:47:17 +08:00
Wang, Jian4	47cabe8fcc	LLM: Fix no return_last_logit running bigdl_ipex chatglm3 (#10678 ) * fix no return_last_logits * update only for chatglm	2024-04-07 15:27:58 +08:00
Shengsheng Huang	33f90beda0	fix quickstart docs (#10676 )	2024-04-07 14:26:59 +08:00
Wang, Jian4	9ad4b29697	LLM: CPU benchmark using tcmalloc (#10675 )	2024-04-07 14:17:01 +08:00
binbin Deng	d9a1153b4e	LLM: upgrade deepspeed in AutoTP on GPU (#10647 )	2024-04-07 14:05:19 +08:00
Jin Qiao	56dfcb2ade	Migrate portable zip to ipex-llm (#10617 ) * change portable zip prompt to ipex-llm * fix chat with ui * add no proxy	2024-04-07 13:58:58 +08:00
Zhicun	9d8ba64c0d	Llamaindex: add tokenizer_id and support chat (#10590 ) * add tokenizer_id * fix * modify * add from_model_id and from_mode_id_low_bit * fix typo and add comment * fix python code style --------- Co-authored-by: pengyb2001 <284261055@qq.com>	2024-04-07 13:51:34 +08:00
Jin Qiao	10ee786920	Replace with IPEX-LLM in example comments (#10671 ) * Replace with IPEX-LLM in example comments * More replacement * revert some changes	2024-04-07 13:29:51 +08:00
Xiangyu Tian	08018a18df	Remove not-imported MistralConfig (#10670 )	2024-04-07 10:32:05 +08:00
Cengguang Zhang	1a9b8204a4	LLM: support int4 fp16 chatglm2-6b 8k input. (#10648 )	2024-04-07 09:39:21 +08:00
Jason Dai	ab87b6ab21	Update readme (#10669 )	2024-04-07 09:13:45 +08:00
Jiao Wang	69bdbf5806	Fix vllm print error message issue (#10664 ) * update chatglm readme * Add condition to invalidInputError * update * update * style	2024-04-05 15:08:13 -07:00
Jason Dai	29d97e4678	Update readme (#10665 )	2024-04-05 18:01:57 +08:00
Yang Wang	ac65ab65c6	Update llama_cpp_quickstart.md (#10663 )	2024-04-04 11:00:50 -07:00
Jason Dai	6699d86192	Update index.rst (#10660 )	2024-04-04 20:37:33 +08:00
Tom Aarsen	8abf4da1bc	README: Fix typo: tansformers -> transformers (#10657 )	2024-04-04 08:54:48 +08:00

2570 commits