ipex-llm

Author	SHA1	Message	Date
binbin Deng	d9a1153b4e	LLM: upgrade deepspeed in AutoTP on GPU (#10647 )	2024-04-07 14:05:19 +08:00
Jin Qiao	56dfcb2ade	Migrate portable zip to ipex-llm (#10617 ) * change portable zip prompt to ipex-llm * fix chat with ui * add no proxy	2024-04-07 13:58:58 +08:00
Zhicun	9d8ba64c0d	Llamaindex: add tokenizer_id and support chat (#10590 ) * add tokenizer_id * fix * modify * add from_model_id and from_mode_id_low_bit * fix typo and add comment * fix python code style --------- Co-authored-by: pengyb2001 <284261055@qq.com>	2024-04-07 13:51:34 +08:00
Jin Qiao	10ee786920	Replace with IPEX-LLM in example comments (#10671 ) * Replace with IPEX-LLM in example comments * More replacement * revert some changes	2024-04-07 13:29:51 +08:00
Xiangyu Tian	08018a18df	Remove not-imported MistralConfig (#10670 )	2024-04-07 10:32:05 +08:00
Cengguang Zhang	1a9b8204a4	LLM: support int4 fp16 chatglm2-6b 8k input. (#10648 )	2024-04-07 09:39:21 +08:00
Jason Dai	ab87b6ab21	Update readme (#10669 )	2024-04-07 09:13:45 +08:00
Jiao Wang	69bdbf5806	Fix vllm print error message issue (#10664 ) * update chatglm readme * Add condition to invalidInputError * update * update * style	2024-04-05 15:08:13 -07:00
Jason Dai	29d97e4678	Update readme (#10665 )	2024-04-05 18:01:57 +08:00
Yang Wang	ac65ab65c6	Update llama_cpp_quickstart.md (#10663 )	2024-04-04 11:00:50 -07:00
Jason Dai	6699d86192	Update index.rst (#10660 )	2024-04-04 20:37:33 +08:00
Tom Aarsen	8abf4da1bc	README: Fix typo: tansformers -> transformers (#10657 )	2024-04-04 08:54:48 +08:00
Xin Qiu	4c3e493b2d	fix stablelm2 1.6b (#10656 ) * fix stablelm2 1.6b * meet code review	2024-04-03 22:15:32 +08:00
Shengsheng Huang	22f09f618a	update the video demo (#10655 )	2024-04-03 20:51:01 +08:00
Jason Dai	7c08d83d9e	Update quickstart (#10654 )	2024-04-03 20:43:22 +08:00
Shengsheng Huang	f84e72e7af	revise ollama quickstart (#10653 )	2024-04-03 20:35:34 +08:00
yb-peng	f789c2eee4	add ollama quickstart (#10649 ) Co-authored-by: arda <arda@arda-arc12.sh.intel.com>	2024-04-03 19:33:39 +08:00
Shengsheng Huang	1ae519ec69	add langchain-chatchat quickstart (#10652 )	2024-04-03 19:23:09 +08:00
Shengsheng Huang	45437ddc9a	update indexes, move some sections in coding quickstart to webui (#10651 )	2024-04-03 18:18:49 +08:00
Shengsheng Huang	c26e06d5cf	update coding quickstart and webui quickstart for warmup note (#10650 )	2024-04-03 17:18:28 +08:00
Yuwen Hu	5b096c39a6	Change style for video rendering (#10646 )	2024-04-03 16:31:02 +08:00
Jin Qiao	cc8b3be11c	Add GPU and CPU example for stablelm-zephyr-3b (#10643 ) * Add example for StableLM * fix * add to readme	2024-04-03 16:28:31 +08:00
Heyang Sun	6000241b10	Add Deepspeed Example of FLEX Mistral (#10640 )	2024-04-03 16:04:17 +08:00
Shaojun Liu	d18dbfb097	update spr perf test (#10644 )	2024-04-03 15:53:55 +08:00
Heyang Sun	4f6df37805	fix wrong cpu core num seen by docker (#10645 )	2024-04-03 15:52:25 +08:00
Yishuo Wang	702e686901	optimize starcoder normal kv cache (#10642 )	2024-04-03 15:27:02 +08:00
Xin Qiu	3a9ab8f1ae	fix stablelm logits diff (#10636 ) * fix logits diff * Small fixes --------- Co-authored-by: Yuwen Hu <yuwen.hu@intel.com>	2024-04-03 15:08:12 +08:00
Ovo233	97c626d76f	add continue quickstart (#10610 ) Co-authored-by: Shengsheng Huang <shengsheng.huang@intel.com>	2024-04-03 14:50:11 +08:00
Zhicun	b827f534d5	Add tokenizer_id in Langchain (#10588 ) * fix low-bit * fix * fix style --------- Co-authored-by: arda <arda@arda-arc12.sh.intel.com>	2024-04-03 14:25:35 +08:00
Zhicun	f6fef09933	fix prompt format for llama-2 in langchain (#10637 )	2024-04-03 14:17:34 +08:00
Jiao Wang	330d4b4f4b	update readme (#10631 )	2024-04-02 23:08:02 -07:00
Kai Huang	c875b3c858	Add seq len check for llama softmax upcast to fp32 (#10629 )	2024-04-03 12:05:13 +08:00
Shaojun Liu	1aef3bc0ab	verify and refine ipex-llm-finetune-qlora-xpu docker document (#10638 ) * verify and refine finetune-xpu document * update export_merged_model.py link * update link	2024-04-03 11:33:13 +08:00
Shaojun Liu	0779ca3db0	Bump ossf/scorecard-action to v2.3.1 (#10639 ) * Bump ossf/scorecard-action to v2.3.1 * revert	2024-04-03 11:14:18 +08:00
Jiao Wang	4431134ec5	update readme (#10632 )	2024-04-02 19:54:30 -07:00
Shaojun Liu	dfcf08c58a	update ossf/scorecard-action to fix TUF invalid key bug (#10635 )	2024-04-03 09:55:32 +08:00
Jiao Wang	23e33a0ca1	Fix qwen-vl style (#10633 ) * update * update	2024-04-02 18:41:38 -07:00
binbin Deng	2bbd8a1548	LLM: fix llama2 FP16 & bs>1 & autotp on PVC and ARC (#10611 )	2024-04-03 09:28:04 +08:00
Jiao Wang	654dc5ba57	Fix Qwen-VL example problem (#10582 ) * update * update * update * update	2024-04-02 12:17:30 -07:00
Heyang Sun	b8b923ed04	move chown step to behind add script in qlora Dockerfile	2024-04-02 23:04:51 +08:00
Jason Dai	e184c480d2	Update WebUI Quickstart (#10630 )	2024-04-02 21:49:19 +08:00
Yuwen Hu	fd384ddfb8	Optimize StableLM (#10619 ) * Initial commit for stablelm optimizations * Small style fix * add dependency * Add mlp optimizations * Small fix * add attention forward * Remove quantize kv for now as head_dim=80 * Add merged qkv * fix lisence * Python style fix --------- Co-authored-by: qiuxin2012 <qiuxin2012cs@gmail.com>	2024-04-02 18:58:38 +08:00
binbin Deng	27be448920	LLM: add `cpu_embedding` and peak memory record for deepspeed autotp script (#10621 )	2024-04-02 17:32:50 +08:00
Yishuo Wang	ba8cc6bd68	optimize starcoder2-3b (#10625 )	2024-04-02 17:16:29 +08:00
Shaojun Liu	a10f5a1b8d	add python style check (#10620 ) * add python style check * fix style checks * update runner * add ipex-llm-finetune-qlora-cpu-k8s to manually_build workflow * update tag to 2.1.0-SNAPSHOT	2024-04-02 16:17:56 +08:00
Cengguang Zhang	58b57177e3	LLM: support bigdl quantize kv cache env and add warning. (#10623 ) * LLM: support bigdl quantize kv cache env and add warnning. * fix style. * fix comments.	2024-04-02 15:41:08 +08:00
Shaojun Liu	20a5e72da0	refine and verify ipex-llm-serving-xpu docker document (#10615 ) * refine serving on cpu/xpu * minor fix * replace localhost with 0.0.0.0 so that service can be accessed through ip address	2024-04-02 11:45:45 +08:00
Yuwen Hu	89d780f2e9	Small fix to install guide (#10618 )	2024-04-02 11:10:55 +08:00
Kai Huang	0a95c556a1	Fix starcoder first token perf (#10612 ) * add bias check * update	2024-04-02 09:21:38 +08:00
Cengguang Zhang	e567956121	LLM: add memory optimization for llama. (#10592 ) * add initial memory optimization. * fix logic. * fix logic, * remove env var check in mlp split.	2024-04-02 09:07:50 +08:00

2532 commits