Commit graph

1499 commits

Author SHA1 Message Date
Wang
fdc0e838df Merge remote-tracking branch 'upstream/main' 2023-09-26 15:45:31 +08:00
Wang
b17e536a1b recover file 2023-09-26 15:45:03 +08:00
Cengguang Zhang
ad62c58b33 LLM: Enable jemalloc in benchmark scripts. (#9058)
* enable jemalloc.

* fix readme.
2023-09-26 15:37:49 +08:00
Wang
9e03c5c7fc manually build 2023-09-26 15:28:01 +08:00
Wang
2dc76dc358 manually build 2023-09-26 15:15:15 +08:00
Lilac09
ecee02b34d Add bigdl llm xpu image build (#9062)
* modify Dockerfile

* add README.md

* add README.md

* Modify Dockerfile

* Add bigdl inference cpu image build

* Add bigdl llm cpu image build

* Add bigdl llm cpu image build

* Add bigdl llm cpu image build

* Modify Dockerfile

* Add bigdl inference cpu image build

* Add bigdl inference cpu image build

* Add bigdl llm xpu image build
2023-09-26 14:29:03 +08:00
Wang
d0ac0941a2 Add bigdl llm xpu image build 2023-09-26 14:25:10 +08:00
Wang
781bc5bc8d Add bigdl inference cpu image build 2023-09-26 14:07:36 +08:00
Wang
390c90551e Add bigdl inference cpu image build 2023-09-26 14:03:55 +08:00
Wang
7a69bee8d0 Modify Dockerfile 2023-09-26 13:58:42 +08:00
Wang
47996c29e4 Merge remote-tracking branch 'upstream/main' 2023-09-26 13:56:27 +08:00
Lilac09
9ac950fa52 Add bigdl llm cpu image build (#9047)
* modify Dockerfile

* add README.md

* add README.md

* Modify Dockerfile

* Add bigdl inference cpu image build

* Add bigdl llm cpu image build

* Add bigdl llm cpu image build

* Add bigdl llm cpu image build
2023-09-26 13:22:11 +08:00
Wang
a50c11d326 Modify Dockerfile 2023-09-26 11:19:13 +08:00
Ziteng Zhang
a717352c59 Replace Llama 7b with Llama2-7b in README.md (#9055)
* Replace Llama 7b with Llama2-7b in README.md

Need to replace the base model with Llama2-7b as we are operating on Llama2 here.

* Replace Llama 7b with Llama2-7b in README.md

a Llama 7b in the first line was missed

* Update architecture graph

---------

Co-authored-by: Heyang Sun <60865256+Uxito-Ada@users.noreply.github.com>
2023-09-26 09:56:46 +08:00
Guancheng Fu
cc84ed70b3 Create serving images (#9048)
* Finished & Tested

* Install latest pip from base images

* Add blank line

* Delete unused comment

* fix typos
2023-09-25 15:51:45 +08:00
Wang
847af63e8e Add bigdl llm cpu image build 2023-09-25 15:33:39 +08:00
Wang
7f2d2a5238 Add bigdl llm cpu image build 2023-09-25 15:14:23 +08:00
Wang
9cae4600da Add bigdl llm cpu image build 2023-09-25 14:45:30 +08:00
Wang
ceed895c31 Add bigdl inference cpu image build 2023-09-25 14:31:43 +08:00
Cengguang Zhang
b4a1266ef0 [WIP] LLM: add kv cache support for internlm. (#9036)
* LLM: add kv cache support for internlm

* add internlm apply_rotary_pos_emb

* fix.

* fix style.
2023-09-25 14:16:59 +08:00
Wang
fc8bf6b0d5 Modify Dockerfile 2023-09-25 14:05:08 +08:00
Wang
e8f436453d Merge remote-tracking branch 'upstream/main' 2023-09-25 13:59:19 +08:00
Ruonan Wang
975da86e00 LLM: fix gptneox kv cache (#9044) 2023-09-25 13:03:57 +08:00
Heyang Sun
4b843d1dbf change lora-model output behavior on k8s (#9038)
Co-authored-by: leonardozcm <leonardo1997zcm@gmail.com>
2023-09-25 09:28:44 +08:00
Cengguang Zhang
26213a5829 LLM: Change benchmark bf16 load format. (#9035)
* LLM: Change benchmark bf16 load format.

* comment on bf16 chatglm.

* fix.
2023-09-22 17:38:38 +08:00
JinBridge
023555fb1f LLM: Add one-click installer for Windows (#8999)
* LLM: init one-click installer for windows

* LLM: fix typo in one-click installer readme

* LLM: one-click installer try except logic

* LLM: one-click installer add dependency

* LLM: one-click installer adjust README.md

* LLM: one-click installer split README and add zip compress in setup.bat

* LLM: one-click installer verified internlm and llama2 and replace gif

* LLM: remove one-click installer images

* LLM: finetune the one-click installer README.md

* LLM: fix typo in one-click installer README.md

* LLM: rename one-click installer to portable executable

* LLM: rename other places to portable executable

* LLM: rename the zip filename to executable

* LLM: update .gitignore

* LLM: add colorama to setup.bat
2023-09-22 14:46:30 +08:00
Jiao Wang
028a6d9383 MPT model optimize for long sequence (#9020)
* mpt_long_seq

* update

* update

* update

* style

* style2

* update
2023-09-21 21:27:23 -07:00
Lilac09
9126abdf9b add README.md for bigdl-llm-cpu image (#9026)
* modify Dockerfile

* add README.md

* add README.md
2023-09-22 09:03:57 +08:00
Ruonan Wang
b943d73844 LLM: refactor kv cache (#9030)
* refactor utils

* meet code review; update all models

* small fix
2023-09-21 21:28:03 +08:00
Cengguang Zhang
868511cf02 LLM: fix kv cache issue of bloom and falcon. (#9029) 2023-09-21 18:12:20 +08:00
Ruonan Wang
bf51ec40b2 LLM: Fix empty cache (#9024)
* fix

* fix

* update example
2023-09-21 17:16:07 +08:00
Wang
f985068491 add README.md 2023-09-21 16:58:37 +08:00
Yina Chen
714884414e fix error (#9025) 2023-09-21 16:42:11 +08:00
Wang
8ca46d004f add README.md 2023-09-21 16:34:07 +08:00
Wang
ca0e86062e modify Dockerfile 2023-09-21 16:32:09 +08:00
binbin Deng
edb225530b add bark (#9016) 2023-09-21 12:24:58 +08:00
SONG Ge
fa47967583 [LLM] Optimize kv_cache for gptj model family (#9010)
* optimize gptj model family attention

* add license and comment for dolly-model

* remove xpu mentioned

* remove useless info

* code style

* style fix

* code style in gptj fix

* remove gptj arch

* move apply_rotary_pos_emb into utils

* kv_seq_length update

* use hidden_states instead of query layer to get batch size
2023-09-21 10:42:08 +08:00
Guancheng Fu
3913ba4577 add README.md (#9004) 2023-09-21 10:32:56 +08:00
Cengguang Zhang
b3cad7de57 LLM: add bloom kv cache support (#9012)
* LLM: add bloom kv cache support

* fix style.
2023-09-20 21:10:53 +08:00
Kai Huang
156af15d1e Add NF3 (#9008)
* add nf3

* grammar
2023-09-20 20:03:07 +08:00
Kai Huang
6981745fe4 Optimize kv_cache for gpt-neox model family (#9015)
* override gptneox

* style

* move to utils

* revert
2023-09-20 19:59:19 +08:00
JinBridge
48b503c630 LLM: add example of aquila (#9006)
* LLM: add example of aquila

* LLM: replace AquilaChat with Aquila

* LLM: shorten prompt of aquila example
2023-09-20 15:52:56 +08:00
Cengguang Zhang
735a17f7b4 LLM: add kv cache to falcon family. (#8995)
* add kv cache to falcon family.

* fix: import error.

* refactor

* update comments.

* add two version falcon attention forward.

* fix

* fix.

* fix.

* fix.

* fix style.

* fix style.
2023-09-20 15:36:30 +08:00
Ruonan Wang
94a7f8917b LLM: fix optimized kv cache for baichuan-13b (#9009)
* fix baichuan 13b

* fix style

* fix

* fix style
2023-09-20 15:30:14 +08:00
Yang Wang
c88f6ec457 Experiment XPU QLora Finetuning (#8937)
* Support xpu finetuning

* support xpu finetuning

* fix style

* fix style

* fix style

* refine example

* add readme

* refine readme

* refine api

* fix fp16

* fix example

* refactor

* fix style

* fix compute type

* add qlora

* refine training args

* fix example

* fix style

* fast path for inference

* address comments

* refine readme

* revert lint
2023-09-19 10:15:44 -07:00
Jason Dai
51518e029d Update llm readme (#9005) 2023-09-19 20:01:33 +08:00
Ruonan Wang
249386261c LLM: add Baichuan2 cpu example (#9002)
* add baichuan2 cpu examples

* add link

* update prompt
2023-09-19 18:08:30 +08:00
Yuwen Hu
c389e1323d fix xpu performance tests by making sure that latest bigdl-core-xe is installed (#9001) 2023-09-19 17:33:30 +08:00
Guancheng Fu
b6c9198d47 Add xpu image for bigdl-llm (#9003)
* Add xpu image

* fix

* fix

* fix format
2023-09-19 16:56:22 +08:00
Ruonan Wang
004c45c2be LLM: Support optimized kv_cache for baichuan family (#8997)
* add initial support for baichuan attention

* support baichuan1

* update based on comment

* update based on comment

* support baichuan2

* update link, change how to judge baichuan2

* fix style

* add model parameter for pos emb

* update based on comment
2023-09-19 15:38:54 +08:00