ipex-llm

Author	SHA1	Message	Date
Guoqiong Song	e8c5645067	add LLM example of aquila on GPU (#9056 ) * aquila, dolly-v1, dolly-v2, vacuna	2023-10-10 17:01:35 -07:00
binbin Deng	5e9962b60e	LLM: update example layout (#9046 )	2023-10-09 15:36:39 +08:00
Yang Wang	88565c76f6	add export merged model example (#9018 ) * add export merged model example * add sources * add script * fix style	2023-10-04 21:18:52 -07:00
Ruonan Wang	b943d73844	LLM: refactor kv cache (#9030 ) * refactor utils * meet code review; update all models * small fix	2023-09-21 21:28:03 +08:00
Ruonan Wang	bf51ec40b2	LLM: Fix empty cache (#9024 ) * fix * fix * update example	2023-09-21 17:16:07 +08:00
binbin Deng	edb225530b	add bark (#9016 )	2023-09-21 12:24:58 +08:00
JinBridge	48b503c630	LLM: add example of aquila (#9006 ) * LLM: add example of aquila * LLM: replace AquilaChat with Aquila * LLM: shorten prompt of aquila example	2023-09-20 15:52:56 +08:00
Yang Wang	c88f6ec457	Experiment XPU QLora Finetuning (#8937 ) * Support xpu finetuning * support xpu finetuning * fix style * fix style * fix style * refine example * add readme * refine readme * refine api * fix fp16 * fix example * refactor * fix style * fix compute type * add qlora * refine training args * fix example * fix style * fast path forinference * address comments * refine readme * revert lint	2023-09-19 10:15:44 -07:00
Jason Dai	51518e029d	Update llm readme (#9005 )	2023-09-19 20:01:33 +08:00
Ruonan Wang	249386261c	LLM: add Baichuan2 cpu example (#9002 ) * add baichuan2 cpu examples * add link * update prompt	2023-09-19 18:08:30 +08:00
binbin Deng	c1d25a51a8	LLM: add `optimize_model` example for bert (#8975 )	2023-09-18 16:18:35 +08:00
Ruonan Wang	cabe7c0358	LLM: add baichuan2 example for arc (#8994 ) * add baichuan2 examples * add link * small fix	2023-09-18 14:32:27 +08:00
JinBridge	c12b8f24b6	LLM: add use_cache=True for all gpu examples (#8971 )	2023-09-15 09:54:38 +08:00
binbin Deng	be29c75c18	LLM: refactor gpu examples (#8963 ) * restructure * change to hf-transformers-models/	2023-09-13 14:47:47 +08:00
Ruonan Wang	4de73f592e	LLM: add gpu example of chinese-llama-2-7b (#8960 ) * add gpu example of chinese -llama2 * update model name and link * update name	2023-09-13 10:16:51 +08:00
binbin Deng	2d81521019	LLM: add `optimize_model` examples for llama2 and chatglm (#8894 ) * add llama2 and chatglm optimize_model examples * update default usage * update command and some descriptions * move folder and remove general_int4 descriptions * change folder name	2023-09-12 10:36:29 +08:00
Yuwen Hu	ca35c93825	[LLM] Fix langchain UT (#8929 ) * Change dependency version for langchain uts * Downgrade pandas version instead; and update example readme accordingly	2023-09-08 13:51:04 +08:00
Zhao Changmin	8bc1d8a17c	LLM: Fix discards in `optimize_model` with non-hf models and add openai whisper example (#8877 ) * openai-whisper	2023-09-07 10:35:59 +08:00
Yina Chen	bfc71fbc15	Add known issue in arc voice assistant example (#8902 ) * add known issue in voice assistant example * update cpu	2023-09-07 09:28:26 +08:00
Yina Chen	74a2c2ddf5	Update optimize_model=True in llama2 chatglm2 arc examples (#8878 ) * add optimize_model=True in llama2 chatglm2 examples * add ipex optimize in gpt-j example	2023-09-05 10:35:37 +08:00
Zhao Changmin	9c652fbe95	LLM: Whisper long segment recognize example (#8826 ) * LLM: Long segment recognize example	2023-08-31 16:41:25 +08:00
Yina Chen	3462fd5c96	Add arc gpt-j example (#8840 )	2023-08-30 10:31:24 +08:00
Ruonan Wang	f42c0bad1b	LLM: update GPU doc (#8845 )	2023-08-30 09:24:19 +08:00
Jason Dai	aab7deab1f	Reorganize GPU examples (#8844 )	2023-08-30 08:32:08 +08:00
Yang Wang	a386ad984e	Add Data Center GPU Flex Series to Readme (#8835 ) * Add Data Center GPU Flex Series to Readme * remove * update starcoder	2023-08-29 11:19:09 -07:00
Ruonan Wang	ddff7a6f05	Update readme of GPU to specify oneapi version(#8820 )	2023-08-29 13:14:22 +08:00
Yina Chen	35fdf94031	[LLM]Arc starcoder example (#8814 ) * arc starcoder example init * add log * meet comments	2023-08-28 16:48:00 +08:00
Ruonan Wang	eae92bc7da	llm: quick fix path (#8810 )	2023-08-25 16:02:31 +08:00
Ruonan Wang	0186f3ab2f	llm: update all ARC int4 examples (#8809 ) * update GPU examples * update other examples * fix * update based on comment	2023-08-25 15:26:10 +08:00
Yang Wang	9d0f6a8cce	rename math.py in example to avoid conflict (#8805 )	2023-08-24 21:06:31 -07:00
SONG Ge	d2926c7672	[LLM] Unify Langchain Native and Transformers LLM API (#8752 ) * deprecate BigDLNativeTransformers and add specific LMEmbedding method * deprecate and add LM methods for langchain llms * add native params to native langchain * new imple for embedding * move ut from bigdlnative to casual llm * rename embeddings api and examples update align with usage updating * docqa example hot-fix * add more api docs * add langchain ut for starcoder * support model_kwargs for transformer methods when calling causalLM and add ut * ut fix for transformers embedding * update for langchain causal supporting transformers * remove model_family in readme doc * add model_families params to support more models * update api docs and remove chatglm embeddings for now * remove chatglm embeddings in examples * new refactor for ut to add bloom and transformers llama ut * disable llama transformers embedding ut	2023-08-25 11:14:21 +08:00
binbin Deng	5582872744	LLM: update chatglm example to be more friendly for beginners (#8795 )	2023-08-25 10:55:01 +08:00
Yina Chen	7c37424a63	Fix voice assistant example input error on Linux (#8799 ) * fix linux error * update * remove alsa log	2023-08-25 10:47:27 +08:00
Ruonan Wang	e9aa2bd890	LLM: reduce GPU 1st token latency and update example (#8763 ) * reduce 1st token latency * update example * fix * fix style * update readme of gpu benchmark	2023-08-16 18:01:23 +08:00
binbin Deng	06609d9260	LLM: add qwen example on arc (#8757 )	2023-08-16 17:11:08 +08:00
Song Jiaming	c1f9af6d97	[LLM] chatglm example and transformers low-bit examples (#8751 )	2023-08-16 11:41:44 +08:00
binbin Deng	97283c033c	LLM: add falcon example on arc (#8742 )	2023-08-15 17:38:38 +08:00
binbin Deng	8c55911308	LLM: add baichuan-13B on arc example (#8755 )	2023-08-15 15:07:04 +08:00
binbin Deng	be2ae6eb7c	LLM: fix langchain native int4 voiceasistant example (#8750 )	2023-08-14 17:23:33 +08:00
Ruonan Wang	d28ad8f7db	LLM: add whisper example for arc transformer int4 (#8749 ) * add whisper example for arc int4 * fix	2023-08-14 17:05:48 +08:00
Ruonan Wang	faaccb64a2	LLM: add chatglm2 example for Arc (#8741 ) * add chatglm2 example * update * fix readme	2023-08-14 10:43:08 +08:00
binbin Deng	b10d7e1adf	LLM: add mpt example on arc (#8723 )	2023-08-14 09:40:01 +08:00
binbin Deng	e9a1afffc5	LLM: add internlm example on arc (#8722 )	2023-08-14 09:39:39 +08:00
SONG Ge	aceea4dc29	[LLM] Unify Transformers and Native API (#8713 ) * re-open pr to run on latest runner * re-add examples and ut * rename ut and move deprecate to warning instead of raising an error info * ut fix	2023-08-11 19:45:47 +08:00
Shengsheng Huang	7c56c39e36	Fix GPU examples READ to use bigdl-core-xe (#8714 ) * Update README.md * Update README.md	2023-08-10 12:53:49 +08:00
Yina Chen	6d1ca88aac	add voice assistant example (#8711 )	2023-08-10 12:42:14 +08:00
Ruonan Wang	1a7b698a83	[LLM] support ipex arc int4 & add basic llama2 example (#8700 ) * first support of xpu * make it works on gpu update setup update add GPU llama2 examples add use_optimize flag to disbale optimize for gpu fix style update gpu exmaple readme fix * update example, and update env * fix setup to add cpp files * replace jit with aot to avoid data leak * rename to bigdl-core-xe * update installation in example readme	2023-08-09 22:20:32 +08:00
binbin Deng	4c44153584	LLM: add Qwen transformers int4 example (#8699 )	2023-08-08 11:23:09 +08:00
binbin Deng	6fc31bb4cf	LLM: first update descriptions for ChatGLM transformers int4 example (#8646 )	2023-08-02 11:00:56 +08:00
binbin Deng	39994738d1	LLM: add chat & stream chat example for ChatGLM2 transformers int4 (#8636 )	2023-08-01 14:57:45 +08:00

97 commits