ipex-llm

Author	SHA1	Message	Date
Jason Dai	dcadd09154	Update llm document (#8784 )	2023-08-21 22:34:44 +08:00
Yishuo Wang	611c1fb628	[LLM] change default n_threads of native int4 langchain API (#8779 )	2023-08-21 13:30:12 +08:00
Yishuo Wang	3d1f2b44f8	LLM: change default n_threads of native int4 models (#8776 )	2023-08-18 15:46:19 +08:00
Yishuo Wang	2ba2133613	fix starcoder chinese output (#8773 )	2023-08-18 13:37:02 +08:00
binbin Deng	548f7a6cf7	LLM: update convert of llama family to support llama2-70B (#8747 )	2023-08-18 09:30:35 +08:00
Yina Chen	4afea496ab	support q8_0 (#8765 )	2023-08-17 15:06:36 +08:00
Ruonan Wang	e9aa2bd890	LLM: reduce GPU 1st token latency and update example (#8763 ) * reduce 1st token latency * update example * fix * fix style * update readme of gpu benchmark	2023-08-16 18:01:23 +08:00
binbin Deng	06609d9260	LLM: add qwen example on arc (#8757 )	2023-08-16 17:11:08 +08:00
SONG Ge	f4164e4492	[BigDL LLM] Update readme for unifying transformers API (#8737 ) * update readme doc * fix readthedocs error * update comment * update exception error info * invalidInputError instead * fix readme typo error and remove import error * fix more typo	2023-08-16 14:22:32 +08:00
Song Jiaming	c1f9af6d97	[LLM] chatglm example and transformers low-bit examples (#8751 )	2023-08-16 11:41:44 +08:00
Ruonan Wang	8805186f2f	LLM: add benchmark tool for gpu (#8760 ) * add benchmark tool for gpu * update	2023-08-16 11:22:10 +08:00
binbin Deng	97283c033c	LLM: add falcon example on arc (#8742 )	2023-08-15 17:38:38 +08:00
binbin Deng	8c55911308	LLM: add baichuan-13B on arc example (#8755 )	2023-08-15 15:07:04 +08:00
binbin Deng	be2ae6eb7c	LLM: fix langchain native int4 voiceassistant example (#8750 )	2023-08-14 17:23:33 +08:00
Ruonan Wang	d28ad8f7db	LLM: add whisper example for arc transformer int4 (#8749 ) * add whisper example for arc int4 * fix	2023-08-14 17:05:48 +08:00
Yishuo Wang	77844125f2	[LLM] Support chatglm cache (#8745 )	2023-08-14 15:10:46 +08:00
Ruonan Wang	faaccb64a2	LLM: add chatglm2 example for Arc (#8741 ) * add chatglm2 example * update * fix readme	2023-08-14 10:43:08 +08:00
binbin Deng	b10d7e1adf	LLM: add mpt example on arc (#8723 )	2023-08-14 09:40:01 +08:00
binbin Deng	e9a1afffc5	LLM: add internlm example on arc (#8722 )	2023-08-14 09:39:39 +08:00
SONG Ge	aceea4dc29	[LLM] Unify Transformers and Native API (#8713 ) * re-open pr to run on latest runner * re-add examples and ut * rename ut and move deprecate to warning instead of raising an error info * ut fix	2023-08-11 19:45:47 +08:00
Yishuo Wang	f91035c298	[LLM] fix chatglm native int4 emoji output (#8739 )	2023-08-11 15:38:41 +08:00
binbin Deng	77efcf7b1d	LLM: fix ChatGLM2 native int4 stream output (#8733 )	2023-08-11 14:51:50 +08:00
Ruonan Wang	ca3e59a1dc	LLM: support stop for starcoder native int4 stream (#8734 )	2023-08-11 14:51:30 +08:00
Song Jiaming	e292dfd970	[WIP] LLM transformers api for langchain (#8642 )	2023-08-11 13:32:35 +08:00
Yishuo Wang	3d5a7484a2	[LLM] fix bloom and starcoder memory release (#8728 )	2023-08-11 11:18:19 +08:00
xingyuan li	02ec01cb48	[LLM] Add bigdl-core-xe dependency when installing bigdl-llm[xpu] (#8716 ) * add bigdl-core-xe dependency	2023-08-10 17:41:42 +09:00
Shengsheng Huang	7c56c39e36	Fix GPU examples README to use bigdl-core-xe (#8714 ) * Update README.md * Update README.md	2023-08-10 12:53:49 +08:00
Yina Chen	6d1ca88aac	add voice assistant example (#8711 )	2023-08-10 12:42:14 +08:00
Song Jiaming	e717e304a6	LLM first example test and template (#8658 )	2023-08-10 10:03:11 +08:00
Ruonan Wang	1a7b698a83	[LLM] support ipex arc int4 & add basic llama2 example (#8700 ) * first support of xpu * make it work on gpu update setup update add GPU llama2 examples add use_optimize flag to disable optimize for gpu fix style update gpu example readme fix * update example, and update env * fix setup to add cpp files * replace jit with aot to avoid data leak * rename to bigdl-core-xe * update installation in example readme	2023-08-09 22:20:32 +08:00
Jason Dai	d03218674a	Update llm readme (#8703 )	2023-08-09 14:47:26 +08:00
Kai Huang	1b65288bdb	Add api doc for LLM (#8605 ) * api doc initial * update desc	2023-08-08 18:17:16 +08:00
binbin Deng	4c44153584	LLM: add Qwen transformers int4 example (#8699 )	2023-08-08 11:23:09 +08:00
Yishuo Wang	710b9b8982	[LLM] add linux chatglm pybinding binary file (#8698 )	2023-08-08 11:16:30 +08:00
binbin Deng	ea5d7aff5b	LLM: add chatglm native int4 transformers API (#8695 )	2023-08-07 17:52:47 +08:00
Yishuo Wang	6da830cf7e	[LLM] add chatglm pybinding binary file in setup.py (#8692 )	2023-08-07 09:41:03 +08:00
Cengguang Zhang	ebcf75d506	feat: set transformers lib version. (#8683 )	2023-08-04 15:01:59 +08:00
Yishuo Wang	ef08250c21	[LLM] chatglm pybinding support (#8672 )	2023-08-04 14:27:29 +08:00
Yishuo Wang	5837cc424a	[LLM] add chatglm pybinding binary file release (#8677 )	2023-08-04 11:45:27 +08:00
Yang Wang	b6468bac43	optimize chatglm2 long sequence (#8662 ) * add chatglm2 * optimize a little * optimize chatglm long sequence * fix style * address comments and fix style * fix bug	2023-08-03 17:56:24 -07:00
Yang Wang	3407f87075	Fix llama kv cache bug (#8674 )	2023-08-03 17:54:55 -07:00
Yina Chen	59903ea668	llm linux support avx & avx2 (#8669 )	2023-08-03 17:10:59 +08:00
xingyuan li	110cfb5546	[LLM] Remove old windows nightly test code (#8668 ) Remove old Windows nightly test code triggered by task scheduler Add new Windows nightly workflow for nightly testing	2023-08-03 17:12:23 +09:00
xingyuan li	610084e3c0	[LLM] Complete windows unittest (#8611 ) * add windows nightly test workflow * use github runner to run pr test * model load should use lowbit * remove tmp dir after testing	2023-08-03 14:48:42 +09:00
binbin Deng	a15a2516e6	add (#8659 )	2023-08-03 10:12:10 +08:00
Xin Qiu	0714888705	build windows avx dll (#8657 ) * windows avx * add to actions	2023-08-03 02:06:24 +08:00
Yina Chen	119bf6d710	[LLM] Support linux cpp dynamic load .so (#8655 ) * support linux cpp dynamic load .so * update cli	2023-08-02 20:15:45 +08:00
Zhao Changmin	ca998cc6f2	LLM: Mute shape mismatch output (#8601 ) * LLM: Mute shape mismatch output	2023-08-02 16:46:22 +08:00
Zhao Changmin	04c713ef06	LLM: Disable transformer api `pretraining_tp` (#8645 ) * disable pretraining_tp	2023-08-02 11:26:01 +08:00
binbin Deng	6fc31bb4cf	LLM: first update descriptions for ChatGLM transformers int4 example (#8646 )	2023-08-02 11:00:56 +08:00

211 commits