ipex-llm

Author	SHA1	Message	Date
Yishuo Wang	710b9b8982	[LLM] add linux chatglm pybinding binary file (#8698 )	2023-08-08 11:16:30 +08:00
binbin Deng	ea5d7aff5b	LLM: add chatglm native int4 transformers API (#8695 )	2023-08-07 17:52:47 +08:00
Yishuo Wang	6da830cf7e	[LLM] add chaglm pybinding binary file in setup.py (#8692 )	2023-08-07 09:41:03 +08:00
Cengguang Zhang	ebcf75d506	feat: set transformers lib version. (#8683 )	2023-08-04 15:01:59 +08:00
Yishuo Wang	ef08250c21	[LLM] chatglm pybinding support (#8672 )	2023-08-04 14:27:29 +08:00
Yishuo Wang	5837cc424a	[LLM] add chatglm pybinding binary file release (#8677 )	2023-08-04 11:45:27 +08:00
Yang Wang	b6468bac43	optimize chatglm2 long sequence (#8662 ) * add chatglm2 * optimize a little * optimize chatglm long sequence * fix style * address comments and fix style * fix bug	2023-08-03 17:56:24 -07:00
Yang Wang	3407f87075	Fix llama kv cache bug (#8674 )	2023-08-03 17:54:55 -07:00
Yina Chen	59903ea668	llm linux support avx & avx2 (#8669 )	2023-08-03 17:10:59 +08:00
xingyuan li	110cfb5546	[LLM] Remove old windows nightly test code (#8668 ) Remove old Windows nightly test code triggered by task scheduler Add new Windows nightly workflow for nightly testing	2023-08-03 17:12:23 +09:00
xingyuan li	610084e3c0	[LLM] Complete windows unittest (#8611 ) * add windows nightly test workflow * use github runner to run pr test * model load should use lowbit * remove tmp dir after testing	2023-08-03 14:48:42 +09:00
binbin Deng	a15a2516e6	add (#8659 )	2023-08-03 10:12:10 +08:00
Xin Qiu	0714888705	build windows avx dll (#8657 ) * windows avx * add to actions	2023-08-03 02:06:24 +08:00
Yina Chen	119bf6d710	[LLM] Support linux cpp dynamic load .so (#8655 ) * support linux cpp dynamic load .so * update cli	2023-08-02 20:15:45 +08:00
Zhao Changmin	ca998cc6f2	LLM: Mute shape mismatch output (#8601 ) * LLM: Mute shape mismatch output	2023-08-02 16:46:22 +08:00
Zhao Changmin	04c713ef06	LLM: Disable transformer api `pretraining_tp` (#8645 ) * disable pretraining_tp	2023-08-02 11:26:01 +08:00
binbin Deng	6fc31bb4cf	LLM: first update descriptions for ChatGLM transformers int4 example (#8646 )	2023-08-02 11:00:56 +08:00
Yang Wang	cbeae97a26	Optimize Llama Attention to to reduce KV cache memory copy (#8580 ) * Optimize llama attention to reduce KV cache memory copy * fix bug * fix style * remove git * fix style * fix style * fix style * fix tests * move llama attention to another file * revert * fix style * remove jit * fix	2023-08-01 16:37:58 -07:00
binbin Deng	39994738d1	LLM: add chat & stream chat example for ChatGLM2 transformers int4 (#8636 )	2023-08-01 14:57:45 +08:00
xingyuan li	cdfbe652ca	[LLM] Add chatglm support for llm-cli (#8641 ) * add chatglm build * add llm-cli support * update git * install cmake * add ut for chatglm * add files to setup * fix bug cause permission error when sf lack file	2023-08-01 14:30:17 +09:00
Zhao Changmin	d6cbfc6d2c	LLM: Add requirements in whisper example (#8644 ) * LLM: Add requirements in whisper example	2023-08-01 12:07:14 +08:00
Zhao Changmin	3e10260c6d	LLM: llm-convert support chatglm family (#8643 ) * convert chatglm	2023-08-01 11:16:18 +08:00
Yina Chen	a607972c0b	[LLM]LLM windows load -api.dll (#8631 ) * temp * update * revert setup.py	2023-07-31 13:47:20 +08:00
xingyuan li	3361b66449	[LLM] Revert llm-cli to disable selecting executables on Windows (#8630 ) * revert vnni file select * revert setup.py * add model-api.dll	2023-07-31 11:15:44 +09:00
binbin Deng	3dbab9087b	LLM: add llama2-7b native int4 example (#8629 )	2023-07-28 10:56:16 +08:00
binbin Deng	fb32fefcbe	LLM: support tensor input of native int4 `generate` (#8620 )	2023-07-27 17:59:49 +08:00
Zhao Changmin	5b484ab48d	LLM: Support load_low_bit loading models in shards format (#8612 ) * shards_model --------- Co-authored-by: leonardozcm <leonaordo1997zcm@gmail.com>	2023-07-26 13:30:01 +08:00
binbin Deng	fcf8c085e3	LLM: add llama2-13b native int4 example (#8613 )	2023-07-26 10:12:52 +08:00
Song Jiaming	650b82fa6e	[LLM] add CausalLM and Speech UT (#8597 )	2023-07-25 11:22:36 +08:00
Zhao Changmin	af201052db	avoid malloc all missing keys in fp32 (#8600 )	2023-07-25 09:48:51 +08:00
binbin Deng	3f24202e4c	[LLM] Add more transformers int4 example (Llama 2) (#8602 )	2023-07-25 09:21:12 +08:00
Jason Dai	0f8201c730	llm readme update (#8595 )	2023-07-24 09:47:49 +08:00
Yuwen Hu	ba42a6da63	[LLM] Set torch_dtype default value to 'auto' for transformers low bit from_pretrained API	2023-07-21 17:55:00 +08:00
Yuwen Hu	bbde423349	[LLM] Add current Linux UT inference tests to nightly tests (#8578 ) * Add current inference uts to nightly tests * Change test model from chatglm-6b to chatglm2-6b * Add thread num env variable for nightly test * Fix urls * Small fix	2023-07-21 13:26:38 +08:00
Yang Wang	feb3af0567	Optimize transformer int4 memory footprint (#8579 )	2023-07-20 20:22:13 -07:00
Yang Wang	57e880f63a	[LLM] use pytorch linear for large input matrix (#8492 ) * use pytorch linear for large input matrix * only works on server * fix style * optimize memory * first check server * revert * address comments * fix style	2023-07-20 09:54:25 -07:00
Yuwen Hu	6504e31a97	Small fix (#8577 )	2023-07-20 16:37:04 +08:00
Yuwen Hu	2266ca7d2b	[LLM] Small updates to transformers int4 ut (#8574 ) * Small fix to transformers int4 ut * Small fix	2023-07-20 13:20:25 +08:00
xingyuan li	7b8d9c1b0d	[LLM] Add dependency file check in setup.py (#8565 ) * add package file check	2023-07-20 14:20:08 +09:00
Song Jiaming	411d896636	LLM first transformers UT (#8514 ) * ut * transformers api first ut * name * dir issue * use chatglm instead of chatglm2 * omp * set omp in sh * source * taskset * test * test omp * add test	2023-07-20 10:16:27 +08:00
Yuwen Hu	cad78740a7	[LLM] Small fixes to the Whisper transformers INT4 example (#8573 ) * Small fixes to the whisper example * Small fix * Small fix	2023-07-20 10:11:33 +08:00
binbin Deng	7a9fdf74df	[LLM] Add more transformers int4 example (Dolly v2) (#8571 ) * add * add trust_remote_mode	2023-07-19 18:20:16 +08:00
Zhao Changmin	e680af45ea	LLM: Optimize Langchain Pipeline (#8561 ) * LLM: Optimize Langchain Pipeline * load in low bit	2023-07-19 17:43:13 +08:00
Shengsheng Huang	616b7cb0a2	add more langchain examples (#8542 ) * update langchain descriptions * add mathchain example * update readme * update readme	2023-07-19 17:42:18 +08:00
binbin Deng	457571b44e	[LLM] Add more transformers int4 example (InternLM) (#8557 )	2023-07-19 15:15:38 +08:00
xingyuan li	b6510fa054	fix move/download dll step (#8564 )	2023-07-19 12:17:07 +09:00
xingyuan li	c52ed37745	fix starcoder dll name (#8563 )	2023-07-19 11:55:06 +09:00
Zhao Changmin	3dbe3bf18e	transformer_int4 (#8553 )	2023-07-19 08:33:58 +08:00
Zhao Changmin	49d636e295	[LLM] whisper model transformer int4 verification and example (#8511 ) * LLM: transformer api support * va * example * revert * pep8 * pep8	2023-07-19 08:33:20 +08:00
Yina Chen	9a7bc17ca1	[LLM] llm supports vnni link on windows (#8543 ) * support win vnni link * fix style * fix style * use isa_checker * fix * typo * fix * update	2023-07-18 16:43:45 +08:00

1 2 3 4

178 commits