ipex-llm

Author	SHA1	Message	Date
xingyuan li	e57db777e0	[LLM] Setup.py & llm-cli update for windows vnni binary files (#8537 ) * update setup.py * update llm-cli	2023-07-17 12:28:38 +09:00
binbin Deng	f56b5ade4c	[LLM] Add more transformers int4 example (chatglm2) (#8539 )	2023-07-14 17:58:33 +08:00
binbin Deng	92d33cf35a	[LLM] Add more transformers int4 example (phoenix) (#8520 )	2023-07-14 17:58:04 +08:00
Yuwen Hu	e0f0def279	Remove unused example for now (#8538 )	2023-07-14 17:32:50 +08:00
binbin Deng	b397e40015	[LLM] Add more transformers int4 example (RedPajama) (#8523 )	2023-07-14 17:30:28 +08:00
Yuwen Hu	7bf3e10415	[LLM] Add more int4 transformers examples (MOSS) (#8532 ) * Add Moss example * Small fix	2023-07-14 16:41:41 +08:00
Yuwen Hu	59b7287ef5	[LLM] Add more transformers int4 example (Baichuan) (#8522 ) * Add example model Baichuan * Small updates to client windows settings * Small refactor * Small fix	2023-07-14 16:41:29 +08:00
Yuwen Hu	ca6e38607c	[LLM] Add more transformers examples (ChatGLM) (#8521 ) * Add example for chatglm v1 and other small fixes * Small fix * Small further fix * Small fix * Update based on comments & updates for client windows recommended settingts * Small fix * Small refactor * Small fix * Small fix * Small fix to dolly v1 * Small fix	2023-07-14 16:41:13 +08:00
xingyuan li	c87853233b	[LLM] Add windows vnni binary build step (#8518 ) * add windows vnni build step * update build info * add download command	2023-07-14 17:24:39 +09:00
Yishuo Wang	6320bf201e	LLM: fix memory access violation (#8519 )	2023-07-13 17:08:08 +08:00
xingyuan li	60c2c0c3dc	Bug fix for merged pr #8503 (#8516 )	2023-07-13 17:26:30 +09:00
Yuwen Hu	349bcb4bae	[LLM] Add more transformers int4 example (Dolly v1) (#8517 ) * Initial commit for dolly v1 * Add example for Dolly v1 and other small fix * Small output updates * Small fix * fix based on comments	2023-07-13 16:13:47 +08:00
Xin Qiu	90e3d86bce	rename low bit type name (#8512 ) * change qx_0 to sym_intx * update * fix typo * update * fix type * fix style * add python doc * meet code review * fix style	2023-07-13 15:53:31 +08:00
xingyuan li	4f152b4e3a	[LLM] Merge the llm.cpp build and the pypi release (#8503 ) * checkout llm.cpp to build new binary * use artifact to get latest built binary files * rename quantize * modify all release workflow	2023-07-13 16:34:24 +09:00
Yuwen Hu	bcde8ec83e	[LLM] Small fix to MPT Example (#8513 )	2023-07-13 14:33:21 +08:00
Zhao Changmin	ba0da17b40	LLM: Support AutoModelForSeq2SeqLM transformer API (#8449 ) * LLM: support AutoModelForSeq2SeqLM transformer API	2023-07-13 13:33:51 +08:00
Yishuo Wang	86b5938075	LLM: fix llm pybinding (#8509 )	2023-07-13 10:27:08 +08:00
Yuwen Hu	fcc352eee3	[LLM] Add more transformers_int4 examples (MPT) (#8498 ) * Update transformers_int4 readme, and initial commit for mpt * Update example for mpt * Small fix and recover transformers_int4_pipeline_readme.md for now * Update based on comments * Small fix * Small fix * Update based on comments	2023-07-13 09:41:16 +08:00
Zhao Changmin	23f6a4c21f	LLM: Optimize transformer int4 loading (#8499 ) * LLM: Optimize transformer int4 loading	2023-07-12 15:25:42 +08:00
Yishuo Wang	dd3f953288	Support vnni check (#8497 )	2023-07-12 10:11:15 +08:00
Xin Qiu	cd7a980ec4	Transformer int4 add qtype, support q4_1 q5_0 q5_1 q8_0 (#8481 ) * quant in Q4 5 8 * meet code review * update readme * style * update * fix error * fix error * update * fix style * update * Update README.md * Add load_in_low_bit	2023-07-12 08:23:08 +08:00
Yishuo Wang	db39d0a6b3	LLM: disable mmap by default for better performance (#8467 )	2023-07-11 09:26:26 +08:00
Yuwen Hu	52c6b057d6	Initial LLM Transformers example refactor (#8491 )	2023-07-10 17:53:57 +08:00
Junwei Deng	254a7aa3c4	bigdl-llm: add voice-assistant example that are migrated from langchain use-case document (#8468 )	2023-07-10 16:51:45 +08:00
Yishuo Wang	98bac815e4	specify numpy version (#8489 )	2023-07-10 16:50:16 +08:00
Zhao Changmin	81d655cda9	LLM: transformer int4 save and load (#8462 ) * LLM: transformer int4 save and load	2023-07-10 16:34:41 +08:00
binbin Deng	d489775d2c	LLM: fix inconsistency between output token number and `max_new_token` (#8479 )	2023-07-07 17:31:05 +08:00
Jason Dai	bcc1eae322	Llm readme update (#8472 )	2023-07-06 20:04:04 +08:00
Ruonan Wang	2f77d485d8	Llm: Initial support of langchain transformer int4 API (#8459 ) * first commit of transformer int4 and pipeline * basic examples temp save for embeddings support embeddings and docqa exaple * fix based on comment * small fix	2023-07-06 17:50:05 +08:00
binbin Deng	14626fe05b	LLM: refactor transformers and langchain class name (#8470 )	2023-07-06 17:16:44 +08:00
binbin Deng	70bc8ea8ae	LLM: update langchain and cpp-python style API examples (#8456 )	2023-07-06 14:36:42 +08:00
Ruonan Wang	64b38e1dc8	llm: benchmark tool for transformers int4 (separate 1st token and rest) (#8460 ) * add benchmark utils * fix * fix bug and add readme * hidden latency data	2023-07-06 09:49:52 +08:00
binbin Deng	77808fa124	LLM: fix n_batch in starcoder pybinding (#8461 )	2023-07-05 17:06:50 +08:00
Yina Chen	f2bb469847	[WIP] LLm llm-cli chat mode (#8440 ) * fix timezone * temp * Update linux interactive mode * modify init text for interactive mode * meet comments * update * win script * meet comments	2023-07-05 14:04:17 +08:00
binbin Deng	1970bcf14e	LLM: add readme for transformer examples (#8444 )	2023-07-04 17:25:58 +08:00
binbin Deng	e54e52b438	LLM: fix n_batch in bloom pybinding (#8454 )	2023-07-04 15:10:32 +08:00
Yuwen Hu	372c775cb4	[LLM] Change default runner for LLM Linux tests to the ones with AVX512 (#8448 ) * Basic change for AVX512 runner * Remove conda channel and action rename * Small fix * Small fix and reduce peak convert disk space * Define n_threads based on runner status * Small thread num fix * Define thread_num for cli * test * Add self-hosted label and other small fix	2023-07-04 14:53:03 +08:00
Jason Dai	edf23a95be	Update llm readme (#8446 )	2023-07-03 16:58:44 +08:00
Jason Dai	a38f927fc0	Update README.md (#8439 )	2023-07-03 14:59:55 +08:00
binbin Deng	c956a46c40	LLM: first fix example/transformers (#8438 )	2023-07-03 14:13:33 +08:00
Jason Dai	e5b384aaa2	Update README.md (#8437 )	2023-07-03 10:54:29 +08:00
Yang Wang	449aea7ffc	Optimize transformer int4 loading memory (#8400 ) * Optimize transformer int4 loading memory * move cast to convert * default settting low_cpu_mem_usage	2023-06-30 20:12:12 -07:00
Jason Dai	2da21163f8	Update llm README.md (#8431 )	2023-06-30 19:41:17 +08:00
Junwei Deng	2fd751de7a	LLM: add a dev tool for getting glibc/glibcxx requirement (#8399 ) * add a dev tool * pep8 change	2023-06-30 11:09:50 +08:00
binbin Deng	146662bc0d	LLM: fix langchain windows failure (#8417 )	2023-06-30 09:59:10 +08:00
Yina Chen	6251ad8934	[LLM]Windows unittest (#8356 ) * win-unittest * update * update * try llama 7b * delete llama * update * add red-3b * only test red-3b * revert * add langchain * add dependency * delete langchain	2023-06-29 14:03:12 +08:00
Yina Chen	783aea3309	[LLM] LLM windows daily test (#8328 ) * llm-win-init * test action * test * add types * update for schtasks * update pytests * update * update * update doc * use stable ckpt from ftp instead of the converted model * download using batch -> manually * add starcoder test	2023-06-28 15:02:11 +08:00
binbin Deng	ca5a4b6e3a	LLM: update bloom and starcoder usage in transformers_int4_pipeline (#8406 )	2023-06-28 13:15:50 +08:00
Zhao Changmin	cc76ec809a	check out dir (#8395 )	2023-06-27 21:28:39 +08:00
Ruonan Wang	4be784a49d	LLM: add UT for starcoder (convert, inference) update examples and readme (#8379 ) * first commit to add path * update example and readme * update path * fix * update based on comment	2023-06-27 12:12:11 +08:00

120 commits