ipex-llm

Author	SHA1	Message	Date
Yishuo Wang	a232c5aa21	[LLM] add protobuf in bigdl-llm dependency (#8861 )	2023-08-31 15:23:31 +08:00
Heyang Sun	b1ac8dc1bc	BF16 Lora Finetuning on K8S with OneCCL and Intel MPI (#8775 ) * BF16 Lora Finetuning on K8S with OneCCL and Intel MPI * Update README.md * format * refine * Update README.md * refine * Update README.md * increase nfs volume size to improve IO performance * fix bugs * Update README.md * Update README.md * fix permission * move output destination * Update README.md * fix wrong base model name in doc * fix output path in entrypoint * add a permission-precreated output dir * format * move output logs to a persistent storage	2023-08-31 14:56:23 +08:00
xingyuan li	de6c6bb17f	[LLM] Downgrade amx build gcc version and remove avx flag display (#8856 ) * downgrade to gcc 11 * remove avx display	2023-08-31 14:08:13 +09:00
Yang Wang	3b4f4e1c3d	Fix llama attention optimization for XPU (#8855 ) * Fix llama attention optimization fo XPU * fix chatglm2 * fix typo	2023-08-30 21:30:49 -07:00
Shengsheng Huang	7b566bf686	[LLM] add new API for optimize any pytorch models (#8827 ) * add new API for optimize any pytorch models * change test util name * revise API and update UT * fix python style * update ut config, change default value * change defaults, disable ut transcribe	2023-08-30 19:41:53 +08:00
Xin Qiu	8eca982301	windows add env (#8852 )	2023-08-30 15:54:52 +08:00
Zhao Changmin	731916c639	LLM: Enable attempting loading method automatically (#8841 ) * enable auto load method * warning error * logger info --------- Co-authored-by: leonardozcm <leonardozcm@gmail.com>	2023-08-30 15:41:55 +08:00
Yishuo Wang	bba73ec9d2	[LLM] change chatglm native int4 checkpoint name (#8851 )	2023-08-30 15:05:19 +08:00
Wang Jian	954ef954b6	[PPML] Add occlum llm image munually build (#8849 )	2023-08-30 11:31:47 +08:00
Yina Chen	55e705a84c	[LLM] Support the rest of AutoXXX classes in Transformers API (#8815 ) * add transformers auto models * fix	2023-08-30 11:16:14 +08:00
Zhao Changmin	887018b0f2	Update ut save&load (#8847 ) Co-authored-by: leonardozcm <leonardozcm@gmail.com>	2023-08-30 10:32:57 +08:00
Yina Chen	3462fd5c96	Add arc gpt-j example (#8840 )	2023-08-30 10:31:24 +08:00
Ruonan Wang	f42c0bad1b	LLM: update GPU doc (#8845 )	2023-08-30 09:24:19 +08:00
Jason Dai	aab7deab1f	Reorganize GPU examples (#8844 )	2023-08-30 08:32:08 +08:00
Yang Wang	a386ad984e	Add Data Center GPU Flex Series to Readme (#8835 ) * Add Data Center GPU Flex Series to Readme * remove * update starcoder	2023-08-29 11:19:09 -07:00
Yishuo Wang	7429ea0606	[LLM] support transformer int4 + amx int4 (#8838 )	2023-08-29 17:27:18 +08:00
Ruonan Wang	ddff7a6f05	Update readme of GPU to specify oneapi version(#8820 )	2023-08-29 13:14:22 +08:00
xingyuan li	67052198eb	[LLM] Build with multiprocess (#8797 ) * build with multiprocess	2023-08-29 10:49:52 +09:00
Zhao Changmin	bb31d4fe80	LLM: Implement hf `low_cpu_mem_usage` with 1xbinary file peak memory on transformer int4 (#8731 ) * 1x peak memory	2023-08-29 09:33:17 +08:00
Jiao Wang	5d90ca2dac	Update MPI Estimator to support Pytorch IPEX training (#8303 ) * update * update * update * update * update * update with comments * update * update * style * style * add doc * style * style	2023-08-28 11:03:29 -07:00
Yina Chen	35fdf94031	[LLM]Arc starcoder example (#8814 ) * arc starcoder example init * add log * meet comments	2023-08-28 16:48:00 +08:00
xingyuan li	6a902b892e	[LLM] Add amx build step (#8822 ) * add amx build step	2023-08-28 17:41:18 +09:00
Ruonan Wang	eae92bc7da	llm: quick fix path (#8810 )	2023-08-25 16:02:31 +08:00
Ruonan Wang	0186f3ab2f	llm: update all ARC int4 examples (#8809 ) * update GPU examples * update other examples * fix * update based on comment	2023-08-25 15:26:10 +08:00
Song Jiaming	b8b1b6888b	[LLM] Performance test (#8796 )	2023-08-25 14:31:45 +08:00
Yang Wang	9d0f6a8cce	rename math.py in example to avoid conflict (#8805 )	2023-08-24 21:06:31 -07:00
SONG Ge	d2926c7672	[LLM] Unify Langchain Native and Transformers LLM API (#8752 ) * deprecate BigDLNativeTransformers and add specific LMEmbedding method * deprecate and add LM methods for langchain llms * add native params to native langchain * new imple for embedding * move ut from bigdlnative to casual llm * rename embeddings api and examples update align with usage updating * docqa example hot-fix * add more api docs * add langchain ut for starcoder * support model_kwargs for transformer methods when calling causalLM and add ut * ut fix for transformers embedding * update for langchain causal supporting transformers * remove model_family in readme doc * add model_families params to support more models * update api docs and remove chatglm embeddings for now * remove chatglm embeddings in examples * new refactor for ut to add bloom and transformers llama ut * disable llama transformers embedding ut	2023-08-25 11:14:21 +08:00
binbin Deng	5582872744	LLM: update chatglm example to be more friendly for beginners (#8795 )	2023-08-25 10:55:01 +08:00
Yina Chen	7c37424a63	Fix voice assistant example input error on Linux (#8799 ) * fix linux error * update * remove alsa log	2023-08-25 10:47:27 +08:00
xingyuan li	9537194b4b	[LLM] Fix llm test workflow repeatedly download model files	2023-08-25 11:20:46 +09:00
Yang Wang	bf3591e2ff	Optimize chatglm2 for bf16 (#8725 ) * make chatglm works with bf16 * fix style * support chatglm v1 * fix style * fix style * add chatglm2 file	2023-08-24 10:04:25 -07:00
Jin Hanyu	a73a3e5ff9	Fix bugs in manually_build_for_testing.yml. (#8792 )	2023-08-23 15:49:23 +08:00
xingyuan li	c94bdd3791	[LLM] Merge windows & linux nightly test (#8756 ) * fix download statement * add check before build wheel * use curl to upload files * windows unittest won't upload converted model * split llm-cli test into windows & linux versions * update tempdir create way * fix nightly converted model name * windows llm-cli starcoder test temply disabled * remove taskset dependency * rename llm_unit_tests_linux to llm_unit_tests	2023-08-23 12:48:41 +09:00
Jason Dai	dcadd09154	Update llm document (#8784 )	2023-08-21 22:34:44 +08:00
Yishuo Wang	611c1fb628	[LLM] change default n_threads of native int4 langchain API (#8779 )	2023-08-21 13:30:12 +08:00
Yishuo Wang	3d1f2b44f8	LLM: change default n_threads of native int4 models (#8776 )	2023-08-18 15:46:19 +08:00
Yishuo Wang	2ba2133613	fix starcoder chinese output (#8773 )	2023-08-18 13:37:02 +08:00
binbin Deng	548f7a6cf7	LLM: update convert of llama family to support llama2-70B (#8747 )	2023-08-18 09:30:35 +08:00
Yina Chen	4afea496ab	support q8_0 (#8765 )	2023-08-17 15:06:36 +08:00
Shaojun Liu	394304b918	Re organize llm test (#8766 ) * run llm-example-test in llm-nightly-test.yml * comment out the schedule event	2023-08-17 09:42:25 +08:00
Ruonan Wang	e9aa2bd890	LLM: reduce GPU 1st token latency and update example (#8763 ) * reduce 1st token latency * update example * fix * fix style * update readme of gpu benchmark	2023-08-16 18:01:23 +08:00
binbin Deng	06609d9260	LLM: add qwen example on arc (#8757 )	2023-08-16 17:11:08 +08:00
SONG Ge	f4164e4492	[BigDL LLM] Update readme for unifying transformers API (#8737 ) * update readme doc * fix readthedocs error * update comment * update exception error info * invalidInputError instead * fix readme typo error and remove import error * fix more typo	2023-08-16 14:22:32 +08:00
Song Jiaming	c1f9af6d97	[LLM] chatglm example and transformers low-bit examples (#8751 )	2023-08-16 11:41:44 +08:00
Ruonan Wang	8805186f2f	LLM: add benchmark tool for gpu (#8760 ) * add benchmark tool for gpu * update	2023-08-16 11:22:10 +08:00
binbin Deng	97283c033c	LLM: add falcon example on arc (#8742 )	2023-08-15 17:38:38 +08:00
binbin Deng	8c55911308	LLM: add baichuan-13B on arc example (#8755 )	2023-08-15 15:07:04 +08:00
Shaojie Cui	0a8db3abe0	[PPML]refactor python toolkit (#8740 ) * add dependency and example * fix stage 3 * downgrade protobuf * reduce epc memory * add script * Readme reduction * delete unused note	2023-08-15 10:11:53 +08:00
binbin Deng	be2ae6eb7c	LLM: fix langchain native int4 voiceasistant example (#8750 )	2023-08-14 17:23:33 +08:00
Ruonan Wang	d28ad8f7db	LLM: add whisper example for arc transformer int4 (#8749 ) * add whisper example for arc int4 * fix	2023-08-14 17:05:48 +08:00

1530 commits