ipex-llm

Author	SHA1	Message	Date
binbin Deng	770ac70b00	LLM: add `low_bit` option in benchmark scripts (#9257 )	2023-10-25 10:27:48 +08:00
WeiguangHan	ec9195da42	LLM: using html to visualize the perf result for Arc (#9228 ) * LLM: using html to visualize the perf result for Arc * deploy the html file * add python license * reslove some comments	2023-10-24 18:05:25 +08:00
Ruonan Wang	b15656229e	LLM: fix benchmark issue (#9255 )	2023-10-24 14:15:05 +08:00
WeiguangHan	b9194c5786	LLM: skip some model tests using certain api (#9163 ) * LLM: Skip some model tests using certain api * initialize variable named result	2023-10-18 09:39:27 +08:00
Ruonan Wang	4f34557224	LLM: support num_beams in all-in-one benchmark (#9141 ) * support num_beams * fix	2023-10-12 13:35:12 +08:00
Ruonan Wang	62ac7ae444	LLM: fix inaccurate input / output tokens of current all-in-one benchmark (#9137 ) * first fix * fix all apis * fix	2023-10-11 17:13:34 +08:00
Ruonan Wang	1c8d5da362	LLM: fix llama tokenizer for all-in-one benchmark (#9129 ) * fix tokenizer for gpu benchmark * fix ipex fp16 * meet code review * fix	2023-10-11 13:39:39 +08:00
Ruonan Wang	1363e666fc	LLM: update benchmark_util.py for beam search (#9126 ) * update reorder_cache * fix	2023-10-11 09:41:53 +08:00
Yuwen Hu	0e09dd926b	[LLM] Fix example test (#9118 ) * Update llm example test link due to example layout change * Add better change detect	2023-10-10 13:24:18 +08:00
Ruonan Wang	ad7d9231f5	LLM: add benchmark script for Max gpu and ipex fp16 gpu (#9112 ) * add pvc bash * meet code review * rename to run-max-gpu.sh	2023-10-10 10:18:41 +08:00
Yuwen Hu	65212451cc	[LLM] Small update to performance tests (#9106 ) * small updates to llm performance tests regarding model handling * Small fix	2023-10-09 16:55:25 +08:00
Kai Huang	78ea7ddb1c	Combine apply_rotary_pos_emb for gpt-neox (#9074 )	2023-10-07 16:27:46 +08:00
Cengguang Zhang	ad62c58b33	LLM: Enable jemalloc in benchmark scripts. (#9058 ) * enable jemalloc. * fix readme.	2023-09-26 15:37:49 +08:00
Cengguang Zhang	26213a5829	LLM: Change benchmark bf16 load format. (#9035 ) * LLM: Change benchmark bf16 load format. * comment on bf16 chatglm. * fix.	2023-09-22 17:38:38 +08:00
Kai Huang	6981745fe4	Optimize kv_cache for gpt-neox model family (#9015 ) * override gptneox * style * move to utils * revert	2023-09-20 19:59:19 +08:00
Xin Qiu	37bb0cbf8f	Speed up gpt-j in gpubenchmark (#9000 ) * Speedup gpt-j in gpubenchmark * meet code review	2023-09-19 14:22:28 +08:00
Cengguang Zhang	8299b68fea	update readme. (#8996 )	2023-09-18 17:06:15 +08:00
Cengguang Zhang	74338fd291	LLM: add auto torch dtype in benchmark. (#8981 )	2023-09-18 15:48:25 +08:00
Ruonan Wang	32716106e0	update use_cahce=True (#8986 )	2023-09-18 07:59:33 +08:00
Xin Qiu	64ee1d7689	update run_transformer_int4_gpu (#8983 ) * xpuperf * update run.py * clean upo * uodate * update * meet code review	2023-09-15 15:10:04 +08:00
Cengguang Zhang	cca84b0a64	LLM: update llm benchmark scripts. (#8943 ) * update llm benchmark scripts. * change tranformer_bf16 to pytorch_autocast_bf16. * add autocast in transformer int4. * revert autocast. * add "pytorch_autocast_bf16" to doc * fix comments.	2023-09-13 12:23:28 +08:00
Xin Qiu	ea0853c0b5	update benchmark_utils readme (#8925 ) * update readme * meet code review	2023-09-08 10:30:26 +08:00
Cengguang Zhang	3d2efe9608	LLM: update llm latency benchmark. (#8922 )	2023-09-07 19:00:19 +08:00
binbin Deng	7897eb4b51	LLM: add benchmark scripts on GPU (#8916 )	2023-09-07 18:08:17 +08:00
Xin Qiu	d8a01d7c4f	fix chatglm in run.pu (#8919 )	2023-09-07 16:44:10 +08:00
Xin Qiu	e9de9d9950	benchmark for native int4 (#8918 ) * native4 * update * update * update	2023-09-07 15:56:15 +08:00
Ruonan Wang	057e77e229	LLM: update benchmark_utils.py to handle do_sample=True (#8903 )	2023-09-07 14:20:47 +08:00
Xin Qiu	5d9942a3ca	transformer int4 and native int4's benchmark script for 32 256 1k 2k input (#8871 ) * transformer * move * update * add header * update all-in-one * clean up	2023-09-07 09:49:55 +08:00
Xin Qiu	49a39452c6	update benchmark (#8899 )	2023-09-06 15:11:43 +08:00
Song Jiaming	7b3ac66e17	[LLM] auto performance test fix specific settings to template (#8876 )	2023-09-01 15:49:04 +08:00
Song Jiaming	c06f1ca93e	[LLM] auto perf test to output to csv (#8846 )	2023-09-01 10:48:00 +08:00
Song Jiaming	b8b1b6888b	[LLM] Performance test (#8796 )	2023-08-25 14:31:45 +08:00
Ruonan Wang	e9aa2bd890	LLM: reduce GPU 1st token latency and update example (#8763 ) * reduce 1st token latency * update example * fix * fix style * update readme of gpu benchmark	2023-08-16 18:01:23 +08:00
Song Jiaming	c1f9af6d97	[LLM] chatglm example and transformers low-bit examples (#8751 )	2023-08-16 11:41:44 +08:00
Ruonan Wang	8805186f2f	LLM: add benchmark tool for gpu (#8760 ) * add benchmark tool for gpu * update	2023-08-16 11:22:10 +08:00
Song Jiaming	e717e304a6	LLM first example test and template (#8658 )	2023-08-10 10:03:11 +08:00
Ruonan Wang	64b38e1dc8	llm: benchmark tool for transformers int4 (separate 1st token and rest) (#8460 ) * add benchmark utils * fix * fix bug and add readme * hidden latency data	2023-07-06 09:49:52 +08:00
Junwei Deng	2fd751de7a	LLM: add a dev tool for getting glibc/glibcxx requirement (#8399 ) * add a dev tool * pep8 change	2023-06-30 11:09:50 +08:00
Shengsheng Huang	02c583144c	[LLM] langchain integrations and examples (#8256 ) * langchain intergrations and examples * add licences and rename * add licences * fix license issues and change backbone to model_family * update examples to use model_family param * fix linting * fix code style * exclude langchain integration from stylecheck * update langchain examples and update integrations based on latets changes * update simple llama-cpp-python style API example * remove bloom in README * change default n_threads to 2 and remove redundant code --------- Co-authored-by: leonardozcm <changmin.zhao@intel.com>	2023-06-12 19:22:07 +08:00
Pingchuan Ma (Henry)	773255e009	[LLM] Add dev wheel building and basic UT script for LLM package on Linux (#8264 ) * add wheel build for linux * test fix * test self-hosted runner * test fix * update runner * update runner * update fix * init cicd * init cicd * test conda * update fix * update no need manual python deps * test fix bugs * test fix bugs * test fix bugs * fix bugs	2023-06-08 00:49:57 +08:00
Pingchuan Ma (Henry)	2ed5842448	[LLM] add convert's python deps for LLM (#8260 ) * add python deps for LLM * update release.sh * change deps group name * update all * fix update * test fix * update	2023-06-06 16:01:17 +08:00
Pingchuan Ma (Henry)	c48d5f7cff	[LLM] Enable UT workflow logics for LLM (#8243 ) * check push connection * enable UT workflow logics for LLM * test fix * add licenses * test fix according to suggestions * test fix * update changes	2023-06-02 17:06:35 +08:00
Pingchuan Ma (Henry)	141febec1f	Add dev wheel building script for LLM package on Windows (#8238 ) * Add dev wheel building script for LLM package on Windows * delete conda * delete python version check * minor adjust * wheel name fixed * test check * test fix * change wheel name	2023-06-01 11:55:26 +08:00
binbin Deng	8421af51ae	LLM: support converting to ggml format (#8235 ) * add convert * fix * fix * fix * try * test * update check * fix * fix	2023-05-31 15:20:06 +08:00
Pingchuan Ma (Henry)	1f913a6941	[LLM] Add LLM pep8 coding style checking (#8233 ) * add LLM pep8 coding checking * resolve bugs in testing scripts and code style revision	2023-05-30 15:58:14 +08:00

145 commits