ipex-llm

Author	SHA1	Message	Date
binbin Deng	f97cce2642	Fix import error of ds autotp (#11307 )	2024-06-13 16:22:52 +08:00
Ruonan Wang	986af21896	fix perf test (#11295 )	2024-06-13 10:35:48 +08:00
Ruonan Wang	14b1e6b699	Fix gguf_q4k (#11293 ) * update embedding parameter * update benchmark	2024-06-12 20:43:08 +08:00
Yuwen Hu	fac49f15e3	Remove manual importing ipex in all-in-one benchmark (#11272 )	2024-06-11 09:32:13 +08:00
Shaojun Liu	85df5e7699	fix nightly perf test (#11251 )	2024-06-07 09:33:14 +08:00
hxsz1997	b6234eb4e2	Add task in allinone (#11226 ) * add task * update prompt * modify typos * add more cases in summarize * Make the summarize & QA prompt preprocessing as a util function	2024-06-06 17:22:40 +08:00
Wenjing Margaret Mao	231b968aba	Modify the check_results.py to support batch 2&4 (#11133 ) * add batch 2&4 and exclude to perf_test * modify the perf-test&437 yaml * modify llm_performance_test.yml * remove batch 4 * modify check_results.py to support batch 2&4 * change the batch_size format * remove genxir * add str(batch_size) * change actual_test_casese in check_results file to support batch_size * change html highlight * less models to test html and html_path * delete the moe model * split batch html * split * use installing from pypi * use installing from pypi - batch2 * revert cpp * revert cpp * merge two jobs into one, test batch_size in one job * merge two jobs into one, test batch_size in one job * change file directory in workflow * try catch deal with odd file without batch_size * modify pandas version * change the dir * organize the code * organize the code * remove Qwen-MOE * modify based on feedback * modify based on feedback * modify based on second round of feedback * modify based on second round of feedback + change run-arc.sh mode * modify based on second round of feedback + revert config * modify based on second round of feedback + revert config * modify based on second round of feedback + remove comments * modify based on second round of feedback + remove comments * modify based on second round of feedback + revert arc-perf-test * modify based on third round of feedback * change error type * change error type * modify check_results.html * split batch into two folders * add all models * move csv_name * revert pr test * revert pr test --------- Co-authored-by: Yishuo Wang <yishuo.wang@intel.com>	2024-06-05 15:04:55 +08:00
Kai Huang	f93664147c	Update config.yaml (#11208 ) * update config.yaml * fix * minor * style	2024-06-04 19:58:18 +08:00
Yina Chen	711fa0199e	Fix fp6k phi3 ppl core dump (#11204 )	2024-06-04 16:44:27 +08:00
Cengguang Zhang	3eb13ccd8c	LLM: fix input length condition in deepspeed all-in-one benchmark. (#11185 )	2024-06-03 10:05:43 +08:00
ZehuaCao	4127b99ed6	Fix null pointer dereferences error. (#11125 ) * delete unused function on tgi_server * update * update * fix style	2024-05-30 16:16:10 +08:00
hxsz1997	62b2d8af6b	Add lookahead in all-in-one (#11142 ) * add lookahead in allinone * delete save to csv in run_transformer_int4_gpu * change lookup to lookahead * fix the error of add model.peak_memory * Set transformer_int4_gpu as the default option * add comment of transformer_int4_fp16_lookahead_gpu	2024-05-28 15:39:58 +08:00
Zhao Changmin	15d906a97b	Update linux igpu run script (#11098 ) * update run script	2024-05-22 17:18:07 +08:00
Kai Huang	f63172ef63	Align ppl with llama.cpp (#11055 ) * update script * remove * add header * update readme	2024-05-22 16:43:11 +08:00
Wang, Jian4	74950a152a	Fix tgi_api_server error file name (#11075 )	2024-05-20 16:48:40 +08:00
Wang, Jian4	d9f71f1f53	Update benchmark util for example using (#11027 ) * mv benchmark_util.py to utils/ * remove * update	2024-05-15 14:16:35 +08:00
binbin Deng	4053a6ef94	Update environment variable setting in AutoTP with arc (#11018 )	2024-05-15 10:23:58 +08:00
Shaojun Liu	7f8c5b410b	Quickstart: Run PyTorch Inference on Intel GPU using Docker (on Linux or WSL) (#10970 ) * add entrypoint.sh * add quickstart * remove entrypoint * update * Install related library of benchmarking * update * print out results * update docs * minor update * update * update quickstart * update * update * update * update * update * update * add chat & example section * add more details * minor update * rename quickstart * update * minor update * update * update config.yaml * update readme * use --gpu * add tips * minor update * update	2024-05-14 12:58:31 +08:00
ZehuaCao	99255fe36e	fix ppl (#10996 )	2024-05-13 13:57:19 +08:00
Xin Qiu	dfa3147278	update (#10944 )	2024-05-08 14:28:05 +08:00
Cengguang Zhang	0edef1f94c	LLM: add min_new_tokens to all in one benchmark. (#10911 )	2024-05-06 09:32:59 +08:00
Yuwen Hu	1a8a93d5e0	Further fix nightly perf (#10901 )	2024-04-28 10:18:58 +08:00
Yuwen Hu	ddfdaec137	Fix nightly perf (#10899 ) * Fix nightly perf by adding default value in benchmark for use_fp16_torch_dtype * further fixes	2024-04-28 09:39:29 +08:00
binbin Deng	f51bf018eb	Add benchmark script for pipeline parallel inference (#10873 )	2024-04-26 15:28:11 +08:00
Cengguang Zhang	cd369c2715	LLM: add device id to benchmark utils. (#10877 )	2024-04-25 14:01:51 +08:00
Cengguang Zhang	eb39c61607	LLM: add min new token to perf test. (#10869 )	2024-04-24 14:32:02 +08:00
yb-peng	c9dee6cd0e	Update 8192.txt (#10824 ) * Update 8192.txt * Update 8192.txt with original text	2024-04-23 14:02:09 +08:00
Wang, Jian4	23c6a52fb0	LLM: Fix ipex torchscript=True error (#10832 ) * remove * update * remove torchscript	2024-04-22 15:53:09 +08:00
Kai Huang	053ec30737	Transformers ppl evaluation on wikitext (#10784 ) * transformers code * cache	2024-04-18 15:27:18 +08:00
hxsz1997	0d518aab8d	Merge pull request #10697 from MargarettMao/ceval combine english and chinese, remove nan	2024-04-12 14:37:47 +08:00
jenniew	cdbb1de972	Mark Color Modification	2024-04-12 14:00:50 +08:00
yb-peng	2685c41318	Modify all-in-one benchmark (#10726 ) * Update 8192 prompt in all-in-one * Add cpu_embedding param for linux api * Update run.py * Update README.md	2024-04-11 13:38:50 +08:00
Wenjing Margaret Mao	289cc99cd6	Update README.md (#10700 ) Edit "summarize the results"	2024-04-09 16:01:12 +08:00
Wenjing Margaret Mao	d3116de0db	Update README.md (#10701 ) edit "summarize the results"	2024-04-09 15:50:25 +08:00
Chen, Zhentao	d59e0cce5c	Migrate harness to ipexllm (#10703 ) * migrate to ipexlm * fix workflow * fix run_multi * fix precision map * rename ipexlm to ipexllm * rename bigdl to ipex in comments	2024-04-09 15:48:53 +08:00
jenniew	591bae092c	combine english and chinese, remove nan	2024-04-08 19:37:51 +08:00
yb-peng	2d88bb9b4b	add test api transformer_int4_fp16_gpu (#10627 ) * add test api transformer_int4_fp16_gpu * update config.yaml and README.md in all-in-one * modify run.py in all-in-one * re-order test-api * re-order test-api in config * modify README.md in all-in-one * modify README.md in all-in-one * modify config.yaml --------- Co-authored-by: pengyb2001 <arda@arda-arc21.sh.intel.com> Co-authored-by: ivy-lv11 <zhicunlv@gmail.com>	2024-04-07 15:47:17 +08:00
Wang, Jian4	9ad4b29697	LLM: CPU benchmark using tcmalloc (#10675 )	2024-04-07 14:17:01 +08:00
binbin Deng	d9a1153b4e	LLM: upgrade deepspeed in AutoTP on GPU (#10647 )	2024-04-07 14:05:19 +08:00
binbin Deng	27be448920	LLM: add `cpu_embedding` and peak memory record for deepspeed autotp script (#10621 )	2024-04-02 17:32:50 +08:00
Shaojun Liu	a10f5a1b8d	add python style check (#10620 ) * add python style check * fix style checks * update runner * add ipex-llm-finetune-qlora-cpu-k8s to manually_build workflow * update tag to 2.1.0-SNAPSHOT	2024-04-02 16:17:56 +08:00
Ruonan Wang	d6af4877dd	LLM: remove ipex.optimize for gpt-j (#10606 ) * remove ipex.optimize * fix * fix	2024-04-01 12:21:49 +08:00
WeiguangHan	fbeb10c796	LLM: Set different env based on different Linux kernels (#10566 )	2024-03-27 17:56:33 +08:00
Ruonan Wang	ea4bc450c4	LLM: add esimd sdp for pvc (#10543 ) * add esimd sdp for pvc * update * fix * fix batch	2024-03-26 19:04:40 +08:00
Shaojun Liu	c563b41491	add nightly_build workflow (#10533 ) * add nightly_build workflow * add create-job-status-badge action * update * update * update * update setup.py * release * revert	2024-03-26 12:47:38 +08:00
Wang, Jian4	16b2ef49c6	Update_document by heyang (#30 )	2024-03-25 10:06:02 +08:00
Wang, Jian4	9df70d95eb	Refactor bigdl.llm to ipex_llm (#24 ) * Rename bigdl/llm to ipex_llm * rm python/llm/src/bigdl * from bigdl.llm to from ipex_llm	2024-03-22 15:41:21 +08:00
binbin Deng	85ef3f1d99	LLM: add empty cache in deepspeed autotp benchmark script (#10488 )	2024-03-21 10:51:23 +08:00
Xiangyu Tian	5a5fd5af5b	LLM: Add speculative benchmark on CPU/XPU (#10464 ) Add speculative benchmark on CPU/XPU.	2024-03-21 09:51:06 +08:00
Xiangyu Tian	cbe24cc7e6	LLM: Enable BigDL IPEX Int8 (#10480 ) Enable BigDL IPEX Int8	2024-03-20 15:59:54 +08:00

187 commits