Xiangyu Tian
5a5fd5af5b
LLM: Add speculative benchmark on CPU/XPU ( #10464 )
...
Add speculative benchmark on CPU/XPU.
2024-03-21 09:51:06 +08:00
Xiangyu Tian
cbe24cc7e6
LLM: Enable BigDL IPEX Int8 ( #10480 )
...
Enable BigDL IPEX Int8
2024-03-20 15:59:54 +08:00
Jin Qiao
0451103a43
LLM: add int4+fp16 benchmark script for windows benchmarking ( #10449 )
...
* LLM: add fp16 for benchmark script
* remove transformer_int4_fp16_loadlowbit_gpu_win
2024-03-19 11:11:25 +08:00
Xiangyu Tian
0ded0b4b13
LLM: Enable BigDL IPEX optimization for int4 ( #10319 )
...
Enable BigDL IPEX optimization for int4
2024-03-12 17:08:50 +08:00
binbin Deng
5d996a5caf
LLM: add benchmark script for deepspeed autotp on gpu ( #10380 )
2024-03-12 15:19:57 +08:00
Yuwen Hu
27d9a14989
[LLM] all-in-one update: memory optimize and streaming output ( #10302 )
...
* Memory saving for continuous in-out pair run and add support for streaming output on MTL iGPU
* Small fix
* Small fix
* Add things back
2024-03-01 18:02:30 +08:00
Keyan (Kyrie) Zhang
59861f73e5
Add Deepseek-6.7B ( #9991 )
...
* Add new example Deepseek
* Add new example Deepseek
* Add new example Deepseek
* Add new example Deepseek
* Add new example Deepseek
* modify deepseek
* modify deepseek
* Add verified model in README
* Turn cpu_embedding=True in Deepseek example
---------
Co-authored-by: Shengsheng Huang <shengsheng.huang@intel.com>
2024-02-28 11:36:39 +08:00
Yuwen Hu
001c13243e
[LLM] Add support for low_low_bit benchmark on Windows GPU ( #10167 )
...
* Add support for low_low_bit performance test on Windows GPU
* Small fix
* Small fix
* Save memory during converting model process
* Drop the first-run results when loading in low bit on MTL iGPU for better performance
* Small fix
2024-02-21 10:51:52 +08:00
Ziteng Zhang
8b08ad408b
Add batch_size in all_in_one ( #9999 )
...
Add batch_size in all_in_one, except run_native_int4
2024-01-25 17:43:49 +08:00
Ruonan Wang
b059a32fff
LLM: add benchmark api for bigdl-llm fp16 on GPU ( #9919 )
...
* add bmk for bigdl fp16
* fix
2024-01-17 14:24:35 +08:00
Ziteng Zhang
4f4ce73f31
[LLM] Add transformer_autocast_bf16 into all-in-one ( #9890 )
...
* Add transformer_autocast_bf16 into all-in-one
2024-01-11 17:51:07 +08:00
Yuwen Hu
3f4ad97929
[LLM] Add performance tests for windows iGPU ( #9584 )
...
* Add support for win gpu benchmark with peak gpu memory monitoring
* Add win igpu tests
* Small fix
* Forward outputs
* Small fix
* Test and small fixes
* Small fix
* Small fix and test
* Small fixes
* Add tests for 512-64 and change back to nightly tests
* Small fix
2023-12-04 20:50:02 +08:00
Heyang Sun
af94058203
[LLM] Support CPU deepspeed distributed inference ( #9259 )
...
* [LLM] Support CPU Deepspeed distributed inference
* Update run_deepspeed.py
* Rename
* fix style
* add new codes
* refine
* remove annotated codes
* refine
* Update README.md
* refine doc and example code
2023-11-06 17:56:42 +08:00
binbin Deng
770ac70b00
LLM: add low_bit option in benchmark scripts ( #9257 )
2023-10-25 10:27:48 +08:00
Ruonan Wang
4f34557224
LLM: support num_beams in all-in-one benchmark ( #9141 )
...
* support num_beams
* fix
2023-10-12 13:35:12 +08:00
Ruonan Wang
ad7d9231f5
LLM: add benchmark script for Max gpu and ipex fp16 gpu ( #9112 )
...
* add pvc bash
* meet code review
* rename to run-max-gpu.sh
2023-10-10 10:18:41 +08:00
Cengguang Zhang
cca84b0a64
LLM: update llm benchmark scripts. ( #8943 )
...
* update llm benchmark scripts.
* change tranformer_bf16 to pytorch_autocast_bf16.
* add autocast in transformer int4.
* revert autocast.
* add "pytorch_autocast_bf16" to doc
* fix comments.
2023-09-13 12:23:28 +08:00
binbin Deng
7897eb4b51
LLM: add benchmark scripts on GPU ( #8916 )
2023-09-07 18:08:17 +08:00
Xin Qiu
e9de9d9950
benchmark for native int4 ( #8918 )
...
* native4
* update
* update
* update
2023-09-07 15:56:15 +08:00
Xin Qiu
5d9942a3ca
transformer int4 and native int4's benchmark script for 32 256 1k 2k input ( #8871 )
...
* transformer
* move
* update
* add header
* update all-in-one
* clean up
2023-09-07 09:49:55 +08:00
Song Jiaming
7b3ac66e17
[LLM] auto performance test fix specific settings to template ( #8876 )
2023-09-01 15:49:04 +08:00
Song Jiaming
c06f1ca93e
[LLM] auto perf test to output to csv ( #8846 )
2023-09-01 10:48:00 +08:00