JIN Qiao | 1a1ddc4144 | 2023-10-12 13:42:14 +08:00
    LLM: Add Replit CPU and GPU example (#9028)

JIN Qiao | d74834ff4c | 2023-10-12 13:41:48 +08:00
    LLM: add gpu pytorch-models example llama2 and chatglm2 (#9142)

Ruonan Wang | 4f34557224 | 2023-10-12 13:35:12 +08:00
    LLM: support num_beams in all-in-one benchmark (#9141)
    * support num_beams
    * fix

Ruonan Wang | 62ac7ae444 | 2023-10-11 17:13:34 +08:00
    LLM: fix inaccurate input / output tokens of current all-in-one benchmark (#9137)
    * first fix
    * fix all apis
    * fix

binbin Deng | eb3fb18eb4 | 2023-10-11 15:03:39 +08:00
    LLM: improve PyTorch API doc (#9128)

binbin Deng | 995b0f119f | 2023-10-11 14:23:56 +08:00
    LLM: update some gpu examples (#9136)
Ruonan Wang | 1c8d5da362 | 2023-10-11 13:39:39 +08:00
    LLM: fix llama tokenizer for all-in-one benchmark (#9129)
    * fix tokenizer for gpu benchmark
    * fix ipex fp16
    * meet code review
    * fix

binbin Deng | 2ad67a18b1 | 2023-10-11 13:38:15 +08:00
    LLM: add mistral examples (#9121)

Ruonan Wang | 1363e666fc | 2023-10-11 09:41:53 +08:00
    LLM: update benchmark_util.py for beam search (#9126)
    * update reorder_cache
    * fix

Guoqiong Song | e8c5645067 | 2023-10-10 17:01:35 -07:00
    add LLM example of aquila on GPU (#9056)
    * aquila, dolly-v1, dolly-v2, vicuna

Ruonan Wang | 388f688ef3 | 2023-10-10 15:02:48 +08:00
    LLM: update setup.py to add bigdl-core-xe package (#9122)

Zhao Changmin | 1709beba5b | 2023-10-10 14:57:23 +08:00
    LLM: Explicitly close pickle file pointer before removing temporary directory (#9120)
    * fp close

Yuwen Hu | 0e09dd926b | 2023-10-10 13:24:18 +08:00
    [LLM] Fix example test (#9118)
    * Update llm example test link due to example layout change
    * Add better change detection
Ruonan Wang | ad7d9231f5 | 2023-10-10 10:18:41 +08:00
    LLM: add benchmark script for Max gpu and ipex fp16 gpu (#9112)
    * add pvc bash
    * meet code review
    * rename to run-max-gpu.sh

binbin Deng | e4d1457a70 | 2023-10-10 09:31:00 +08:00
    LLM: improve transformers style API doc (#9113)

Yuwen Hu | 65212451cc | 2023-10-09 16:55:25 +08:00
    [LLM] Small update to performance tests (#9106)
    * small updates to llm performance tests regarding model handling
    * Small fix

Zhao Changmin | edccfb2ed3 | 2023-10-09 15:49:15 +08:00
    LLM: Check model device type (#9092)
    * check model device

binbin Deng | 5e9962b60e | 2023-10-09 15:36:39 +08:00
    LLM: update example layout (#9046)

Yina Chen | 4c4f8d1663 | 2023-10-09 15:09:37 +08:00
    [LLM] Fix Arc falcon abnormal output issue (#9096)
    * update
    * update
    * fix error & style
    * fix style
    * update train
    * to input_seq_size

Zhao Changmin | 548e4dd5fe | 2023-10-09 11:13:44 +08:00
    LLM: Adapt transformers models for optimize model SL (#9022)
    * LLM: Adapt transformers model for SL
Ruonan Wang | f64257a093 | 2023-10-09 11:05:17 +08:00
    LLM: basic api support for esimd fp16 (#9067)
    * basic api support for fp16
    * fix style
    * fix
    * fix error and style
    * fix style
    * meet code review
    * update based on comments

JIN Qiao | 65373d2a8b | 2023-10-09 10:51:19 +08:00
    LLM: adjust portable zip content (#9054)
    * LLM: adjust portable zip content
    * LLM: adjust portable zip README

Xin Qiu | b3e94a32d4 | 2023-10-08 09:23:28 +08:00
    change log4error import (#9098)

Kai Huang | 78ea7ddb1c | 2023-10-07 16:27:46 +08:00
    Combine apply_rotary_pos_emb for gpt-neox (#9074)

Yang Wang | 36dd4afd61 | 2023-10-06 13:27:37 -07:00
    Fix llama when rope scaling is not None (#9086)
    * Fix llama when rope scaling is not None
    * fix style
    * fix style

Yang Wang | fcb1c618a0 | 2023-10-06 09:57:29 -07:00
    using bigdl-llm fused rope for llama (#9066)
    * optimize llama xpu rope
    * fix bug
    * fix style
    * refine append cache
    * remove check
    * do not cache cos sin
    * remove unnecessary changes
    * clean up
    * fix style
    * check for training
Jiao Wang | aefa5a5bfe | 2023-10-05 11:59:17 -07:00
    Qwen kv cache (#9079)
    * qwen and aquila
    * update
    * update
    * style

Jiao Wang | d5ca1f32b6 | 2023-10-05 11:10:57 -07:00
    Aquila KV cache optimization (#9080)
    * update
    * update
    * style

Yang Wang | 88565c76f6 | 2023-10-04 21:18:52 -07:00
    add export merged model example (#9018)
    * add export merged model example
    * add sources
    * add script
    * fix style

Yang Wang | 0cd8f1c79c | 2023-10-04 21:04:55 -07:00
    Use ipex fused rms norm for llama (#9081)
    * also apply rmsnorm
    * fix cpu

Cengguang Zhang | fb883100e7 | 2023-09-28 14:04:52 +08:00
    LLM: support chatglm-18b convert attention forward in benchmark scripts. (#9072)
    * add chatglm-18b convert.
    * fix if statement.
    * fix

Yishuo Wang | 6de2189e90 | 2023-09-28 11:23:37 +08:00
    [LLM] fix chatglm main choice (#9073)
Cengguang Zhang | ad62c58b33 | 2023-09-26 15:37:49 +08:00
    LLM: Enable jemalloc in benchmark scripts. (#9058)
    * enable jemalloc.
    * fix readme.

Cengguang Zhang | b4a1266ef0 | 2023-09-25 14:16:59 +08:00
    [WIP] LLM: add kv cache support for internlm. (#9036)
    * LLM: add kv cache support for internlm
    * add internlm apply_rotary_pos_emb
    * fix.
    * fix style.

Ruonan Wang | 975da86e00 | 2023-09-25 13:03:57 +08:00
    LLM: fix gptneox kv cache (#9044)

Cengguang Zhang | 26213a5829 | 2023-09-22 17:38:38 +08:00
    LLM: Change benchmark bf16 load format. (#9035)
    * LLM: Change benchmark bf16 load format.
    * comment on bf16 chatglm.
    * fix.

JinBridge | 023555fb1f | 2023-09-22 14:46:30 +08:00
    LLM: Add one-click installer for Windows (#8999)
    * LLM: init one-click installer for windows
    * LLM: fix typo in one-click installer readme
    * LLM: one-click installer try except logic
    * LLM: one-click installer add dependency
    * LLM: one-click installer adjust README.md
    * LLM: one-click installer split README and add zip compress in setup.bat
    * LLM: one-click installer verified internlm and llama2 and replace gif
    * LLM: remove one-click installer images
    * LLM: finetune the one-click installer README.md
    * LLM: fix typo in one-click installer README.md
    * LLM: rename one-click installer to portable executable
    * LLM: rename other places to portable executable
    * LLM: rename the zip filename to executable
    * LLM: update .gitignore
    * LLM: add colorama to setup.bat
Jiao Wang | 028a6d9383 | 2023-09-21 21:27:23 -07:00
    MPT model optimize for long sequence (#9020)
    * mpt_long_seq
    * update
    * update
    * update
    * style
    * style2
    * update

Ruonan Wang | b943d73844 | 2023-09-21 21:28:03 +08:00
    LLM: refactor kv cache (#9030)
    * refactor utils
    * meet code review; update all models
    * small fix

Cengguang Zhang | 868511cf02 | 2023-09-21 18:12:20 +08:00
    LLM: fix kv cache issue of bloom and falcon. (#9029)

Ruonan Wang | bf51ec40b2 | 2023-09-21 17:16:07 +08:00
    LLM: Fix empty cache (#9024)
    * fix
    * fix
    * update example

Yina Chen | 714884414e | 2023-09-21 16:42:11 +08:00
    fix error (#9025)

binbin Deng | edb225530b | 2023-09-21 12:24:58 +08:00
    add bark (#9016)

SONG Ge | fa47967583 | 2023-09-21 10:42:08 +08:00
    [LLM] Optimize kv_cache for gptj model family (#9010)
    * optimize gptj model family attention
    * add license and comment for dolly-model
    * remove xpu mentioned
    * remove useless info
    * code style
    * style fix
    * code style in gptj fix
    * remove gptj arch
    * move apply_rotary_pos_emb into utils
    * kv_seq_length update
    * use hidden_states instead of query layer to reach batch size
Cengguang Zhang | b3cad7de57 | 2023-09-20 21:10:53 +08:00
    LLM: add bloom kv cache support (#9012)
    * LLM: add bloom kv cache support
    * fix style.

Kai Huang | 156af15d1e | 2023-09-20 20:03:07 +08:00
    Add NF3 (#9008)
    * add nf3
    * grammar

Kai Huang | 6981745fe4 | 2023-09-20 19:59:19 +08:00
    Optimize kv_cache for gpt-neox model family (#9015)
    * override gptneox
    * style
    * move to utils
    * revert

JinBridge | 48b503c630 | 2023-09-20 15:52:56 +08:00
    LLM: add example of aquila (#9006)
    * LLM: add example of aquila
    * LLM: replace AquilaChat with Aquila
    * LLM: shorten prompt of aquila example

Cengguang Zhang | 735a17f7b4 | 2023-09-20 15:36:30 +08:00
    LLM: add kv cache to falcon family. (#8995)
    * add kv cache to falcon family.
    * fix: import error.
    * refactor
    * update comments.
    * add two version falcon attention forward.
    * fix
    * fix.
    * fix.
    * fix.
    * fix style.
    * fix style.

Ruonan Wang | 94a7f8917b | 2023-09-20 15:30:14 +08:00
    LLM: fix optimized kv cache for baichuan-13b (#9009)
    * fix baichuan 13b
    * fix style
    * fix
    * fix style