binbin Deng | 995b0f119f | 2023-10-11 14:23:56 +08:00
LLM: update some gpu examples (#9136)

Ruonan Wang | 1c8d5da362 | 2023-10-11 13:39:39 +08:00
LLM: fix llama tokenizer for all-in-one benchmark (#9129)
* fix tokenizer for gpu benchmark
* fix ipex fp16
* meet code review
* fix

binbin Deng | 2ad67a18b1 | 2023-10-11 13:38:15 +08:00
LLM: add mistral examples (#9121)

Ruonan Wang | 1363e666fc | 2023-10-11 09:41:53 +08:00
LLM: update benchmark_util.py for beam search (#9126)
* update reorder_cache
* fix

Guoqiong Song | e8c5645067 | 2023-10-10 17:01:35 -07:00
add LLM example of aquila on GPU (#9056)
* aquila, dolly-v1, dolly-v2, vicuna

Yuwen Hu | dc70fc7b00 | 2023-10-10 19:32:17 +08:00
Update performance tests for dependency of bigdl-core-xe-esimd (#9124)

Lilac09 | 30e3c196f3 | 2023-10-10 16:40:52 +08:00
Merge pull request #9108 from Zhengjin-Wang/main
Add instruction for chat.py in bigdl-llm-cpu

Lilac09 | 1e78b0ac40 | 2023-10-10 15:53:17 +08:00
Optimize LoRA Docker by Shrinking Image Size (#9110)
* modify dockerfile
* modify dockerfile

Ruonan Wang | 388f688ef3 | 2023-10-10 15:02:48 +08:00
LLM: update setup.py to add bigdl-core-xe package (#9122)

Zhao Changmin | 1709beba5b | 2023-10-10 14:57:23 +08:00
LLM: Explicitly close pickle file pointer before removing temporary directory (#9120)
* fp close

Yuwen Hu | 0e09dd926b | 2023-10-10 13:24:18 +08:00
[LLM] Fix example test (#9118)
* Update llm example test link due to example layout change
* Add better change detect

Ruonan Wang | ad7d9231f5 | 2023-10-10 10:18:41 +08:00
LLM: add benchmark script for Max gpu and ipex fp16 gpu (#9112)
* add pvc bash
* meet code review
* rename to run-max-gpu.sh

Lilac09 | 6264381f2e | 2023-10-10 10:09:06 +08:00
Merge pull request #9117 from Zhengjin-Wang/manually_build
add llm-serving-xpu on github action

Zhengjin Wang | 0dbb3a283e | 2023-10-10 10:03:23 +08:00
amend manually_build

Zhengjin Wang | bb3bb46400 | 2023-10-10 09:48:58 +08:00
add llm-serving-xpu on github action

binbin Deng | e4d1457a70 | 2023-10-10 09:31:00 +08:00
LLM: improve transformers style API doc (#9113)

Yuwen Hu | 65212451cc | 2023-10-09 16:55:25 +08:00
[LLM] Small update to performance tests (#9106)
* small updates to llm performance tests regarding model handling
* Small fix

Zhao Changmin | edccfb2ed3 | 2023-10-09 15:49:15 +08:00
LLM: Check model device type (#9092)
* check model device

Heyang Sun | 2c0c9fecd0 | 2023-10-09 15:45:30 +08:00
refine LLM containers (#9109)

binbin Deng | 5e9962b60e | 2023-10-09 15:36:39 +08:00
LLM: update example layout (#9046)

Yina Chen | 4c4f8d1663 | 2023-10-09 15:09:37 +08:00
[LLM] Fix Arc falcon abnormal output issue (#9096)
* update
* update
* fix error & style
* fix style
* update train
* to input_seq_size

Wang | a1aefdb8f4 | 2023-10-09 13:36:29 +08:00
modify README

Wang | 3814abf95a | 2023-10-09 12:57:28 +08:00
add instruction for chat.py

Zhao Changmin | 548e4dd5fe | 2023-10-09 11:13:44 +08:00
LLM: Adapt transformers models for optimize model SL (#9022)
* LLM: Adapt transformers model for SL

Ruonan Wang | f64257a093 | 2023-10-09 11:05:17 +08:00
LLM: basic api support for esimd fp16 (#9067)
* basic api support for fp16
* fix style
* fix
* fix error and style
* fix style
* meet code review
* update based on comments

Wang | a42c25436e | 2023-10-09 10:55:18 +08:00
Merge remote-tracking branch 'upstream/main'

JIN Qiao | 65373d2a8b | 2023-10-09 10:51:19 +08:00
LLM: adjust portable zip content (#9054)
* LLM: adjust portable zip content
* LLM: adjust portable zip README

Guancheng Fu | df8df751c4 | 2023-10-09 09:56:09 +08:00
Modify readme for bigdl-llm-serving-cpu (#9105)

Heyang Sun | 2756f9c20d | 2023-10-08 11:04:20 +08:00
XPU QLoRA Container (#9082)
* XPU QLoRA Container
* fix apt issue
* refine

ZehuaCao | aad68100ae | 2023-10-08 10:13:51 +08:00
Add trusted-bigdl-llm-serving-tdx image. (#9093)
* add entrypoint in cpu serving
* kubernetes support for fastchat cpu serving
* Update Readme
* add image to manually_build action
* update manually_build.yml
* update README.md
* update manually_build.yaml
* update attestation_cli.py
* update manually_build.yml
* update Dockerfile
* rename
* update trusted-bigdl-llm-serving-tdx Dockerfile

Xin Qiu | b3e94a32d4 | 2023-10-08 09:23:28 +08:00
change log4error import (#9098)

Kai Huang | 78ea7ddb1c | 2023-10-07 16:27:46 +08:00
Combine apply_rotary_pos_emb for gpt-neox (#9074)

Heyang Sun | 0b40ef8261 | 2023-10-07 15:26:59 +08:00
separate trusted and native llm cpu finetune from lora (#9050)
* separate trusted-llm and bigdl from lora finetuning
* add k8s for trusted llm finetune
* refine
* refine
* rename cpu to tdx in trusted llm
* solve conflict
* fix typo
* resolve conflict
* Delete docker/llm/finetune/lora/README.md
* fix
Co-authored-by: Uxito-Ada <seusunheyang@foxmail.com>
Co-authored-by: leonardozcm <leonardo1997zcm@gmail.com>

Wang | 4aee952b10 | 2023-10-07 09:53:52 +08:00
Merge remote-tracking branch 'upstream/main'

ZehuaCao | b773d67dd4 | 2023-10-07 09:37:48 +08:00
Add Kubernetes support for BigDL-LLM-serving CPU. (#9071)

Yang Wang | 36dd4afd61 | 2023-10-06 13:27:37 -07:00
Fix llama when rope scaling is not None (#9086)
* Fix llama when rope scaling is not None
* fix style
* fix style

Yang Wang | fcb1c618a0 | 2023-10-06 09:57:29 -07:00
using bigdl-llm fused rope for llama (#9066)
* optimize llama xpu rope
* fix bug
* fix style
* refine append cache
* remove check
* do not cache cos sin
* remove unnecessary changes
* clean up
* fix style
* check for training

Jason Dai | 50044640c0 | 2023-10-06 21:54:18 +08:00
Update README.md (#9085)

Jiao Wang | aefa5a5bfe | 2023-10-05 11:59:17 -07:00
Qwen kv cache (#9079)
* qwen and aquila
* update
* update
* style

Jiao Wang | d5ca1f32b6 | 2023-10-05 11:10:57 -07:00
Aquila KV cache optimization (#9080)
* update
* update
* style

Jason Dai | 7506100bd5 | 2023-10-05 16:54:09 +08:00
Update readme (#9084)

Yang Wang | 88565c76f6 | 2023-10-04 21:18:52 -07:00
add export merged model example (#9018)
* add export merged model example
* add sources
* add script
* fix style

Yang Wang | 0cd8f1c79c | 2023-10-04 21:04:55 -07:00
Use ipex fused rms norm for llama (#9081)
* also apply rmsnorm
* fix cpu

Cengguang Zhang | fb883100e7 | 2023-09-28 14:04:52 +08:00
LLM: support chatglm-18b convert attention forward in benchmark scripts. (#9072)
* add chatglm-18b convert.
* fix if statement.
* fix

Yishuo Wang | 6de2189e90 | 2023-09-28 11:23:37 +08:00
[LLM] fix chatglm main choice (#9073)

binbin Deng | 760183bac6 | 2023-09-27 15:44:34 +08:00
LLM: update key feature and installation page of document (#9068)

Lilac09 | c91b2bd574 | 2023-09-27 14:53:52 +08:00
fix: modify indentation (#9070)
* modify Dockerfile
* add README.md
* add README.md
* Modify Dockerfile
* Add bigdl inference cpu image build
* Add bigdl llm cpu image build
* Add bigdl llm cpu image build
* Add bigdl llm cpu image build
* Modify Dockerfile
* Add bigdl inference cpu image build
* Add bigdl inference cpu image build
* Add bigdl llm xpu image build
* manually build
* recover file
* manually build
* recover file
* modify indentation

Wang | ddcd9e7d0a | 2023-09-27 14:49:58 +08:00
modify indentation

Wang | 9935772f24 | 2023-09-26 15:50:51 +08:00
recover file

Wang | efc2158215 | 2023-09-26 15:47:47 +08:00
manually build