1566 commits

Author SHA1 Message Date
Wang
4aee952b10 Merge remote-tracking branch 'upstream/main' 2023-10-07 09:53:52 +08:00
ZehuaCao
b773d67dd4 Add Kubernetes support for BigDL-LLM-serving CPU. (#9071) 2023-10-07 09:37:48 +08:00
Yang Wang
36dd4afd61 Fix llama when rope scaling is not None (#9086)
* Fix llama when rope scaling is not None

* fix style

* fix style
2023-10-06 13:27:37 -07:00
Yang Wang
fcb1c618a0 using bigdl-llm fused rope for llama (#9066)
* optimize llama xpu rope

* fix bug

* fix style

* refine append cache

* remove check

* do not cache cos sin

* remove unnecessary changes

* clean up

* fix style

* check for training
2023-10-06 09:57:29 -07:00
Jason Dai
50044640c0 Update README.md (#9085) 2023-10-06 21:54:18 +08:00
Jiao Wang
aefa5a5bfe Qwen kv cache (#9079)
* qwen and aquila

* update

* update

* style
2023-10-05 11:59:17 -07:00
Jiao Wang
d5ca1f32b6 Aquila KV cache optimization (#9080)
* update

* update

* style
2023-10-05 11:10:57 -07:00
Jason Dai
7506100bd5 Update readme (#9084) 2023-10-05 16:54:09 +08:00
Yang Wang
88565c76f6 add export merged model example (#9018)
* add export merged model example

* add sources

* add script

* fix style
2023-10-04 21:18:52 -07:00
Yang Wang
0cd8f1c79c Use ipex fused rms norm for llama (#9081)
* also apply rmsnorm

* fix cpu
2023-10-04 21:04:55 -07:00
Cengguang Zhang
fb883100e7 LLM: support chatglm-18b convert attention forward in benchmark scripts. (#9072)
* add chatglm-18b convert.

* fix if statement.

* fix
2023-09-28 14:04:52 +08:00
Yishuo Wang
6de2189e90 [LLM] fix chatglm main choice (#9073) 2023-09-28 11:23:37 +08:00
binbin Deng
760183bac6 LLM: update key feature and installation page of document (#9068) 2023-09-27 15:44:34 +08:00
Lilac09
c91b2bd574 fix:modify indentation (#9070)
* modify Dockerfile

* add README.md

* add README.md

* Modify Dockerfile

* Add bigdl inference cpu image build

* Add bigdl llm cpu image build

* Add bigdl llm cpu image build

* Add bigdl llm cpu image build

* Modify Dockerfile

* Add bigdl inference cpu image build

* Add bigdl inference cpu image build

* Add bigdl llm xpu image build

* manually build

* recover file

* manually build

* recover file

* modify indentation
2023-09-27 14:53:52 +08:00
Wang
ddcd9e7d0a modify indentation 2023-09-27 14:49:58 +08:00
Wang
9935772f24 recover file 2023-09-26 15:50:51 +08:00
Wang
efc2158215 manually build 2023-09-26 15:47:47 +08:00
Wang
fdc0e838df Merge remote-tracking branch 'upstream/main' 2023-09-26 15:45:31 +08:00
Wang
b17e536a1b recover file 2023-09-26 15:45:03 +08:00
Cengguang Zhang
ad62c58b33 LLM: Enable jemalloc in benchmark scripts. (#9058)
* enable jemalloc.

* fix readme.
2023-09-26 15:37:49 +08:00
Wang
9e03c5c7fc manually build 2023-09-26 15:28:01 +08:00
Wang
2dc76dc358 manually build 2023-09-26 15:15:15 +08:00
Lilac09
ecee02b34d Add bigdl llm xpu image build (#9062)
* modify Dockerfile

* add README.md

* add README.md

* Modify Dockerfile

* Add bigdl inference cpu image build

* Add bigdl llm cpu image build

* Add bigdl llm cpu image build

* Add bigdl llm cpu image build

* Modify Dockerfile

* Add bigdl inference cpu image build

* Add bigdl inference cpu image build

* Add bigdl llm xpu image build
2023-09-26 14:29:03 +08:00
Wang
d0ac0941a2 Add bigdl llm xpu image build 2023-09-26 14:25:10 +08:00
Wang
781bc5bc8d Add bigdl inference cpu image build 2023-09-26 14:07:36 +08:00
Wang
390c90551e Add bigdl inference cpu image build 2023-09-26 14:03:55 +08:00
Wang
7a69bee8d0 Modify Dockerfile 2023-09-26 13:58:42 +08:00
Wang
47996c29e4 Merge remote-tracking branch 'upstream/main' 2023-09-26 13:56:27 +08:00
Lilac09
9ac950fa52 Add bigdl llm cpu image build (#9047)
* modify Dockerfile

* add README.md

* add README.md

* Modify Dockerfile

* Add bigdl inference cpu image build

* Add bigdl llm cpu image build

* Add bigdl llm cpu image build

* Add bigdl llm cpu image build
2023-09-26 13:22:11 +08:00
Wang
a50c11d326 Modify Dockerfile 2023-09-26 11:19:13 +08:00
Ziteng Zhang
a717352c59 Replace Llama 7b to Llama2-7b in README.md (#9055)
* Replace Llama 7b with Llama2-7b in README.md

Need to replace the base model to Llama2-7b as we are operating on Llama2 here.

* Replace Llama 7b to Llama2-7b in README.md

a llama 7b in the 1st line is missed

* Update architecture graph

---------

Co-authored-by: Heyang Sun <60865256+Uxito-Ada@users.noreply.github.com>
2023-09-26 09:56:46 +08:00
Guancheng Fu
cc84ed70b3 Create serving images (#9048)
* Finished & Tested

* Install latest pip from base images

* Add blank line

* Delete unused comment

* fix typos
2023-09-25 15:51:45 +08:00
Wang
847af63e8e Add bigdl llm cpu image build 2023-09-25 15:33:39 +08:00
Wang
7f2d2a5238 Add bigdl llm cpu image build 2023-09-25 15:14:23 +08:00
Wang
9cae4600da Add bigdl llm cpu image build 2023-09-25 14:45:30 +08:00
Wang
ceed895c31 Add bigdl inference cpu image build 2023-09-25 14:31:43 +08:00
Cengguang Zhang
b4a1266ef0 [WIP] LLM: add kv cache support for internlm. (#9036)
* LLM: add kv cache support for internlm

* add internlm apply_rotary_pos_emb

* fix.

* fix style.
2023-09-25 14:16:59 +08:00
Wang
fc8bf6b0d5 Modify Dockerfile 2023-09-25 14:05:08 +08:00
Wang
e8f436453d Merge remote-tracking branch 'upstream/main' 2023-09-25 13:59:19 +08:00
Ruonan Wang
975da86e00 LLM: fix gptneox kv cache (#9044) 2023-09-25 13:03:57 +08:00
Heyang Sun
4b843d1dbf change lora-model output behavior on k8s (#9038)
Co-authored-by: leonardozcm <leonardo1997zcm@gmail.com>
2023-09-25 09:28:44 +08:00
Cengguang Zhang
26213a5829 LLM: Change benchmark bf16 load format. (#9035)
* LLM: Change benchmark bf16 load format.

* comment on bf16 chatglm.

* fix.
2023-09-22 17:38:38 +08:00
JinBridge
023555fb1f LLM: Add one-click installer for Windows (#8999)
* LLM: init one-click installer for windows

* LLM: fix typo in one-click installer readme

* LLM: one-click installer try except logic

* LLM: one-click installer add dependency

* LLM: one-click installer adjust README.md

* LLM: one-click installer split README and add zip compress in setup.bat

* LLM: one-click installer verified internlm and llama2 and replace gif

* LLM: remove one-click installer images

* LLM: finetune the one-click installer README.md

* LLM: fix typo in one-click installer README.md

* LLM: rename one-click installer to portable executable

* LLM: rename other places to portable executable

* LLM: rename the zip filename to executable

* LLM: update .gitignore

* LLM: add colorama to setup.bat
2023-09-22 14:46:30 +08:00
Jiao Wang
028a6d9383 MPT model optimize for long sequence (#9020)
* mpt_long_seq

* update

* update

* update

* style

* style2

* update
2023-09-21 21:27:23 -07:00
Lilac09
9126abdf9b add README.md for bigdl-llm-cpu image (#9026)
* modify Dockerfile

* add README.md

* add README.md
2023-09-22 09:03:57 +08:00
Ruonan Wang
b943d73844 LLM: refactor kv cache (#9030)
* refactor utils

* meet code review; update all models

* small fix
2023-09-21 21:28:03 +08:00
Cengguang Zhang
868511cf02 LLM: fix kv cache issue of bloom and falcon. (#9029) 2023-09-21 18:12:20 +08:00
Ruonan Wang
bf51ec40b2 LLM: Fix empty cache (#9024)
* fix

* fix

* update example
2023-09-21 17:16:07 +08:00
Wang
f985068491 add README.md 2023-09-21 16:58:37 +08:00
Yina Chen
714884414e fix error (#9025) 2023-09-21 16:42:11 +08:00