Jiao Wang
33b9e7744d
fix dimension ( #10097 )
2024-02-05 15:07:38 -08:00
SONG Ge
4b02ff188b
[WebUI] Add prompt format and stopping words for Qwen ( #10066 )
...
* add prompt format and stopping_words for qwen model
* performance optimization
* optimize
* update
* meet comments
2024-02-05 18:23:13 +08:00
WeiguangHan
0aecd8637b
LLM: small fix for the html script ( #10094 )
2024-02-05 17:27:34 +08:00
Zhicun
7d2be7994f
add phixtral and optimize phi-moe ( #10052 )
2024-02-05 11:12:47 +08:00
Zhicun
676d6923f2
LLM: modify transformersembeddings.embed() in langchain ( #10051 )
2024-02-05 10:42:10 +08:00
Jin Qiao
ad050107b3
LLM: fix mpt load_low_bit issue ( #10075 )
...
* fix
* retry
* retry
2024-02-05 10:17:07 +08:00
Lilac09
f8dcaff7f4
use default python ( #10070 )
2024-02-05 09:06:59 +08:00
SONG Ge
9050991e4e
temporarily fix gradio check issue ( #10082 )
2024-02-04 16:46:29 +08:00
WeiguangHan
c2e562d037
LLM: add batch_size to the csv and html ( #10080 )
...
* LLM: add batch_size to the csv and html
* small fix
2024-02-04 16:35:44 +08:00
Yuwen Hu
136f042f84
[LLM] Make sure python 310-311 tests only happen for nightly tests ( #10081 )
...
* Make sure python 310-311 tests only happen for nightly tests
* Use default runner for setup-python-version
* Small fixes
2024-02-04 16:14:48 +08:00
binbin Deng
7e49fbc5dd
LLM: make finetuning examples more common for other models ( #10078 )
2024-02-04 16:03:52 +08:00
Heyang Sun
90f004b80b
remove benchmarkwrapper from deepspeed example ( #10079 )
2024-02-04 15:42:15 +08:00
Jin Qiao
f9a468a2c7
LLM: conditionally choose python version for unit test ( #10062 )
...
* conditional python version
* retry
* temporarily skip llm-cpp-build
* apply on llm-unit-test-on-arc
* fix
* add llm-cpp-build dependency
* use GITHUB_OUTPUT instead of set-output
* check nightly build
* fix quote
* fix quote
* add llm-cpp-build dependency
* test nightly build
* test pull request
2024-02-04 13:37:34 +08:00
Ruonan Wang
8e33cb0f38
LLM: support speecht5_tts ( #10077 )
...
* support speecht5_tts
* fix
2024-02-04 13:26:42 +08:00
yb-peng
738275761d
In llm-harness-evaluation, add new models and change schedule to nightly ( #10072 )
...
* add new models and change schedule to nightly
* correct syntax error
* modify env set up and job
* change label and schedule time
* change schedule time
* change label
2024-02-04 13:12:09 +08:00
Shaojun Liu
698f84648c
split stable version tests ( #10076 )
2024-02-04 11:08:12 +08:00
ivy-lv11
428b7105f6
Add HF and PyTorch example InternLM2 ( #10061 )
2024-02-04 10:25:55 +08:00
binbin Deng
91cf9d41d0
LLM: add solutions of some frequently asked questions ( #10068 )
2024-02-04 09:28:20 +08:00
Yina Chen
77be19bb97
LLM: Support gpt-j in speculative decoding ( #10067 )
...
* gptj
* support gptj in speculative decoding
* fix
* update readme
* small fix
2024-02-02 14:54:55 +08:00
Jason Dai
2927c77d7f
Update readme ( #10071 )
2024-02-01 20:40:20 -08:00
SONG Ge
19183ef476
[WebUI] Reset bigdl-llm loader options with default value ( #10064 )
...
* reset bigdl-llm loader options with default value
* remove options which may be complex for naive users
2024-02-01 15:45:39 +08:00
Xin Qiu
6e0f1a1e92
use apply_rotary_pos_emb_cache_freq_xpu in mixtral ( #10060 )
...
* use apply_rotary_pos_emb_cache_freq_xpu in mixtral
* fix style
2024-02-01 15:40:49 +08:00
binbin Deng
aae20d728e
LLM: Add initial DPO finetuning example ( #10021 )
2024-02-01 14:18:08 +08:00
Heyang Sun
601024f418
Mistral CPU example of speculative decoding ( #10024 )
...
* Mistral CPU example of speculative decoding
* update transformers version
* update example
* Update README.md
2024-02-01 10:52:32 +08:00
Heyang Sun
968e70544d
Enable IPEX Mistral in Speculative ( #10059 )
2024-02-01 10:48:16 +08:00
Yina Chen
3ca03d4e97
Add deepmind sample into bigdl-llm speculative decoding ( #10041 )
...
* migrate deepmind sample
* update
* meet comments
* fix style
* fix style
2024-02-01 09:57:02 +08:00
Lilac09
72e67eedbb
Add speculative support in docker ( #10058 )
...
* add speculative environment
* add speculative environment
* add speculative environment
2024-02-01 09:53:53 +08:00
binbin Deng
4b92235bdb
LLM: add initial FAQ page ( #10055 )
2024-02-01 09:43:39 +08:00
WeiguangHan
d2d3f6b091
LLM: ensure the result of daily arc perf test ( #10016 )
...
* ensure the result of daily arc perf test
* small fix
* small fix
* small fix
* small fix
* small fix
* small fix
* small fix
* small fix
* small fix
* small fix
* concat more csvs
* small fix
* revert some files
2024-01-31 18:26:21 +08:00
WeiguangHan
9724939499
temporarily disable bloom 2k input ( #10056 )
2024-01-31 17:49:12 +08:00
Jin Qiao
8c8fc148c9
LLM: add rwkv 5 ( #10048 )
2024-01-31 15:54:55 +08:00
WeiguangHan
a9018a0e95
LLM: modify the GPU example for redpajama model ( #10044 )
...
* LLM: modify the GPU example for redpajama model
* small fix
2024-01-31 14:32:08 +08:00
Yuxuan Xia
95636cad97
Add AutoGen CPU and XPU Example ( #9980 )
...
* Add AutoGen example
* Adjust AutoGen README
* Adjust AutoGen README
* Change AutoGen README
* Change AutoGen README
2024-01-31 11:31:18 +08:00
Heyang Sun
7284edd9b7
Vicuna CPU example of speculative decoding ( #10018 )
...
* Vicuna CPU example of speculative decoding
* Update speculative.py
* Update README.md
* add requirements for ipex
* Update README.md
* Update speculative.py
* Update speculative.py
2024-01-31 11:23:50 +08:00
Wang, Jian4
7e5cd42a5c
LLM: Update optimize ipex bf16 ( #10038 )
...
* use 4.35.2 and remove
* update rmsnorm
* remove
* remove
* update python style
* update
* update python style
* update
* fix style
* update
* remove whitespace
2024-01-31 10:59:55 +08:00
Wang, Jian4
fb53b994f8
LLM: Add llama ipex optimized ( #10046 )
...
* init ipex
* remove padding
2024-01-31 10:38:46 +08:00
Ruonan Wang
3685622f29
LLM: fix llama 4.36 forward ( #10047 )
2024-01-31 10:31:10 +08:00
Yishuo Wang
53a5140eff
Optimize rwkv v5 rest token again ( #10043 )
2024-01-31 10:01:11 +08:00
Heyang Sun
b1ff28ceb6
LLama2 CPU example of speculative decoding ( #9962 )
...
* LLama2 example of speculative decoding
* add docs
* Update speculative.py
* Update README.md
* Update README.md
* Update speculative.py
* remove autocast
2024-01-31 09:45:20 +08:00
WeiguangHan
0fcad6ce14
LLM: add gpu example for redpajama models ( #10040 )
2024-01-30 19:39:28 +08:00
Shaojun Liu
2a0d65a009
Bump aiohttp from 3.9.0 to 3.9.2 to resolve security issues ( #10030 )
...
* Bump aiohttp from 3.9.0 to 3.9.2 to resolve security issues
* update url
* remove aiohttp version from requirements.txt
* revert aiohttp to 3.9.0
* trigger tests
* revert
2024-01-30 19:36:02 +08:00
Yuwen Hu
863c3f94d0
[LLM] Change nightly perf to install from pypi ( #10027 )
...
* Change to install from pypi and have a check to make sure the installed bigdl-llm version is as expected
* Make sure result date is the same as tested bigdl-llm version
* Small fixes
* Small fix
* Small fixes
* Small fix
* Small fixes
* Small updates
2024-01-30 18:15:44 +08:00
Xiangyu Tian
9978089796
[LLM] Enable BIGDL_OPT_IPEX in speculative baichuan2 13b example ( #10028 )
...
Enable BIGDL_OPT_IPEX in speculative baichuan2 13b example
2024-01-30 17:11:37 +08:00
Ovo233
226f398c2a
fix ppl test errors ( #10036 )
2024-01-30 16:26:21 +08:00
Xin Qiu
13e61738c5
hide detail memory for each token in benchmark_utils.py ( #10037 )
2024-01-30 16:04:17 +08:00
Ruonan Wang
6b63ba23d1
LLM: add full module name during convert ( #10035 )
2024-01-30 14:43:07 +08:00
Yishuo Wang
7dfa6dbe46
add rwkv time shift optimization ( #10032 )
2024-01-30 14:10:55 +08:00
Xiangyu Tian
f57d0fda8b
[LLM] Use IPEX Optimization for Self Speculative Decoding ( #9997 )
...
Use IPEX Optimization for Self Speculative Decoding
2024-01-30 09:11:06 +08:00
Ruonan Wang
ccf8f613fb
LLM: update fp16 Linear on ARC/FLEX ( #10023 )
2024-01-29 18:25:26 +08:00
Yuwen Hu
a5c9dfdf91
[LLM] Main readme gpu installation related updates ( #9868 )
...
* Main readme gpu installation related updates
* Small updates for readthedocs main page
2024-01-29 16:33:27 +08:00