Qiyuan Gong | f2e923b3ca | Axolotl v0.4.0 support (#10773) | 2024-04-17 09:49:11 +08:00
  * Add Axolotl 0.4.0, remove legacy 0.3.0 support.
  * Replace is_torch_bf16_gpu_available.
  * Add HF_HUB_OFFLINE=1.
  * Move transformers out of requirements.
  * Refine README and qlora.yml.

Heyang Sun | 26cae0a39c | Update FLEX in Deepspeed README (#10774) | 2024-04-17 09:28:24 +08:00
  * Update FLEX in Deepspeed README.
  * Update README.md.

Wenjing Margaret Mao | c41730e024 | Fix 'ppl_result does not exist' issue and delete unused code (#10767) | 2024-04-16 18:11:56 +08:00
  * Fix 'ppl_result does not exist' issue, delete unused code.
  * Delete nonzero_min function.
  Co-authored-by: jenniew <jenniewang123@gmail.com>

Yina Chen | 899d392e2f | Support prompt lookup in ipex-llm (#10768) | 2024-04-16 16:52:38 +08:00
  * Lookup init.
  * Add lookup.
  * Fix style.
  * Remove redundant code.
  * Change param name.
  * Fix style.

Qiyuan Gong | d30b22a81b | Refine axolotl 0.3.0 documents and links (#10764) | 2024-04-16 14:47:45 +08:00
  * Refine axolotl 0.3 based on comments.
  * Rename requirements to requirement-xpu.
  * Add comments for paged_adamw_32bit.
  * Change lora_r from 8 to 16.

ZehuaCao | 599a88db53 | Add deepspeed-autoTP-Fastapi serving (#10748) | 2024-04-16 14:03:23 +08:00
  * Add deepspeed-autoTP-Fastapi serving.
  * Add README.
  * Add license.
  * Update.
  * Fix.

binbin Deng | 0a62933d36 | LLM: fix qwen AutoTP (#10766) | 2024-04-16 09:56:17 +08:00

Cengguang Zhang | 3e2662c87e | LLM: fix type of KV_CACHE_ALLOC_BLOCK_LENGTH read from env (#10771) | 2024-04-16 09:32:30 +08:00

Jin Qiao | 73a67804a4 | GPU configuration update for examples (Windows pip installer, etc.) (#10762) | 2024-04-15 17:42:52 +08:00
  * Renew chatglm3-6b GPU example README.
  * Fix for comments.
  * Apply on HF-Transformers-AutoModels.
  * Apply on PyTorch-Models.

yb-peng | b5209d3ec1 | Update example/GPU/PyTorch-Models/Model/llava/README.md (#10757) | 2024-04-15 13:01:37 +08:00
  * Update example/GPU/PyTorch-Models/Model/llava/README.md.
  * Update README.md: fix path in Windows installation.

binbin Deng | 3d561b60ac | LLM: add enable_xetla parameter for optimize_model API (#10753) | 2024-04-15 12:18:25 +08:00

Jiao Wang | a9a6b6b7af | Fix baichuan-13b issue on portable zip under transformers 4.36 (#10746) | 2024-04-12 16:27:01 -07:00
  * Fix baichuan-13b issue.
  * Update.

Jiao Wang | 9e668a5bf0 | Fix internlm-chat-7b-8k repo name in examples (#10747) | 2024-04-12 10:15:48 -07:00

binbin Deng | c3fc8f4b90 | LLM: add batch size limitation for llama softmax upcast to fp32 (#10752) | 2024-04-12 15:40:25 +08:00

hxsz1997 | 0d518aab8d | Merge pull request #10697 from MargarettMao/ceval | 2024-04-12 14:37:47 +08:00
  Combine English and Chinese, remove NaN.

jenniew | dd0d2df5af | Change fp16.csv mistral-7b-v0.1 into Mistral-7B-v0.1 | 2024-04-12 14:28:46 +08:00

jenniew | 7309f1ddf9 | Modify Typos | 2024-04-12 14:23:13 +08:00

jenniew | cb594e1fc5 | Modify Typos | 2024-04-12 14:22:09 +08:00

jenniew | 382c18e600 | Modify Typos | 2024-04-12 14:15:48 +08:00

jenniew | 1a360823ce | Modify Typos | 2024-04-12 14:13:21 +08:00

jenniew | cdbb1de972 | Mark Color Modification | 2024-04-12 14:00:50 +08:00

jenniew | 9bbfcaf736 | Mark Color Modification | 2024-04-12 13:30:16 +08:00

jenniew | bb34c6e325 | Mark Color Modification | 2024-04-12 13:26:36 +08:00

Yishuo Wang | 8086554d33 | Use new fp16 SDP in llama and mistral (#10734) | 2024-04-12 10:49:02 +08:00

Yang Wang | 019293e1b9 | Fuse MOE indexes computation (#10716) | 2024-04-11 10:12:55 -07:00
  * Try MOE.
  * Use C++ CPU to compute indexes.
  * Fix style.

jenniew | b151a9b672 | Edit csv_to_html to combine English and Chinese | 2024-04-11 17:35:36 +08:00

binbin Deng | 70ed9397f9 | LLM: fix AttributeError of FP16Linear (#10740) | 2024-04-11 17:03:56 +08:00

Keyan (Kyrie) Zhang | 1256a2cc4e | Add chatglm3 long input example (#10739) | 2024-04-11 16:33:43 +08:00
  * Add long context input example for chatglm3.
  * Small fix.

hxsz1997 | fd473ddb1b | Merge pull request #10730 from MargarettMao/MargarettMao-parent_folder | 2024-04-11 15:45:24 +08:00
  Edit ppl update_HTML_parent_folder.

Qiyuan Gong | 2d64630757 | Remove transformers version in axolotl example (#10736) | 2024-04-11 14:02:31 +08:00
  * Remove transformers version in axolotl requirements.txt.

yb-peng | 2685c41318 | Modify all-in-one benchmark (#10726) | 2024-04-11 13:38:50 +08:00
  * Update 8192 prompt in all-in-one.
  * Add cpu_embedding param for Linux API.
  * Update run.py.
  * Update README.md.

Xiangyu Tian | 301504aa8d | Fix transformers version warning (#10732) | 2024-04-11 13:12:49 +08:00

Wenjing Margaret Mao | 9bec233e4d | Delete python/llm/test/benchmark/perplexity/update_html_in_parent_folder.py | 2024-04-11 07:21:12 +08:00
  Delete due to repetition.

Cengguang Zhang | 4b024b7aac | LLM: optimize chatglm2 8k input (#10723) | 2024-04-10 16:59:06 +08:00
  * LLM: optimize chatglm2 8k input.
  * Rename.

Yuxuan Xia | cd22cb8257 | Update Env Check Script (#10709) | 2024-04-10 15:06:00 +08:00
  * Update env check bash file.
  * Update env-check.

Shaojun Liu | 29bf28bd6f | Upgrade python to 3.11 in Docker Image (#10718) | 2024-04-10 14:41:27 +08:00
  * Install python 3.11 for cpu-inference docker image.
  * Update xpu-inference dockerfile.
  * Update cpu-serving image.
  * Update qlora image.
  * Update lora image.
  * Update document.

Qiyuan Gong | b727767f00 | Add axolotl v0.3.0 with ipex-llm on Intel GPU (#10717) | 2024-04-10 14:38:29 +08:00
  * Add axolotl v0.3.0 support on Intel GPU.
  * Add finetune example on llama-2-7B with Alpaca dataset.

Wang, Jian4 | c9e6d42ad1 | LLM: Fix chatglm3-6b-32k error (#10719) | 2024-04-10 11:24:06 +08:00
  * Fix chatglm3-6b-32k.
  * Update style.

Keyan (Kyrie) Zhang | 585c174e92 | Read the value of KV_CACHE_ALLOC_BLOCK_LENGTH from the environment variables (#10707) | 2024-04-10 10:48:46 +08:00
  * Read the value of KV_CACHE_ALLOC_BLOCK_LENGTH from the environment variables.
  * Fix style.

Jiao Wang | d1eaea509f | Update chatglm readme (#10659) | 2024-04-09 14:24:46 -07:00

Jiao Wang | 878a97077b | Fix llava example to support transformers 4.36 (#10614) | 2024-04-09 13:47:07 -07:00
  * Fix llava example.
  * Update.

Jiao Wang | 1e817926ba | Fix low memory generation example issue in transformers 4.36 (#10702) | 2024-04-09 09:56:52 -07:00
  * Update cache in low memory generate.
  * Update.

Yuwen Hu | 97db2492c8 | Update setup.py for bigdl-core-xe-esimd-21 on Windows (#10705) | 2024-04-09 18:21:21 +08:00
  * Support bigdl-core-xe-esimd-21 for Windows in setup.py.
  * Update setup-llm-env accordingly.

Zhicun | b4147a97bb | Fix dtype mismatch error (#10609) | 2024-04-09 17:50:33 +08:00
  * Fix llama.
  * Fix.
  * Fix code style.
  * Add torch type in model.py.
  Co-authored-by: arda <arda@arda-arc19.sh.intel.com>

Shaojun Liu | f37a1f2a81 | Upgrade to python 3.11 (#10711) | 2024-04-09 17:41:17 +08:00
  * Create conda env with python 3.11.
  * Recommend to use Python 3.11.
  * Update.

Yishuo Wang | 8f45e22072 | Fix llama2 (#10710) | 2024-04-09 17:28:37 +08:00

Yishuo Wang | e438f941f2 | Disable rwkv5 fp16 (#10699) | 2024-04-09 16:42:11 +08:00

Cengguang Zhang | 6a32216269 | LLM: add llama2 8k input example (#10696) | 2024-04-09 16:02:37 +08:00
  * LLM: add llama2-32K example.
  * Refactor name.
  * Fix comments.
  * Add IPEX_LLM_LOW_MEM notes and update sample output.

Wenjing Margaret Mao | 289cc99cd6 | Update README.md (#10700) | 2024-04-09 16:01:12 +08:00
  Edit "summarize the results".

Wenjing Margaret Mao | d3116de0db | Update README.md (#10701) | 2024-04-09 15:50:25 +08:00
  Edit "summarize the results".