Heyang Sun
581ebf6104
GaLore Finetuning Example ( #10722 )
...
* GaLore Finetuning Example
* Update README.md
* Update README.md
* change data to HuggingFaceH4/helpful_instructions
* Update README.md
* Update README.md
* shrink train size and delete cache before starting training to save memory
* Update README.md
* Update galore_finetuning.py
* change model to llama2 3b
* Update README.md
2024-04-18 13:47:41 +08:00
Yang Wang
952e517db9
use config rope_theta ( #10787 )
...
* use config rope_theta
* fix style
2024-04-17 20:39:11 -07:00
Guancheng Fu
31ea2f9a9f
Fix wrong output for Llama models on CPU ( #10742 )
2024-04-18 11:07:27 +08:00
Xin Qiu
e764f9b1b1
Disable fast fused rope on UHD ( #10780 )
...
* use decoding fast path
* update
* update
* cleanup
2024-04-18 10:03:53 +08:00
Yina Chen
ea5b373a97
Add lookahead GPU example ( #10785 )
...
* Add lookahead example
* fix style & attn mask
* fix typo
* address comments
2024-04-17 17:41:55 +08:00
Wang, Jian4
a20271ffe4
LLM: Fix yi-6b fp16 error on pvc ( #10781 )
...
* update for yi fp16
* update
* update
2024-04-17 16:49:59 +08:00
ZehuaCao
0646e2c062
Fix no_attr error caused by short prompts in IPEX_CPU speculative decoding ( #10783 )
2024-04-17 16:19:57 +08:00
Cengguang Zhang
7ec82c6042
LLM: add README.md for Long-Context examples. ( #10765 )
...
* LLM: add readme to long-context examples.
* add precision.
* update wording.
* add GPU type.
* add Long-Context example to GPU examples.
* fix comments.
* update max input length.
* update max length.
* add output length.
* fix wording.
2024-04-17 15:34:59 +08:00
Yina Chen
766fe45222
Fix spec error caused by lookup pr ( #10777 )
...
* Fix spec error
* remove
* fix style
2024-04-17 11:27:35 +08:00
Qiyuan Gong
9e5069437f
Fix gradio version in axolotl example ( #10776 )
...
* Change to gradio>=4.19.2
2024-04-17 10:23:43 +08:00
Qiyuan Gong
f2e923b3ca
Axolotl v0.4.0 support ( #10773 )
...
* Add Axolotl 0.4.0, remove legacy 0.3.0 support.
* replace is_torch_bf16_gpu_available
* Add HF_HUB_OFFLINE=1
* Move transformers out of requirement
* Refine readme and qlora.yml
2024-04-17 09:49:11 +08:00
Heyang Sun
26cae0a39c
Update FLEX in Deepspeed README ( #10774 )
...
* Update FLEX in Deepspeed README
* Update README.md
2024-04-17 09:28:24 +08:00
Wenjing Margaret Mao
c41730e024
edit 'ppl_result does not exist' issue, delete useless code ( #10767 )
...
* edit ppl_result not exist issue, delete useless code
* delete nonzero_min function
---------
Co-authored-by: jenniew <jenniewang123@gmail.com>
2024-04-16 18:11:56 +08:00
Yina Chen
899d392e2f
Support prompt lookup in ipex-llm ( #10768 )
...
* lookup init
* add lookup
* fix style
* remove redundant code
* change param name
* fix style
2024-04-16 16:52:38 +08:00
Qiyuan Gong
d30b22a81b
Refine axolotl 0.3.0 documents and links ( #10764 )
...
* Refine axolotl 0.3 based on comments
* Rename requirements to requirement-xpu
* Add comments for paged_adamw_32bit
* change lora_r from 8 to 16
2024-04-16 14:47:45 +08:00
ZehuaCao
599a88db53
Add deepspeed-autoTP-Fastapi serving ( #10748 )
...
* add deepspeed-autoTP-Fastapi serving
* add readme
* add license
* update
* update
* fix
2024-04-16 14:03:23 +08:00
ZehuaCao
a7c12020b4
Add fastchat quickstart ( #10688 )
...
* add fastchat quickstart
* update
* update
* update
2024-04-16 14:02:38 +08:00
Ruonan Wang
ea5e46c8cb
Small update of quickstart ( #10772 )
2024-04-16 10:46:58 +08:00
binbin Deng
0a62933d36
LLM: fix qwen AutoTP ( #10766 )
2024-04-16 09:56:17 +08:00
Cengguang Zhang
3e2662c87e
LLM: fix get env KV_CACHE_ALLOC_BLOCK_LENGTH type. ( #10771 )
2024-04-16 09:32:30 +08:00
Shaojun Liu
7297036c03
upgrade python ( #10769 )
2024-04-16 09:28:10 +08:00
Yuwen Hu
1abd77507e
Small update for GPU configuration related doc ( #10770 )
...
* Small doc fix for dGPU type name
* Further fixes
* Further fix
* Small fix
2024-04-15 18:43:29 +08:00
Jin Qiao
73a67804a4
GPU configuration update for examples (windows pip installer, etc.) ( #10762 )
...
* renew chatglm3-6b gpu example readme
fix
fix
fix
* fix for comments
* fix
* fix
* fix
* fix
* fix
* apply on HF-Transformers-AutoModels
* apply on PyTorch-Models
* fix
* fix
2024-04-15 17:42:52 +08:00
Ruonan Wang
1bd431976d
Update ollama quickstart ( #10756 )
...
* update windows part
* update ollama quickstart
* update ollama
* update
* small fix
* update
* meet review
2024-04-15 16:37:55 +08:00
Kai Huang
47622c6a92
Fix missing export typo in linux quickstart ( #10750 )
2024-04-15 14:16:40 +08:00
Yuwen Hu
486df2764a
Update gpu configuration ( #10760 )
2024-04-15 13:27:15 +08:00
yb-peng
b5209d3ec1
Update example/GPU/PyTorch-Models/Model/llava/README.md ( #10757 )
...
* Update example/GPU/PyTorch-Models/Model/llava/README.md
* Update README.md
fix path in windows installation
2024-04-15 13:01:37 +08:00
binbin Deng
3d561b60ac
LLM: add enable_xetla parameter for optimize_model API ( #10753 )
2024-04-15 12:18:25 +08:00
Shaojun Liu
3590e1be83
revert python to 3.9 for finetune image ( #10758 )
2024-04-15 10:37:10 +08:00
Jiao Wang
a9a6b6b7af
Fix baichuan-13b issue on portable zip under transformers 4.36 ( #10746 )
...
* fix baichuan-13b issue
* update
* update
2024-04-12 16:27:01 -07:00
Jiao Wang
9e668a5bf0
Fix internlm-chat-7b-8k repo name in examples ( #10747 )
2024-04-12 10:15:48 -07:00
binbin Deng
c3fc8f4b90
LLM: add bs limitation for llama softmax upcast to fp32 ( #10752 )
2024-04-12 15:40:25 +08:00
hxsz1997
0d518aab8d
Merge pull request #10697 from MargarettMao/ceval
...
combine english and chinese, remove nan
2024-04-12 14:37:47 +08:00
jenniew
dd0d2df5af
Change fp16.csv mistral-7b-v0.1 into Mistral-7B-v0.1
2024-04-12 14:28:46 +08:00
jenniew
7309f1ddf9
Modify Typos
2024-04-12 14:23:13 +08:00
jenniew
cb594e1fc5
Modify Typos
2024-04-12 14:22:09 +08:00
jenniew
382c18e600
Modify Typos
2024-04-12 14:15:48 +08:00
jenniew
1a360823ce
Modify Typos
2024-04-12 14:13:21 +08:00
jenniew
cdbb1de972
Mark Color Modification
2024-04-12 14:00:50 +08:00
jenniew
9bbfcaf736
Mark Color Modification
2024-04-12 13:30:16 +08:00
jenniew
bb34c6e325
Mark Color Modification
2024-04-12 13:26:36 +08:00
Yishuo Wang
8086554d33
use new fp16 sdp in llama and mistral ( #10734 )
2024-04-12 10:49:02 +08:00
Yang Wang
019293e1b9
Fuse MOE indexes computation ( #10716 )
...
* try moe
* use c++ cpu to compute indexes
* fix style
2024-04-11 10:12:55 -07:00
jenniew
b151a9b672
edit csv_to_html to combine en & zh
2024-04-11 17:35:36 +08:00
binbin Deng
70ed9397f9
LLM: fix AttributeError of FP16Linear ( #10740 )
2024-04-11 17:03:56 +08:00
Keyan (Kyrie) Zhang
1256a2cc4e
Add chatglm3 long input example ( #10739 )
...
* Add long context input example for chatglm3
* Small fix
* Small fix
* Small fix
2024-04-11 16:33:43 +08:00
hxsz1997
fd473ddb1b
Merge pull request #10730 from MargarettMao/MargarettMao-parent_folder
...
Edit ppl update_HTML_parent_folder
2024-04-11 15:45:24 +08:00
Qiyuan Gong
2d64630757
Remove transformers version in axolotl example ( #10736 )
...
* Remove transformers version in axolotl requirements.txt
2024-04-11 14:02:31 +08:00
yb-peng
2685c41318
Modify all-in-one benchmark ( #10726 )
...
* Update 8192 prompt in all-in-one
* Add cpu_embedding param for linux api
* Update run.py
* Update README.md
2024-04-11 13:38:50 +08:00
Xiangyu Tian
301504aa8d
Fix transformers version warning ( #10732 )
2024-04-11 13:12:49 +08:00