Yuwen Hu | 8fdc36c140 | 2024-11-21 16:21:35 +08:00
Optimize with new batch kernel when batch_size=1 on LNL (#12419)
* Add use batch kernel condition for LNL
* Fix for other device judgement
* Fix based on comment

Heyang Sun | 70c828b87c | 2024-08-13 16:15:29 +08:00
deepspeed zero3 QLoRA finetuning (#11625)
* deepspeed zero3 QLoRA finetuning
* Update convert.py
* Update low_bit_linear.py
* Update utils.py
* Update qlora_finetune_llama2_13b_arch_2_card.sh
* Update alpaca_qlora_finetuning.py
* Update deepspeed_zero3.json
* fix style
* Update model.py

Ruonan Wang | 54bf3a23a6 | 2024-07-31 11:39:58 +08:00
add fallback for unsupported k-quants (#11691)
* add fallback
* fix style
* fix

Ruonan Wang | f1156e6b20 | 2024-05-17 14:30:09 +08:00
support gguf_q4k_m / gguf_q4k_s (#10887)
* initial commit
* update
* fix style
* add gguf_q4k_s
* update comment
* fix

Xin Qiu | e764f9b1b1 | 2024-04-18 10:03:53 +08:00
Disable fast fused rope on UHD (#10780)
* use decoding fast path
* update
* cleanup

Yina Chen | 766fe45222 | 2024-04-17 11:27:35 +08:00
Fix spec error caused by lookup pr (#10777)
* Fix spec error
* remove
* fix style

Qiyuan Gong | f2e923b3ca | 2024-04-17 09:49:11 +08:00
Axolotl v0.4.0 support (#10773)
* Add Axolotl 0.4.0, remove legacy 0.3.0 support
* Replace is_torch_bf16_gpu_available
* Add HF_HUB_OFFLINE=1
* Move transformers out of requirements
* Refine README and qlora.yml

Ruonan Wang | bfc1caa5e5 | 2024-04-01 13:13:13 +08:00
LLM: support iq1s for llama2-70b-hf (#10596)

Ruonan Wang | 0136fad1d4 | 2024-03-29 09:43:55 +08:00
LLM: support iq1_s (#10564)
* init version
* update utils
* remove unused code

Wang, Jian4 | 9df70d95eb | 2024-03-22 15:41:21 +08:00
Refactor bigdl.llm to ipex_llm (#24)
* Rename bigdl/llm to ipex_llm
* rm python/llm/src/bigdl
* from bigdl.llm to from ipex_llm