| Name | Last commit message | Last commit date |
|------|---------------------|------------------|
| awq/ | Refactor bigdl.llm to ipex_llm (#24) | 2024-03-22 15:41:21 +08:00 |
| gguf/ | IPEX Duplicate importer V2 (#11310) | 2024-06-19 16:29:19 +08:00 |
| layers/ | Divide core-xe packages (#11131) | 2024-05-28 12:00:18 +08:00 |
| models/ | support new model (#12523) | 2024-12-11 13:41:15 +08:00 |
| npu_models/ | [NPU] fix transpose_value = False for NPU optimize_model=True (#12525) | 2024-12-11 15:51:39 +08:00 |
| npu_pipeline_model/ | [NPU] Fix minicpm-2B error (#12527) | 2024-12-11 16:49:32 +08:00 |
| __init__.py | Refactor fastapi-serving and add one card serving (#11581) | 2024-07-17 11:12:43 +08:00 |
| bmm.py | Divide core-xe packages (#11131) | 2024-05-28 12:00:18 +08:00 |
| convert.py | support new model (#12523) | 2024-12-11 13:41:15 +08:00 |
| convert_ipex.py | LLM: Fix bigdl_ipex_int8 warning (#10890) | 2024-04-26 11:18:44 +08:00 |
| embedding.py | add save_low_bit support for DiskEmbedding (#11621) | 2024-07-19 10:34:53 +08:00 |
| kv.py | llama 3.1/3.2 support compresskv (#12347) | 2024-11-06 17:33:43 +08:00 |
| lisa.py | LISA Finetuning Example (#10743) | 2024-04-18 13:48:10 +08:00 |
| load_config.yaml | Adding load_low_bit interface for ipex_llm_worker (#11000) | 2024-05-13 15:30:19 +08:00 |
| loader.py | Add half precision for fastchat models (#11130) | 2024-05-24 15:41:14 +08:00 |
| lookup.py | Optimize with new batch kernel when batch_size=1 on LNL (#12419) | 2024-11-21 16:21:35 +08:00 |
| low_bit_linear.py | Enable use_batch_forward Optimization on Battlemage GPU (#12516) | 2024-12-12 12:44:36 +08:00 |
| model.py | Patch sdpa check function in specific module attributes table (#12285) | 2024-10-29 18:41:09 +08:00 |
| modelling_bigdl.py | Remove chatglm_C Module to Eliminate LGPL Dependency (#11178) | 2024-05-31 17:03:11 +08:00 |
| npu_model.py | [NPU] Support glm-edge models (#12511) | 2024-12-09 14:06:27 +08:00 |
| patches.py | Patch sdpa check function in specific module attributes table (#12285) | 2024-10-29 18:41:09 +08:00 |
| pipeline_parallel.py | Add missing arguments in pipeline parallel generate method (#12142) | 2024-11-18 13:50:18 +08:00 |
| qlora.py | Upgrade Peft version to 0.10.0 for LLM finetune (#10886) | 2024-05-07 15:09:14 +08:00 |
| relora.py | Refactor bigdl.llm to ipex_llm (#24) | 2024-03-22 15:41:21 +08:00 |
| speculative.py | Support performance mode of GLM4 model (#12401) | 2024-11-18 18:46:52 +08:00 |
| streamer.py | [LLM]Reopen autotp generate_stream (#11120) | 2024-05-24 17:16:14 +08:00 |
| training_patch.py | Fix error during merging adapter (#11145) | 2024-05-27 19:41:42 +08:00 |
| utils.py | Enable use_batch_forward Optimization on Battlemage GPU (#12516) | 2024-12-12 12:44:36 +08:00 |
| xpu_customize_fwd.py | Refactor bigdl.llm to ipex_llm (#24) | 2024-03-22 15:41:21 +08:00 |