ipex-llm/python/llm
Jun Wang 1efb6ebe93
[ADD] add transformer_int4_fp16_loadlowbit_gpu_win api (#11511)
* [ADD] add transformer_int4_fp16_loadlowbit_gpu_win api

* [UPDATE] add int4_fp16_lowbit config and description

* [FIX] fix run.py mistake

* [FIX] fix indent; change dtype=float16 to model.half()
2024-07-05 16:38:41 +08:00
dev [ADD] add transformer_int4_fp16_loadlowbit_gpu_win api (#11511) 2024-07-05 16:38:41 +08:00
example LLM: Partial Prefilling for Pipeline Parallel Serving (#11457) 2024-07-05 13:10:35 +08:00
portable-zip Fix null pointer dereferences error. (#11125) 2024-05-30 16:16:10 +08:00
scripts Miniconda/Anaconda -> Miniforge update in examples (#11194) 2024-06-04 10:14:02 +08:00
src/ipex_llm Clean npu dtype branch (#11515) 2024-07-05 15:45:26 +08:00
test [REMOVE] remove all useless repo-id in benchmark/igpu-perf (#11508) 2024-07-04 16:38:34 +08:00
tpp OSPDT: add tpp licenses (#11165) 2024-06-06 10:59:06 +08:00
.gitignore [LLM] add chatglm pybinding binary file release (#8677) 2023-08-04 11:45:27 +08:00
setup.py Upgrade accelerate to 0.23.0 (#11331) 2024-06-17 15:03:11 +08:00
version.txt Update setup.py and add new actions and add compatible mode (#25) 2024-03-22 15:44:59 +08:00