Yishuo Wang | be132c4209 | fix and optimize sd (#12436) | 2024-11-25 14:09:48 +08:00

Yishuo Wang | 77af9bc5fa | support passing None to low_bit in optimize_model (#12121) | 2024-09-26 11:09:35 +08:00

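The commit above adds support for `low_bit=None` in `optimize_model`. Below is a minimal, hypothetical sketch of that behavior, not the actual ipex-llm implementation: the function name suffix, the returned dictionary, and the set of supported precisions are all assumptions made for illustration. The idea is that `None` skips weight quantization entirely, while a string selects a low-bit format.

```python
from typing import Optional


def optimize_model_sketch(model, low_bit: Optional[str] = "sym_int4"):
    """Hypothetical sketch: passing low_bit=None skips quantization and
    returns the model with only non-quantization optimizations applied."""
    if low_bit is None:
        # No low-bit conversion requested; leave weights untouched.
        return {"model": model, "quantized": False}
    supported = {"sym_int4", "asym_int4", "sym_int8", "fp16"}
    if low_bit not in supported:
        raise ValueError(f"unsupported low_bit value: {low_bit!r}")
    return {"model": model, "quantized": True, "dtype": low_bit}
```

Under this sketch, `optimize_model_sketch(model, low_bit=None)` becomes a valid way to request "optimize, but do not quantize", which previously would have required a separate code path.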
Jiao Wang | 0a06a6e1d4 | Update tests for transformers 4.36 (#10858) | 2024-05-24 10:26:38 +08:00
    * update unit test
    * fix gpu attention test
    * update example test
    * replace replit code
    * set safe_serialization false
    * perf test
    * delete
    * revert

Guancheng Fu | cbe7b5753f | Add vLLM[xpu] related code (#10779) | 2024-04-18 15:29:20 +08:00
    * Add ipex-llm side change
    * add runnable offline_inference
    * refactor to call vllm2
    * Verified async server
    * add new v2 example
    * add README
    * fix
    * change dir
    * refactor readme.md
    * add experimental
    * fix

binbin Deng | 3d561b60ac | LLM: add enable_xetla parameter for optimize_model API (#10753) | 2024-04-15 12:18:25 +08:00

binbin Deng | fc8c7904f0 | LLM: fix torch_dtype setting of apply fp16 optimization through optimize_model (#10556) | 2024-03-27 14:18:45 +08:00

Wang, Jian4 | 9df70d95eb | Refactor bigdl.llm to ipex_llm (#24) | 2024-03-22 15:41:21 +08:00
    * Rename bigdl/llm to ipex_llm
    * rm python/llm/src/bigdl
    * from bigdl.llm to from ipex_llm

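The rename commit above moves the package namespace from `bigdl.llm` to `ipex_llm`. A hypothetical compatibility shim that downstream code could use during such a transition is sketched below; it is not part of the commit itself, and neither package is assumed to be installed here. It simply tries the new import path first and falls back to the old one.

```python
def resolve_optimize_model():
    """Hypothetical shim: import optimize_model from whichever namespace
    is installed, preferring the post-rename ipex_llm package.

    Returns the function, or None if neither package is available."""
    try:
        from ipex_llm import optimize_model  # new namespace after the refactor
    except ImportError:
        try:
            from bigdl.llm import optimize_model  # pre-rename namespace
        except ImportError:
            return None  # neither package installed
    return optimize_model
```

A try/except import like this lets callers upgrade to the renamed package without breaking environments still pinned to the old `bigdl.llm` distribution.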