ipex-llm/python
Wang, Jian4 9bff84e6fd LLM: Convert draft_model kv_cache from bf16 to fp32 (#9964)
* convert bf16 to fp32
* update
* change when init
* init first and cut off after
* init and exchange
* update python type
* update
* fix bug
* update
* update
2024-01-25 11:20:27 +08:00
llm LLM: Convert draft_model kv_cache from bf16 to fp32 (#9964) 2024-01-25 11:20:27 +08:00