ipex-llm/python
Kai Huang 689889482c Reduce max_cache_pos to reduce Baichuan2-13B memory (#9694)
* optimize baichuan2 memory

* fix

* style

* fp16 mask

* disable fp16

* fix style

* empty cache

* revert empty cache
2023-12-26 19:51:25 +08:00
llm Reduce max_cache_pos to reduce Baichuan2-13B memory (#9694) 2023-12-26 19:51:25 +08:00