ipex-llm/python/llm/example/NPU
Yang Wang 51bcac1229
follow up on experimental support of fused decoder layer for llama2 (#11785)
* clean up and support transpose value cache

* refine

* fix style

* fix style
2024-08-13 18:53:55 -07:00
HF-Transformers-AutoModels follow up on experimental support of fused decoder layer for llama2 (#11785) 2024-08-13 18:53:55 -07:00