ipex-llm

History

SONG Ge 284e7697b1 [LLM] Optimize ChatGLM2 kv_cache to support beam_search on ARC (#9579)	2023-12-13 11:02:14 +08:00

* optimize kv_cache to support beam_search on Arc
* correctness test update
* fix query_length issue
* simplify implementation
* only enable the optimization on gpu device
* limit the beam_search support only enabled with gpu device and batch_size > 1
* add comments for beam_search case and revert ut change
* meet comments
* add more comments to describe the difference between multi-cases
llm	[LLM] Optimize ChatGLM2 kv_cache to support beam_search on ARC (#9579)	2023-12-13 11:02:14 +08:00

