* LLM: support llama2 8k input with W4A16.
* fix comment and style.
* fix style.
* fix comments and split tensor to quantized attention forward (see the sketch after this list).
* fix style.
* refactor name.
* fix style.
* fix style.
* fix style.
* refactor checker name.
* refactor native SDP split-QKV tensor name.
* fix style.
* fix comment, rename variables.
* fix co-existence of intermediate results.
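A minimal sketch of the idea behind "split tensor to quantized attention forward": the output of a fused QKV projection is split into separate query/key/value tensors before running scaled-dot-product attention over a long (e.g. 8k-token) sequence. The function names (`split_qkv`, `sdp_forward`) and tensor shapes are illustrative assumptions, not the repository's actual API.

```python
# Hypothetical illustration only; not the project's real attention forward.
import torch
import torch.nn.functional as F


def split_qkv(qkv: torch.Tensor, num_heads: int, head_dim: int):
    # qkv: [batch, seq_len, 3 * num_heads * head_dim] from a fused QKV projection.
    bsz, seq_len, _ = qkv.shape
    qkv = qkv.view(bsz, seq_len, 3, num_heads, head_dim)
    q, k, v = qkv.unbind(dim=2)                    # each [bsz, seq, heads, head_dim]
    # Move the head dimension forward: [bsz, heads, seq, head_dim].
    return q.transpose(1, 2), k.transpose(1, 2), v.transpose(1, 2)


def sdp_forward(qkv: torch.Tensor, num_heads: int, head_dim: int) -> torch.Tensor:
    # Split the fused tensor, then run a causal SDP over the sequence.
    q, k, v = split_qkv(qkv, num_heads, head_dim)
    out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
    # Back to [bsz, seq, hidden_size].
    return out.transpose(1, 2).reshape(qkv.shape[0], -1, num_heads * head_dim)


if __name__ == "__main__":
    # Small shapes for a quick check; the real target is 8k-token inputs.
    bsz, seq_len, num_heads, head_dim = 1, 16, 4, 8
    qkv = torch.randn(bsz, seq_len, 3 * num_heads * head_dim)
    print(sdp_forward(qkv, num_heads, head_dim).shape)  # torch.Size([1, 16, 32])
```

In the W4A16 path described by the commits, the QKV projection itself would be a 4-bit-weight quantized linear layer; the split-then-attend step shown here is independent of how that projection is quantized.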