ipex-llm

Ruonan Wang e9aa2bd890 LLM: reduce GPU 1st token latency and update example (#8763) * reduce 1st token latency * update example * fix * fix style * update readme of gpu benchmark	2023-08-16 18:01:23 +08:00
native_int4	[LLM] Unify Transformers and Native API (#8713)	2023-08-11 19:45:47 +08:00
transformers_int4	LLM: reduce GPU 1st token latency and update example (#8763)	2023-08-16 18:01:23 +08:00
transformers_low_bit	[LLM] chatglm example and transformers low-bit examples (#8751)	2023-08-16 11:41:44 +08:00