ipex-llm/python/llm/example/transformers
Ruonan Wang e9aa2bd890 LLM: reduce GPU 1st token latency and update example (#8763)
* reduce 1st token latency

* update example

* fix

* fix style

* update readme of gpu benchmark
2023-08-16 18:01:23 +08:00
native_int4 [LLM] Unify Transformers and Native API (#8713) 2023-08-11 19:45:47 +08:00
transformers_int4 LLM: reduce GPU 1st token latency and update example (#8763) 2023-08-16 18:01:23 +08:00
transformers_low_bit [LLM] chatglm example and transformers low-bit examples (#8751) 2023-08-16 11:41:44 +08:00