* LLM: support llama2 8k input with w4a16.
* fix comment and style.
* fix style.
* fix comments and split tensor to quantized attention forward.
* fix style.
* refactor name.
* fix style.
* fix style.
* fix style.
* refactor checker name.
* refactor native sdp split qkv tensor name.
* fix style.
* fix comment, rename variables.
* fix coexistence of intermediate results.
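One of the commits above mentions splitting a tensor before the quantized attention forward. A minimal sketch of what such a QKV split typically looks like, using NumPy and hypothetical shapes (this is an illustration, not the actual ipex_llm implementation):

```python
import numpy as np

# Hypothetical dimensions for illustration only (not taken from the repo).
batch, seq_len, hidden = 2, 4, 8

# A fused QKV projection produces one tensor of shape (batch, seq_len, 3 * hidden).
qkv = np.random.rand(batch, seq_len, 3 * hidden).astype(np.float32)

# Split along the last axis into separate query, key, and value tensors,
# which are then passed individually to the attention forward pass.
q, k, v = np.split(qkv, 3, axis=-1)

print(q.shape, k.shape, v.shape)  # each tensor is (batch, seq_len, hidden)
```

The split itself is cheap (a view in most frameworks); the commit's refactor appears to concern where this split happens relative to the quantized attention path.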
Repository contents:

* dev
* example
* portable-zip
* scripts
* src/ipex_llm
* test
* .gitignore
* setup.py
* version.txt