| __init__.py | optimize llama npu perf (#11426) | 2024-06-25 17:43:20 +08:00 |
| baichuan.py | fix baichuan (#11606) | 2024-07-18 09:43:36 +08:00 |
| baichuan_mp.py | [NPU] change attention_mask to fp16 (#12400) | 2024-11-14 17:20:29 +08:00 |
| chatglm.py | fix chatglm3 npu output (#11590) | 2024-07-16 18:16:30 +08:00 |
| chatglm4.py | support npu glm4 (#11539) | 2024-07-09 15:46:49 +08:00 |
| common.py | [NPU] Support Baichuan groupwise & gw code refactor (#12337) | 2024-11-08 11:42:42 +08:00 |
| convert.py | [NPU] Update C++ example with repetition_penalty & update Python code accordingly (#12528) | 2024-12-12 13:42:55 +08:00 |
| convert_mp.py | [NPU] Fix MTL and ARL support (#12580) | 2024-12-19 16:55:30 +08:00 |
| glm_edge.py | [NPU] Support glm-edge models (#12511) | 2024-12-09 14:06:27 +08:00 |
| kv.py | [NPU] Add Optimized Support for Llama3.2-1B/3B on NPU (#12339) | 2024-11-06 19:21:40 +08:00 |
| linear.py | [NPU] initial support of asym_int4_rtn (#12484) | 2024-12-05 17:40:36 +08:00 |
| llama.py | remove obselete npu code (#11967) | 2024-08-29 14:16:44 -07:00 |
| llama_mp.py | [NPU] support asym_int4 for llama (#12556) | 2024-12-17 14:01:17 +08:00 |
| lm_head.py | [NPU] initial support of asym_int4_rtn (#12484) | 2024-12-05 17:40:36 +08:00 |
| minicpm.py | add minicpm 1B/2B npu support (#11507) | 2024-07-04 16:31:04 +08:00 |
| minicpm_mp.py | [NPU] support asym_int4 for minicpm (#12567) | 2024-12-18 10:55:35 +08:00 |
| minicpmv_mp.py | Fix MiniCPM-V-2_6 running on NPU (#12486) | 2024-12-03 16:16:29 +08:00 |
| mistral.py | add mistral npu support (#11523) | 2024-07-08 13:17:15 +08:00 |
| mp_models_base.py | [NPU] further fix of new_value_states (#12538) | 2024-12-13 13:42:00 +08:00 |
| npu_llm_cpp.py | [NPU] Update C++ example with repetition_penalty & update Python code accordingly (#12528) | 2024-12-12 13:42:55 +08:00 |
| paraformer_mp.py | Fix speech_paraformer issue with unexpected changes (#12416) | 2024-11-19 15:01:20 +08:00 |
| phi3.py | add npu sdp (#11562) | 2024-07-11 16:57:35 +08:00 |
| phi3_v.py | optimize phi3-v encoder npu performance and add multimodal example (#11553) | 2024-07-11 13:59:14 +08:00 |
| qwen2.py | add qwen2 npu support (#11504) | 2024-07-04 11:01:25 +08:00 |
| qwen2_mp.py | [NPU] initial support of asym_int4_rtn (#12484) | 2024-12-05 17:40:36 +08:00 |
| stablelm.py | Optimize stablelm on NPU (#11512) | 2024-07-05 14:21:57 +08:00 |
| xlm_mp.py | Hotfix of BCE-Emdedding model (#12490) | 2024-12-03 18:16:04 +08:00 |