Yang Wang
|
51bcac1229
|
follow up on experimental support of fused decoder layer for llama2 (#11785)
* clean up and support transpose value cache
* refine
* fix style
* fix style
|
2024-08-13 18:53:55 -07:00 |
|
binbin Deng
|
23d3acdc77
|
Add experimental support of fused decoder layer for llama2 (#11768)
|
2024-08-13 14:41:36 +08:00 |
|
Jin, Qiao
|
c28b3389e6
|
Update npu multimodal example (#11773)
|
2024-08-13 14:14:59 +08:00 |
|
Jin, Qiao
|
05989ad0f9
|
Update npu example and all in one benchmark (#11766)
|
2024-08-12 16:46:46 +08:00 |
|
Jin, Qiao
|
a44ab32153
|
Switch to conhost when running on NPU (#11687)
|
2024-07-30 17:08:06 +08:00 |
|
Zhao Changmin
|
06745e5742
|
Add npu benchmark all-in-one script (#11571)
* npu benchmark
|
2024-07-15 10:42:37 +08:00 |
|
Zhao Changmin
|
b9c66994a5
|
add npu sdp (#11562)
|
2024-07-11 16:57:35 +08:00 |
|
Zhao Changmin
|
105e124752
|
optimize phi3-v encoder npu performance and add multimodal example (#11553)
* phi3-v
* readme
|
2024-07-11 13:59:14 +08:00 |
|
Zhao Changmin
|
3c16c9f725
|
Optimize baichuan on NPU (#11548)
* baichuan_npu
|
2024-07-10 13:18:48 +08:00 |
|
Zhao Changmin
|
76a5802acf
|
update NPU examples (#11540)
* update NPU examples
|
2024-07-09 17:19:42 +08:00 |
|
Yishuo Wang
|
319a3b36b2
|
fix npu llama2 (#11471)
|
2024-07-01 10:14:11 +08:00 |
|
Yishuo Wang
|
cf0f5c4322
|
change npu document (#11446)
|
2024-06-27 13:59:59 +08:00 |
|
Yishuo Wang
|
3b23de684a
|
update npu examples (#11422)
|
2024-06-25 13:32:53 +08:00 |
|
Zijie Li
|
ae452688c2
|
Add NPU HF example (#11358)
|
2024-06-19 18:07:28 +08:00 |
|