Ruonan Wang
|
0e23bd779f
|
Add support of llama3.2 for NPU C++ (#12442)
* initial support of llama3.2
* update
* update
* fix style
* fix style
* fix
* small fix
|
2024-11-26 09:26:55 +08:00 |
|
Ruonan Wang
|
b9abb8a285
|
Support qwen2.5 3B for NPU & update related examples (#12438)
* update qwen2.5-3B
* update convert
* small fix
* replace load_in_low_bit with low_bit
* small fix
|
2024-11-25 16:38:31 +08:00 |
|
Kai Huang
|
c8679ad592
|
Qwen layernorm as input (#12309)
* qwen layernorm as input
* add group size
|
2024-11-04 09:51:15 +08:00 |
|
binbin Deng
|
d409d9d0eb
|
[NPU L0] Update streaming mode of example (#12312)
|
2024-11-01 15:38:10 +08:00 |
|
binbin Deng
|
4892df61c9
|
Add qwen2-1.5b in l0 pipeline example (#12306)
|
2024-10-31 16:44:25 +08:00 |
|
Kai Huang
|
416c19165c
|
Add Qwen pipeline and example (#12292)
* support qwen pipeline
* update error msg
* style
* meet review
* minor
|
2024-10-31 11:25:25 +08:00 |
|