* add more model examples for pipeline parallel inference * add mixtral and vicuna models * add yi model and past_kv support for chatglm family * add docs * doc update * add license * update |
||
|---|---|---|
| CPU | ||
| GPU | ||
| NPU/HF-Transformers-AutoModels/Model/llama2 | ||