Ruonan Wang
|
d4f65a6033
|
LLM: add mistral speculative example (#9976)
* add mistral example
* update
|
2024-01-24 17:35:15 +08:00 |
|
Yina Chen
|
b176cad75a
|
LLM: Add baichuan2 gpu spec example (#9973)
* add baichuan2 gpu spec example
* update readme & example
* remove print
* fix typo
* meet comments
* revert
* update
|
2024-01-24 16:40:16 +08:00 |
|
Yina Chen
|
5aa4b32c1b
|
LLM: Add qwen spec gpu example (#9965)
* add qwen spec gpu example
* update readme
---------
Co-authored-by: rnwang04 <ruonan1.wang@intel.com>
|
2024-01-23 15:59:43 +08:00 |
|
Ruonan Wang
|
60b35db1f1
|
LLM: add chatglm3 speculative decoding example (#9966)
* add chatglm3 example
* update
* fix
|
2024-01-23 15:54:12 +08:00 |
|
Ruonan Wang
|
27b19106f3
|
LLM: add readme for speculative decoding gpu examples (#9961)
* add readme
* add readme
* meet code review
|
2024-01-23 12:54:19 +08:00 |
|
Ruonan Wang
|
3e601f9a5d
|
LLM: Support speculative decoding in bigdl-llm (#9951)
* first commit
* fix error, add llama example
* hidden print
* update api usage
* change to api v3
* update
* meet code review
* meet code review, fix style
* add reference, fix style
* fix style
* fix first token time
|
2024-01-22 19:14:56 +08:00 |
|