Ruonan Wang
e9aa2bd890
LLM: reduce GPU 1st token latency and update example ( #8763 )
...
* reduce 1st token latency
* update example
* fix
* fix style
* update readme of gpu benchmark
2023-08-16 18:01:23 +08:00
binbin Deng
06609d9260
LLM: add qwen example on arc ( #8757 )
2023-08-16 17:11:08 +08:00
binbin Deng
97283c033c
LLM: add falcon example on arc ( #8742 )
2023-08-15 17:38:38 +08:00
binbin Deng
8c55911308
LLM: add baichuan-13B on arc example ( #8755 )
2023-08-15 15:07:04 +08:00
Ruonan Wang
d28ad8f7db
LLM: add whisper example for arc transformer int4 ( #8749 )
...
* add whisper example for arc int4
* fix
2023-08-14 17:05:48 +08:00
Ruonan Wang
faaccb64a2
LLM: add chatglm2 example for Arc ( #8741 )
...
* add chatglm2 example
* update
* fix readme
2023-08-14 10:43:08 +08:00
binbin Deng
b10d7e1adf
LLM: add mpt example on arc ( #8723 )
2023-08-14 09:40:01 +08:00
binbin Deng
e9a1afffc5
LLM: add internlm example on arc ( #8722 )
2023-08-14 09:39:39 +08:00
Shengsheng Huang
7c56c39e36
Fix GPU examples README to use bigdl-core-xe ( #8714 )
...
* Update README.md
* Update README.md
2023-08-10 12:53:49 +08:00
Yina Chen
6d1ca88aac
add voice assistant example ( #8711 )
2023-08-10 12:42:14 +08:00
Ruonan Wang
1a7b698a83
[LLM] support ipex arc int4 & add basic llama2 example ( #8700 )
...
* first support of xpu
* make it work on gpu
update setup
update
add GPU llama2 examples
add use_optimize flag to disable optimize for gpu
fix style
update gpu example readme
fix
* update example, and update env
* fix setup to add cpp files
* replace jit with aot to avoid data leak
* rename to bigdl-core-xe
* update installation in example readme
2023-08-09 22:20:32 +08:00