11 commits

Author SHA1 Message Date
Wang, Jian4
b3b2cd64b4
Support lightweight-serving glm-4v-9b (#11994)
* enable glm-4v-9b serving

* update readme

* update for no image input
2024-09-05 09:25:08 +08:00
Wang, Jian4
5c4ed00593
Add lightweight-serving whisper asr example (#11847)
* add asr init

* update for pp

* update style

* update readme

* update reamde
2024-08-22 15:46:28 +08:00
Wang, Jian4
5a80fd2633
Fix lightweight-serving no streaming resp on mtl (#11822)
2024-08-16 09:43:03 +08:00
Wang, Jian4
245dba0abc
Fix lightweight-serving codegeex error (#11759)
2024-08-12 10:35:37 +08:00
Zijie Li
8fb36b9f4a
add new benchmark_util.py (#11713)
* add new benchmark_util.py
2024-08-05 16:18:48 +08:00
Wang, Jian4
493cbd9a36
Support lightweight-serving with internlm-xcomposer2-vl-7b multimodal input (#11703)
* init image_list

* enable internlm-xcomposer2 image input

* update style

* add readme

* update model

* update readme
2024-08-05 09:36:04 +08:00
Xiangyu Tian
1baa3efe0e
Optimizations for Pipeline Parallel Serving (#11702)
Optimizations for Pipeline Parallel Serving
2024-08-02 12:06:59 +08:00
Wang, Jian4
b119825152
Remove tgi parameter validation (#11688)
* remove validation

* add min warm up

* remove no need source
2024-07-30 16:37:44 +08:00
Wang, Jian4
23681fbf5c
Support codegeex4-9b for lightweight-serving (#11648)
* add options, support prompt and not return end_token

* enable openai parameter

* set do_sample None and update style
2024-07-26 09:41:03 +08:00
Wang, Jian4
1eed0635f2
Add lightweight serving and support tgi parameter (#11600)
* init tgi request

* update openai api

* update for pp

* update and add readme

* add to docker

* add start bash

* update

* update

* update
2024-07-19 13:15:56 +08:00
Wang, Jian4
9c15abf825
Refactor fastapi-serving and add one card serving(#11581)
* init fastapi-serving one card

* mv api code to source

* update worker

* update for style-check

* add worker

* update bash

* update

* update worker name and add readme

* rename update

* rename to fastapi
2024-07-17 11:12:43 +08:00