19 commits

Author SHA1 Message Date
Yuwen Hu
d11f257ee7 Add GPU example for MiniCPM-o-2_6 (#12735)
* Add init example for omni mode

* Small fix

* Small fix

* Add chat example

* Remove legacy link

* Further update link

* Add readme

* Small fix

* Update main readme link

* Update based on comments

* Small fix

* Small fix

* Small fix
2025-01-23 16:10:19 +08:00
Wang, Jian4
1eed0635f2 Add lightweight serving and support tgi parameter (#11600)
* init tgi request

* update openai api

* update for pp

* update and add readme

* add to docker

* add start bash

* update

* update

* update
2024-07-19 13:15:56 +08:00
binbin Deng
66f6ffe4b2 Update GPU HF-Transformers example structure (#11526) 2024-07-08 17:58:06 +08:00
ivy-lv11
e7a4e2296f Add Stable Diffusion examples on GPU and CPU (#11166)
* add sdxl and lcm-lora

* readme

* modify

* add cpu

* add license

* modify

* add file
2024-06-12 16:33:25 +08:00
Yuwen Hu
af96579c76 Update installation guide for pipeline parallel inference (#11224)
* Update installation guide for pipeline parallel inference

* Small fix

* further fix

* Small fix

* Small fix

* Update based on comments

* Small fix

* Small fix

* Small fix
2024-06-05 17:54:29 +08:00
Cengguang Zhang
7ec82c6042 LLM: add README.md for Long-Context examples. (#10765)
* LLM: add readme to long-context examples.

* add precision.

* update wording.

* add GPU type.

* add Long-Context example to GPU examples.

* fix comments.

* update max input length.

* update max length.

* add output length.

* fix wording.
2024-04-17 15:34:59 +08:00
ZehuaCao
599a88db53 Add deepsped-autoTP-Fastapi serving (#10748)
* add deepsped-autoTP-Fastapi serving

* add readme

* add license

* update

* update

* fix
2024-04-16 14:03:23 +08:00
Wang, Jian4
16b2ef49c6 Update_document by heyang (#30) 2024-03-25 10:06:02 +08:00
dingbaorong
fc7f10cd12 add langchain gpu example (#10277)
* first draft

* fix

* add readme for transformer_int4_gpu

* fix doc

* check device_map

* add arc ut test

* fix ut test

* fix langchain ut

* Refine README

* fix gpu mem too high

* fix ut test

---------

Co-authored-by: Ariadne <wyn2000330@126.com>
2024-03-05 13:33:57 +08:00
binbin Deng
11fe5a87ec LLM: add Modelscope model example (#10126) 2024-02-08 11:18:07 +08:00
binbin Deng
171fb2d185 LLM: reorganize GPU finetuning examples (#9952) 2024-01-25 19:02:38 +08:00
Mingyu Wei
bc9cff51a8 LLM GPU Example Update for Windows Support (#9902)
* Update README in LLM GPU Examples

* Update reference of Intel GPU

* add cpu_embedding=True in comment

* small fixes

* update GPU/README.md and add explanation for cpu_embedding=True

* address comments

* fix small typos

* add backtick for cpu_embedding=True

* remove extra backtick in the doc

* add period mark

* update readme
2024-01-24 13:42:27 +08:00
Yuwen Hu
23fc888abe Update llm gpu xpu default related info to PyTorch 2.1 (#9866) 2024-01-09 15:38:47 +08:00
Jason Dai
37f509bb95 Update readme (#9692) 2023-12-14 19:50:21 +08:00
binbin Deng
68a4be762f remove disco mixtral, update oneapi version (#9671) 2023-12-13 23:24:59 +08:00
Jason Dai
51b668f229 Update GGUF readme (#9611) 2023-12-06 18:21:54 +08:00
Guancheng Fu
963a5c8d79 Add vLLM-XPU version's README/examples (#9536)
* test

* test

* fix last kv cache

* add xpu readme

* remove numactl for xpu example

* fix link error

* update max_num_batched_tokens logic

* add explanation

* add xpu environment version requirement

* refine gpu memory

* fix

* fix style
2023-11-28 09:44:03 +08:00
Jason Dai
82898a4203 Update GPU example README (#9524) 2023-11-23 21:20:26 +08:00
binbin Deng
5e9962b60e LLM: update example layout (#9046) 2023-10-09 15:36:39 +08:00