Wang, Jian4
1eed0635f2
Add lightweight serving and support tgi parameter (#11600)
* init tgi request
* update openai api
* update for pp
* update and add readme
* add to docker
* add start bash
* update
* update
* update
2024-07-19 13:15:56 +08:00
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
binbin Deng
66f6ffe4b2
Update GPU HF-Transformers example structure (#11526)
2024-07-08 17:58:06 +08:00
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
ivy-lv11
e7a4e2296f
Add Stable Diffusion examples on GPU and CPU (#11166)
* add sdxl and lcm-lora
* readme
* modify
* add cpu
* add license
* modify
* add file
2024-06-12 16:33:25 +08:00
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
Yuwen Hu
af96579c76
Update installation guide for pipeline parallel inference (#11224)
* Update installation guide for pipeline parallel inference
* Small fix
* further fix
* Small fix
* Small fix
* Update based on comments
* Small fix
* Small fix
* Small fix
2024-06-05 17:54:29 +08:00
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
Cengguang Zhang
7ec82c6042
LLM: add README.md for Long-Context examples. (#10765)
* LLM: add readme to long-context examples.
* add precision.
* update wording.
* add GPU type.
* add Long-Context example to GPU examples.
* fix comments.
* update max input length.
* update max length.
* add output length.
* fix wording.
2024-04-17 15:34:59 +08:00
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
ZehuaCao
599a88db53
Add deepsped-autoTP-Fastapi serving (#10748)
* add deepsped-autoTP-Fastapi serving
* add readme
* add license
* update
* update
* fix
2024-04-16 14:03:23 +08:00
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
Wang, Jian4
16b2ef49c6
Update_document by heyang (#30)
2024-03-25 10:06:02 +08:00
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
dingbaorong
fc7f10cd12
add langchain gpu example (#10277)
* first draft
* fix
* add readme for transformer_int4_gpu
* fix doc
* check device_map
* add arc ut test
* fix ut test
* fix langchain ut
* Refine README
* fix gpu mem too high
* fix ut test
---------
Co-authored-by: Ariadne <wyn2000330@126.com>
2024-03-05 13:33:57 +08:00
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
binbin Deng
11fe5a87ec
LLM: add Modelscope model example (#10126)
2024-02-08 11:18:07 +08:00
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
binbin Deng
171fb2d185
LLM: reorganize GPU finetuning examples (#9952)
2024-01-25 19:02:38 +08:00
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
Mingyu Wei
bc9cff51a8
LLM GPU Example Update for Windows Support (#9902)
* Update README in LLM GPU Examples
* Update reference of Intel GPU
* add cpu_embedding=True in comment
* small fixes
* update GPU/README.md and add explanation for cpu_embedding=True
* address comments
* fix small typos
* add backtick for cpu_embedding=True
* remove extra backtick in the doc
* add period mark
* update readme
2024-01-24 13:42:27 +08:00
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
Yuwen Hu
23fc888abe
Update llm gpu xpu default related info to PyTorch 2.1 (#9866)
2024-01-09 15:38:47 +08:00
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
Jason Dai
37f509bb95
Update readme (#9692)
2023-12-14 19:50:21 +08:00
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
binbin Deng
68a4be762f
remove disco mixtral, update oneapi version (#9671)
2023-12-13 23:24:59 +08:00
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
Jason Dai
51b668f229
Update GGUF readme (#9611)
2023-12-06 18:21:54 +08:00
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
Guancheng Fu
963a5c8d79
Add vLLM-XPU version's README/examples (#9536)
* test
* test
* fix last kv cache
* add xpu readme
* remove numactl for xpu example
* fix link error
* update max_num_batched_tokens logic
* add explaination
* add xpu environement version requirement
* refine gpu memory
* fix
* fix style
2023-11-28 09:44:03 +08:00
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
Jason Dai
82898a4203
Update GPU example README (#9524)
2023-11-23 21:20:26 +08:00
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
binbin Deng
5e9962b60e
LLM: update example layout (#9046)
2023-10-09 15:36:39 +08:00