Yina Chen | 77be19bb97 | LLM: Support gpt-j in speculative decoding (#10067) | 2024-02-02 14:54:55 +08:00
* gptj
* support gptj in speculative decoding
* fix
* update readme
* small fix

Wang, Jian4 | 093e6f8f73 | LLM: Add qwen CPU speculative example (#9985) | 2024-01-25 17:01:34 +08:00
* init from gpu
* update for cpu
* update
* update
* fix xpu readme
* update
* update example prompt
* update prompt and add 72b
* update
* update

Yina Chen | 99ff6cf048 | Update gpu spec decoding baichuan2 example dependency (#9990) | 2024-01-25 11:05:04 +08:00
* add dependency
* update
* update

Jason Dai | 3bc3d0bbcd | Update self-speculative readme (#9986) | 2024-01-24 22:37:32 +08:00

Ruonan Wang | d4f65a6033 | LLM: add mistral speculative example (#9976) | 2024-01-24 17:35:15 +08:00
* add mistral example
* update

Yina Chen | b176cad75a | LLM: Add baichuan2 gpu spec example (#9973) | 2024-01-24 16:40:16 +08:00
* add baichuan2 gpu spec example
* update readme & example
* remove print
* fix typo
* meet comments
* revert
* update

Yina Chen | 5aa4b32c1b | LLM: Add qwen spec gpu example (#9965) | 2024-01-23 15:59:43 +08:00
* add qwen spec gpu example
* update readme
---------
Co-authored-by: rnwang04 <ruonan1.wang@intel.com>

Ruonan Wang | 60b35db1f1 | LLM: add chatglm3 speculative decoding example (#9966) | 2024-01-23 15:54:12 +08:00
* add chatglm3 example
* update
* fix

Ruonan Wang | 27b19106f3 | LLM: add readme for speculative decoding gpu examples (#9961) | 2024-01-23 12:54:19 +08:00
* add readme
* add readme
* meet code review

Ruonan Wang | 3e601f9a5d | LLM: Support speculative decoding in bigdl-llm (#9951) | 2024-01-22 19:14:56 +08:00
* first commit
* fix error, add llama example
* hidden print
* update api usage
* change to api v3
* update
* meet code review
* meet code review, fix style
* add reference, fix style
* fix style
* fix first token time