Zijie Li
8fd2dcba86
Add benchmark_util for transformers >= 4.47.0 (#12644)
2025-01-03 10:48:29 +08:00

Zijie Li
7d80db710e
Add benchmark_util for transformers >= 4.44.0 (#12171)
* Create benchmark_util_4_45.py
* Update __init__.py
* Update lint-python
* Update benchmark_util_4_45.py
* Update benchmark_util_4_45.py
* Create benchmark_util_4_44.py
2024-10-14 15:40:12 +08:00

Guancheng Fu
69c8d36f16
Switching from vLLM v0.3.3 to vLLM 0.5.4 (#12042)
* Enable single card sync engine
* enable ipex-llm optimizations for vllm
* enable optimizations for lm_head
* Fix chatglm multi-reference problem
* Remove duplicate layer
* LLM: Update vLLM to v0.5.4 (#11746)
* Enable single card sync engine
* enable ipex-llm optimizations for vllm
* enable optimizations for lm_head
* Fix chatglm multi-reference problem
* update 0.5.4 api_server
* add dockerfile
* fix
* fix
* refine
* fix
---------
Co-authored-by: gc-fu <guancheng.fu@intel.com>
* Add vllm-0.5.4 Dockerfile (#11838)
* Update BIGDL_LLM_SDP_IGNORE_MASK in start-vllm-service.sh (#11957)
* Fix vLLM not convert issues (#11817) (#11918)
* Fix not convert issues
* refine
Co-authored-by: Guancheng Fu <110874468+gc-fu@users.noreply.github.com>
* Fix glm4-9b-chat nan error on vllm 0.5.4 (#11969)
* init
* update mlp forward
* fix minicpm error in vllm 0.5.4
* fix dependabot alerts (#12008)
* Update 0.5.4 dockerfile (#12021)
* Add vllm awq loading logic (#11987)
* [ADD] Add vllm awq loading logic
* [FIX] fix the module.linear_method path
* [FIX] fix quant_config path error
* Enable Qwen padding mlp to 256 to support batch_forward (#12030)
* Enable padding mlp
* padding to 256
* update style
* Install 27191 runtime in 0.5.4 docker image (#12040)
* fix rebase error
* fix rebase error
* vLLM: format for 0.5.4 rebase (#12043)
* format
* Update model_convert.py
* Fix serving docker related modifications (#12046)
* Fix undesired modifications (#12048)
* fix
* Refine offline_inference arguments
---------
Co-authored-by: Xiangyu Tian <109123695+xiangyuT@users.noreply.github.com>
Co-authored-by: Jun Wang <thoughts.times@gmail.com>
Co-authored-by: Wang, Jian4 <61138589+hzjane@users.noreply.github.com>
Co-authored-by: liu-shaojun <johnssalyn@outlook.com>
Co-authored-by: Shaojun Liu <61072813+liu-shaojun@users.noreply.github.com>
2024-09-10 15:37:43 +08:00

Zijie Li
e7f7141781
Add benchmark util for transformers 4.42 (#11725)
* add new benchmark_util.py
Add new benchmark_util.py for transformers>=4.43.1. The old one renamed to benchmark_util_prev.py.
* Small fix to import code
* Update __init__.py
* fix file names
* Update lint-python
Update lint-python to exclude benchmark_util_4_29.py
benchmark_util_4_43.py
* Update benchmark_util_4_43.py
* add benchmark_util for transformers 4.42
2024-08-07 08:48:07 +08:00

Zijie Li
8fb36b9f4a
add new benchmark_util.py (#11713)
* add new benchmark_util.py
2024-08-05 16:18:48 +08:00

Yishuo Wang
7f88ce23cd
add more gemma2 optimization (#11673)
2024-07-29 11:13:00 +08:00

Yishuo Wang
3e8819734b
add basic gemma2 optimization (#11672)
2024-07-29 10:46:51 +08:00

Wang, Jian4
74950a152a
Fix tgi_api_server error file name (#11075)
2024-05-20 16:48:40 +08:00

Wang, Jian4
d9f71f1f53
Update benchmark util for example using (#11027)
* mv benchmark_util.py to utils/
* remove
* update
2024-05-15 14:16:35 +08:00

Shaojun Liu
a10f5a1b8d
add python style check (#10620)
* add python style check
* fix style checks
* update runner
* add ipex-llm-finetune-qlora-cpu-k8s to manually_build workflow
* update tag to 2.1.0-SNAPSHOT
2024-04-02 16:17:56 +08:00

Wang, Jian4
0193f29411
LLM : Enable gguf float16 and Yuan2 model (#10372)
* enable float16
* add yun files
* enable yun
* enable set low_bit on yuan2
* update
* update license
* update generate
* update readme
* update python style
* update
2024-03-13 10:19:18 +08:00

Guancheng Fu
bf579507c2
Integrate vllm (#9310)
* done
* Rename structure
* add models
* Add structure/sampling_params,sequence
* add input_metadata
* add outputs
* Add policy,logger
* add and update
* add parallelconfig back
* core/scheduler.py
* Add llm_engine.py
* Add async_llm_engine.py
* Add tested entrypoint
* fix minor error
* Fix everything
* fix kv cache view
* fix
* fix
* fix
* format&refine
* remove logger from repo
* try to add token latency
* remove logger
* Refine config.py
* finish worker.py
* delete utils.py
* add license
* refine
* refine sequence.py
* remove sampling_params.py
* finish
* add license
* format
* add license
* refine
* refine
* Refine line too long
* remove exception
* so dumb style-check
* refine
* refine
* refine
* refine
* refine
* refine
* add README
* refine README
* add warning instead error
* fix padding
* add license
* format
* format
* format fix
* Refine vllm dependency (#1)
vllm dependency clear
* fix licence
* fix format
* fix format
* fix
* adapt LLM engine
* fix
* add license
* fix format
* fix
* Moving README.md to the correct position
* Fix readme.md
* done
* guide for adding models
* fix
* Fix README.md
* Add new model readme
* remove ray-logic
* refactor arg_utils.py
* remove distributed_init_method logic
* refactor entrypoints
* refactor input_metadata
* refactor model_loader
* refactor utils.py
* refactor models
* fix api server
* remove vllm.stucture
* revert by txy 1120
* remove utils
* format
* fix license
* add bigdl model
* Refer to a specific commit
* Change code base
* add comments
* add async_llm_engine comment
* refine
* formatted
* add worker comments
* add comments
* add comments
* fix style
* add changes
---------
Co-authored-by: xiangyuT <xiangyu.tian@intel.com>
Co-authored-by: Xiangyu Tian <109123695+xiangyuT@users.noreply.github.com>
Co-authored-by: leonardozcm <leonardo1997zcm@gmail.com>
2023-11-23 16:46:45 +08:00

Yuwen Hu
0e09dd926b
[LLM] Fix example test (#9118)
* Update llm example test link due to example layout change
* Add better change detect
2023-10-10 13:24:18 +08:00

Song Jiaming
c1f9af6d97
[LLM] chatglm example and transformers low-bit examples (#8751)
2023-08-16 11:41:44 +08:00

Song Jiaming
e717e304a6
LLM first example test and template (#8658)
2023-08-10 10:03:11 +08:00

Shengsheng Huang
02c583144c
[LLM] langchain integrations and examples (#8256)
* langchain integrations and examples
* add licences and rename
* add licences
* fix license issues and change backbone to model_family
* update examples to use model_family param
* fix linting
* fix code style
* exclude langchain integration from stylecheck
* update langchain examples and update integrations based on latest changes
* update simple llama-cpp-python style API example
* remove bloom in README
* change default n_threads to 2 and remove redundant code
---------
Co-authored-by: leonardozcm <changmin.zhao@intel.com>
2023-06-12 19:22:07 +08:00

binbin Deng
8421af51ae
LLM: support converting to ggml format (#8235)
* add convert
* fix
* fix
* fix
* try
* test
* update check
* fix
* fix
2023-05-31 15:20:06 +08:00

Pingchuan Ma (Henry)
1f913a6941
[LLM] Add LLM pep8 coding style checking (#8233)
* add LLM pep8 coding checking
* resolve bugs in testing scripts and code style revision
2023-05-30 15:58:14 +08:00