dingbaorong 
								
							 
						 
						
							
							
							
							
								
							
							
								89069d6173 
								
							 
						 
						
							
							
								
								Add gpu gguf example ( #9603 )  
							
							 
							
							
							
							
							* add gpu gguf example
* some fixes
* address kai's comments
* address json's comments 
							
						 
						
							2023-12-06 15:17:54 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yuwen Hu 
								
							 
						 
						
							
							
							
							
								
							
							
								0e8f4020e5 
								
							 
						 
						
							
							
								
								Add traceback error output for win igpu test api in benchmark ( #9607 )  
							
							 
							
							
							
						 
						
							2023-12-06 14:35:16 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ziteng Zhang 
								
							 
						 
						
							
							
							
							
								
							
							
								aeb77b2ab1 
								
							 
						 
						
							
							
								
								Add minimum Qwen model version ( #9606 )  
							
							 
							
							
							
						 
						
							2023-12-06 11:49:14 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yuwen Hu 
								
							 
						 
						
							
							
							
							
								
							
							
								c998f5f2ba 
								
							 
						 
						
							
							
								
								[LLM] iGPU long context tests ( #9598 )  
							
							 
							
							
							
							
							* Temp enable PR
* Enable tests for 256-64
* Try again 128-64
* Empty cache after each iteration for igpu benchmark scripts
* Try tests for 512
* change order for 512
* Skip chatglm3 and llama2 for now
* Separate tests for 512-64
* Small fix
* Further fixes
* Change back to nightly again 
							
						 
						
							2023-12-06 10:19:20 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Heyang Sun 
								
							 
						 
						
							
							
							
							
								
							
							
								4e70e33934 
								
							 
						 
						
							
							
								
								[LLM] code and document for distributed qlora ( #9585 )  
							
							 
							
							
							
							
							* [LLM] code and document for distributed qlora
* doc
* refine for gradient checkpoint
* refine
* Update alpaca_qlora_finetuning_cpu.py
* Update alpaca_qlora_finetuning_cpu.py
* Update alpaca_qlora_finetuning_cpu.py
* add link in doc 
							
						 
						
							2023-12-06 09:23:17 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Zheng, Yi 
								
							 
						 
						
							
							
							
							
								
							
							
								d154b38bf9 
								
							 
						 
						
							
							
								
								Add llama2 gpu low memory example ( #9514 )  
							
							 
							
							
							
							
							* Add low memory example
* Minor fixes
* Update readme.md 
							
						 
						
							2023-12-05 17:29:48 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jason Dai 
								
							 
						 
						
							
							
							
							
								
							
							
								06febb5fa7 
								
							 
						 
						
							
							
								
								Update readme for FP8/FP4 inference examples ( #9601 )  
							
							 
							
							
							
						 
						
							2023-12-05 15:59:03 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									dingbaorong 
								
							 
						 
						
							
							
							
							
								
							
							
								a66fbedd7e 
								
							 
						 
						
							
							
								
								add gpu more data types example ( #9592 )  
							
							 
							
							
							
							
							* add gpu more data types example
* add int8 
							
						 
						
							2023-12-05 15:45:38 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ziteng Zhang 
								
							 
						 
						
							
							
							
							
								
							
							
								65934c9f4f 
								
							 
						 
						
							
							
								
								[LLM] Fix Qwen causal_mask and attention_mask size mismatching ( #9600 )  
							
							 
							
							
							
							
							* Fix #9582, caused by Qwen's modification of modeling_qwen.py (7f62181c94)
							
						 
						
							2023-12-05 15:15:54 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jinyi Wan 
								
							 
						 
						
							
							
							
							
								
							
							
								b721138132 
								
							 
						 
						
							
							
								
								Add cpu and gpu examples for BlueLM ( #9589 )  
							
							 
							
							
							
							
							* Add cpu int4 example for BlueLM
* add example optimize_model cpu for bluelm
* add example gpu int4 blueLM
* add example optimize_model GPU for bluelm
* Fixing naming issues and BigDL package version.
* Fixing naming issues...
* Add BlueLM in README.md "Verified Models" 
							
						 
						
							2023-12-05 13:59:02 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Guancheng Fu 
								
							 
						 
						
							
							
							
							
								
							
							
								8b00653039 
								
							 
						 
						
							
							
								
								fix doc ( #9599 )  
							
							 
							
							
							
						 
						
							2023-12-05 13:49:31 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Qiyuan Gong 
								
							 
						 
						
							
							
							
							
								
							
							
								f211f136b6 
								
							 
						 
						
							
							
								
								Configurable TORCH_LINEAR_THRESHOLD from env ( #9588 )  
							
							 
							
							
							
							
							* Add TORCH_LINEAR_THRESHOLD from env (BIGDL_LLM_LINEAR_THRESHOLD)
* Change default to 512 
							
						 
						
							2023-12-05 13:19:47 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yuwen Hu 
								
							 
						 
						
							
							
							
							
								
							
							
								1012507a40 
								
							 
						 
						
							
							
								
								[LLM] Fix performance tests ( #9596 )  
							
							 
							
							
							
							
							* Fix missing key for cpu_embedding
* Remove 512 as it stuck for now
* Small fix 
							
						 
						
							2023-12-05 10:59:28 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Chen, Zhentao 
								
							 
						 
						
							
							
							
							
								
							
							
								8c8a27ded7 
								
							 
						 
						
							
							
								
								Add harness summary job ( #9457 )  
							
							 
							
							
							
							
							* format yml
* add make_table_results
* add summary job
* add a job to print single result
* upload full directory 
							
						 
						
							2023-12-05 10:04:10 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yuwen Hu 
								
							 
						 
						
							
							
							
							
								
							
							
								3f4ad97929 
								
							 
						 
						
							
							
								
								[LLM] Add performance tests for windows iGPU ( #9584 )  
							
							 
							
							
							
							
							* Add support for win gpu benchmark with peak gpu memory monitoring
* Add win igpu tests
* Small fix
* Forward outputs
* Small fix
* Test and small fixes
* Small fix
* Small fix and test
* Small fixes
* Add tests for 512-64 and change back to nightly tests
* Small fix 
							
						 
						
							2023-12-04 20:50:02 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Chen, Zhentao 
								
							 
						 
						
							
							
							
							
								
							
							
								29d5bb8df4 
								
							 
						 
						
							
							
								
								Harness workflow dispatch ( #9591 )  
							
							 
							
							
							
							
							* add set-matrix job
* add workflow_dispatch
* fix context
* fix manual run
* rename step
* add quotes
* add runner option
* not required labels
* add runner label to output
* use double quote 
							
						 
						
							2023-12-04 15:53:29 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Chen, Zhentao 
								
							 
						 
						
							
							
							
							
								
							
							
								9557aa9c21 
								
							 
						 
						
							
							
								
								Fix harness nightly ( #9586 )  
							
							 
							
							
							
							
							* update golden
* loosen the restriction of diff
* only compare results when scheduled 
							
						 
						
							2023-12-04 11:45:00 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Xiangyu Tian 
								
							 
						 
						
							
							
							
							
								
							
							
								5c03651309 
								
							 
						 
						
							
							
								
								[LLM] vLLM: Add Preempt for scheduler ( #9568 )  
							
							 
							
							
							
							
							Implement Preempt_by_recompute method for vllm. 
							
						 
						
							2023-12-03 20:16:25 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Kai Huang 
								
							 
						 
						
							
							
							
							
								
							
							
								f7e596d85a 
								
							 
						 
						
							
							
								
								Update doc ( #9580 )  
							
							 
							
							
							
							
							* update aiohttp in docs
* update doc 
							
						 
						
							2023-12-01 15:40:37 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Chen, Zhentao 
								
							 
						 
						
							
							
							
							
								
							
							
								5de92090b3 
								
							 
						 
						
							
							
								
								try to fix deps installation of bigdl ( #9578 )  
							
							 
							
							
							
						 
						
							2023-12-01 15:25:47 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Chen, Zhentao 
								
							 
						 
						
							
							
							
							
								
							
							
								cb228c70ea 
								
							 
						 
						
							
							
								
								Add harness nightly ( #9552 )  
							
							 
							
							
							
							
							* modify output_path as a directory
* schedule nightly at 21 on Friday
* add tasks and models for nightly
* add accuracy regression
* comment out if to test
* mixed fp4
* for test
* add  missing delimiter
* remove comma
* fixed golden results
* add mixed 4 golden result
* add more options
* add mistral results
* get golden result of stable lm
* move nightly scripts and results to test folder
* add license
* add fp8 stable lm golden
* run on all available devices
* trigger only when ready for review
* fix new line
* update golden
* add mistral 
							
						 
						
							2023-12-01 14:16:35 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Chen, Zhentao 
								
							 
						 
						
							
							
							
							
								
							
							
								4d7d5d4c59 
								
							 
						 
						
							
							
								
								Add 3 leaderboard tasks ( #9566 )  
							
							 
							
							
							
							
							* update leaderboard map
* download model and dataset without overwriting
* fix task drop
* run on all available devices 
							
						 
						
							2023-12-01 14:01:14 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Heyang Sun 
								
							 
						 
						
							
							
							
							
								
							
							
								74fd7077a2 
								
							 
						 
						
							
							
								
								[LLM] Multi-process and distributed QLoRA on CPU platform ( #9491 )  
							
							 
							
							
							
							
							* [LLM] Multi-process and distributed QLoRA on CPU platform
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* enable llm-init and bind to socket
* refine
* Update Dockerfile
* add all files of qlora cpu example to /bigdl
* fix
* fix k8s
* Update bigdl-qlora-finetuing-entrypoint.sh
* Update bigdl-qlora-finetuing-entrypoint.sh
* Update bigdl-qlora-finetuning-job.yaml
* fix train sync and performance issues
* add node affinity
* disable user to tune cpu per pod
* Update bigdl-qlora-finetuning-job.yaml 
							
						 
						
							2023-12-01 13:47:19 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Wang, Jian4 
								
							 
						 
						
							
							
							
							
								
							
							
								ed0dc57c6e 
								
							 
						 
						
							
							
								
								LLM: Add cpu qlora support other models guide ( #9567 )  
							
							 
							
							
							
							
							* use bf16 flag
* add using baichuan model
* update merge
* remove
* update 
							
						 
						
							2023-12-01 11:18:04 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jason Dai 
								
							 
						 
						
							
							
							
							
								
							
							
								bda404fc8f 
								
							 
						 
						
							
							
								
								Update readme ( #9575 )  
							
							 
							
							
							
						 
						
							2023-11-30 22:45:52 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Xin Qiu 
								
							 
						 
						
							
							
							
							
								
							
							
								69c49d21f5 
								
							 
						 
						
							
							
								
								use fused rms norm ( #9572 )  
							
							 
							
							
							
							
							* use fused rms norm
* meet code review 
							
						 
						
							2023-11-30 21:47:41 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Lilac09 
								
							 
						 
						
							
							
							
							
								
							
							
								b785376f5c 
								
							 
						 
						
							
							
								
								Add vllm-example to docker inference image ( #9570 )  
							
							 
							
							
							
							
							* add vllm-serving to cpu image
* add vllm-serving to cpu image
* add vllm-serving 
							
						 
						
							2023-11-30 17:04:53 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
							
							
								
							
							
								66f5b45f57 
								
							 
						 
						
							
							
								
								[LLM] add a llama2 gguf example ( #9553 )  
							
							 
							
							
							
						 
						
							2023-11-30 16:37:17 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
							
							
								
							
							
								7f6465518a 
								
							 
						 
						
							
							
								
								support loading llama tokenizer from gguf model ( #9565 )  
							
							 
							
							
							
						 
						
							2023-11-30 14:56:12 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Lilac09 
								
							 
						 
						
							
							
							
							
								
							
							
								2554ba0913 
								
							 
						 
						
							
							
								
								Add usage of vllm ( #9564 )  
							
							 
							
							
							
							
							* add usage of vllm
							
						 
						
							2023-11-30 14:19:23 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Wang, Jian4 
								
							 
						 
						
							
							
							
							
								
							
							
								a0a80d232e 
								
							 
						 
						
							
							
								
								LLM: Add qlora cpu distributed readme ( #9561 )  
							
							 
							
							
							
							
							* init readme
* add distributed guide
* update 
							
						 
						
							2023-11-30 13:42:30 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Chen, Zhentao 
								
							 
						 
						
							
							
							
							
								
							
							
								c8e0c2ed48 
								
							 
						 
						
							
							
								
								Fixed dumped logs in harness ( #9549 )  
							
							 
							
							
							
							
							* install transformers==4.34.0
* modify output_path as a directory
* add device and task to output dir parents 
							
						 
						
							2023-11-30 12:47:56 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Qiyuan Gong 
								
							 
						 
						
							
							
							
							
								
							
							
								d85a430a8c 
								
							 
						 
						
							
							
								
								Using bigdl-llm-init instead of bigdl-nano-init ( #9558 )  
							
							 
							
							
							
							
							* Replace `bigdl-nano-init` with `bigdl-llm-init`.
* Install `bigdl-llm` instead of `bigdl-nano`.
* Remove nano in README. 
							
						 
						
							2023-11-30 10:10:29 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yuwen Hu 
								
							 
						 
						
							
							
							
							
								
							
							
								34503efa6a 
								
							 
						 
						
							
							
								
								Fix cpu pinned embedding ( #9556 )  
							
							 
							
							
							
						 
						
							2023-11-29 18:27:56 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Lilac09 
								
							 
						 
						
							
							
							
							
								
							
							
								557bb6bbdb 
								
							 
						 
						
							
							
								
								add judgement for running serve ( #9555 )  
							
							 
							
							
							
						 
						
							2023-11-29 16:57:00 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									binbin Deng 
								
							 
						 
						
							
							
							
							
								
							
							
								4ff2ca9d0d 
								
							 
						 
						
							
							
								
								LLM: fix loss error on Arc ( #9550 )  
							
							 
							
							
							
						 
						
							2023-11-29 15:16:18 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
							
							
								
							
							
								65121c7997 
								
							 
						 
						
							
							
								
								support loading q4_1/q5_0/q5_1/q8_0 gguf model ( #9546 )  
							
							 
							
							
							
						 
						
							2023-11-29 14:40:37 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Wang, Jian4 
								
							 
						 
						
							
							
							
							
								
							
							
								b824754256 
								
							 
						 
						
							
							
								
								LLM: Update for cpu qlora mpirun ( #9548 )  
							
							 
							
							
							
						 
						
							2023-11-29 10:56:17 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yuwen Hu 
								
							 
						 
						
							
							
							
							
								
							
							
								5f5ca38b74 
								
							 
						 
						
							
							
								
								[LLM Doc] Fix api doc rendering error ( #9542 )  
							
							 
							
							
							
							
							* Fix api rendering error
* Fix python style 
							
						 
						
							2023-11-29 09:17:09 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
							
							
								
							
							
								a86c6e0b56 
								
							 
						 
						
							
							
								
								[LLM] support loading gguf model ( #9544 )  
							
							 
							
							
							
						 
						
							2023-11-28 15:51:15 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Xin Qiu 
								
							 
						 
						
							
							
							
							
								
							
							
								32b37f3af7 
								
							 
						 
						
							
							
								
								Update gpu install.md ( #9541 )  
							
							 
							
							
							
							
							* Update install_gpu.md
* Update install_gpu.md 
							
						 
						
							2023-11-28 11:15:03 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Xiangyu Tian 
								
							 
						 
						
							
							
							
							
								
							
							
								916c338772 
								
							 
						 
						
							
							
								
								fix bugs in vllm length check ( #9543 )  
							
							 
							
							
							
						 
						
							2023-11-28 11:09:54 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									WeiguangHan 
								
							 
						 
						
							
							
							
							
								
							
							
								5098bc3544 
								
							 
						 
						
							
							
								
								LLM: enable previous models ( #9505 )  
							
							 
							
							
							
							
							* enable previous models
* test mistral model
* for test
* run models separately
* test all models
* for test
* revert the llm_performance_test.yaml 
							
						 
						
							2023-11-28 10:21:07 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Zhao Changmin 
								
							 
						 
						
							
							
							
							
								
							
							
								e7e0cd3b5e 
								
							 
						 
						
							
							
								
								CPU Pinned embedding Layer ( #9538 )  
							
							 
							
							
							
							
							* CPU Pinned embedding 
							
						 
						
							2023-11-28 09:46:31 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Guancheng Fu 
								
							 
						 
						
							
							
							
							
								
							
							
								963a5c8d79 
								
							 
						 
						
							
							
								
								Add vLLM-XPU version's README/examples ( #9536 )  
							
							 
							
							
							
							
							* test
* test
* fix last kv cache
* add xpu readme
* remove numactl for xpu example
* fix link error
* update max_num_batched_tokens logic
* add explanation
* add xpu environment version requirement
* refine gpu memory
* fix
* fix style 
							
						 
						
							2023-11-28 09:44:03 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Guancheng Fu 
								
							 
						 
						
							
							
							
							
								
							
							
								b6c3520748 
								
							 
						 
						
							
							
								
								Remove xformers from vLLM-CPU ( #9535 )  
							
							 
							
							
							
						 
						
							2023-11-27 11:21:25 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									binbin Deng 
								
							 
						 
						
							
							
							
							
								
							
							
								2b9c7d2a59 
								
							 
						 
						
							
							
								
								LLM: quick fix alpaca qlora finetuning script ( #9534 )  
							
							 
							
							
							
						 
						
							2023-11-27 11:04:27 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yuwen Hu 
								
							 
						 
						
							
							
							
							
								
							
							
								11fa3de290 
								
							 
						 
						
							
							
								
								Add setup support of win gpu for bigdl-llm ( #9512 )  
							
							 
							
							
							
						 
						
							2023-11-24 17:49:21 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Chen, Zhentao 
								
							 
						 
						
							
							
							
							
								
							
							
								45820cf3b9 
								
							 
						 
						
							
							
								
								add optimize model option ( #9530 )  
							
							 
							
							
							
						 
						
							2023-11-24 17:10:49 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									binbin Deng 
								
							 
						 
						
							
							
							
							
								
							
							
								6bec0faea5 
								
							 
						 
						
							
							
								
								LLM: support Mistral AWQ models ( #9520 )  
							
							 
							
							
							
						 
						
							2023-11-24 16:20:22 +08:00