Ruonan Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								439c834ed3 
								
							 
						 
						
							
							
								
								LLM: add mixed precision for lm_head ( #10795 )  
							
							 
							
							... 
							
							
							
							* add mixed_quantization
* meet code review
* update
* fix style
* meet review 
							
						 
						
							2024-04-18 19:11:31 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yina Chen 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								8796401b08 
								
							 
						 
						
							
							
								
								Support q4k in ipex-llm ( #10796 )  
							
							 
							
							... 
							
							
							
							* support q4k
* update 
							
						 
						
							2024-04-18 18:55:28 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								0e8aac19e3 
								
							 
						 
						
							
							
								
								add q6k precision in ipex-llm ( #10792 )  
							
							 
							
							... 
							
							
							
							* add q6k
* add initial 16k
* update
* fix style 
							
						 
						
							2024-04-18 16:52:09 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Qiyuan Gong 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								e90e31719f 
								
							 
						 
						
							
							
								
								axolotl lora example ( #10789 )  
							
							 
							
							... 
							
							
							
							* Add axolotl lora example
* Modify readme
* Add comments in yml 
							
						 
						
							2024-04-18 16:38:32 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Wang, Jian4 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								14ca42a048 
								
							 
						 
						
							
							
								
								LLM:Fix moe indexs error on cpu ( #10791 )  
							
							 
							
							
							
						 
						
							2024-04-18 15:56:52 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Guancheng Fu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								cbe7b5753f 
								
							 
						 
						
							
							
								
								Add vLLM[xpu] related code ( #10779 )  
							
							 
							
							... 
							
							
							
							* Add ipex-llm side change
* add runable offline_inference
* refactor to call vllm2
* Verified async server
* add new v2 example
* add README
* fix
* change dir
* refactor readme.md
* add experimental
* fix 
							
						 
						
							2024-04-18 15:29:20 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Kai Huang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								053ec30737 
								
							 
						 
						
							
							
								
								Transformers ppl evaluation on wikitext ( #10784 )  
							
							 
							
							... 
							
							
							
							* tranformers code
* cache 
							
						 
						
							2024-04-18 15:27:18 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Wang, Jian4 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								209c3501e6 
								
							 
						 
						
							
							
								
								LLM: Optimize qwen1.5 moe model ( #10706 )  
							
							 
							
							... 
							
							
							
							* update moe block
* fix style
* enable optmize MLP
* enabel kv_cache
* enable fuse rope
* enable fused qkv
* enable flash_attention
* error sdp quantize
* use old api
* use fuse
* use xetla
* fix python style
* update moe_blocks num
* fix output error
* add cpu sdpa
* update
* update
* update 
							
						 
						
							2024-04-18 14:54:05 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ziteng Zhang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								ff040c8f01 
								
							 
						 
						
							
							
								
								LISA Finetuning Example ( #10743 )  
							
							 
							
							... 
							
							
							
							* enabling xetla only supports qtype=SYM_INT4 or FP8E5
* LISA Finetuning Example on gpu
* update readme
* add licence
* Explain parameters of lisa & Move backend codes to src dir
* fix style
* fix style
* update readme
* support chatglm
* fix style
* fix style
* update readme
* fix 
							
						 
						
							2024-04-18 13:48:10 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Heyang Sun 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								581ebf6104 
								
							 
						 
						
							
							
								
								GaLore Finetuning Example ( #10722 )  
							
							 
							
							... 
							
							
							
							* GaLore Finetuning Example
* Update README.md
* Update README.md
* change data to HuggingFaceH4/helpful_instructions
* Update README.md
* Update README.md
* shrink train size and delete cache before starting training to save memory
* Update README.md
* Update galore_finetuning.py
* change model to llama2 3b
* Update README.md 
							
						 
						
							2024-04-18 13:47:41 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yang Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								952e517db9 
								
							 
						 
						
							
							
								
								use config rope_theta ( #10787 )  
							
							 
							
							... 
							
							
							
							* use config rope_theta
* fix style 
							
						 
						
							2024-04-17 20:39:11 -07:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Guancheng Fu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								31ea2f9a9f 
								
							 
						 
						
							
							
								
								Fix wrong output for Llama models on CPU ( #10742 )  
							
							 
							
							
							
						 
						
							2024-04-18 11:07:27 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Xin Qiu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								e764f9b1b1 
								
							 
						 
						
							
							
								
								Disable fast fused rope on UHD  ( #10780 )  
							
							 
							
							... 
							
							
							
							* use decoding fast path
* update
* update
* cleanup 
							
						 
						
							2024-04-18 10:03:53 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yina Chen 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								ea5b373a97 
								
							 
						 
						
							
							
								
								Add lookahead GPU example ( #10785 )  
							
							 
							
							... 
							
							
							
							* Add lookahead example
* fix style & attn mask
* fix typo
* address comments 
							
						 
						
							2024-04-17 17:41:55 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Wang, Jian4 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								a20271ffe4 
								
							 
						 
						
							
							
								
								LLM: Fix yi-6b fp16 error on pvc ( #10781 )  
							
							 
							
							... 
							
							
							
							* updat for yi fp16
* update
* update 
							
						 
						
							2024-04-17 16:49:59 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									ZehuaCao 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								0646e2c062 
								
							 
						 
						
							
							
								
								Fix short prompt for IPEX_CPU speculative decoding cause no_attr error ( #10783 )  
							
							 
							
							
							
						 
						
							2024-04-17 16:19:57 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Cengguang Zhang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								7ec82c6042 
								
							 
						 
						
							
							
								
								LLM: add README.md for Long-Context examples. ( #10765 )  
							
							 
							
							... 
							
							
							
							* LLM: add readme to long-context examples.
* add precision.
* update wording.
* add GPU type.
* add Long-Context example to GPU examples.
* fix comments.
* update max input length.
* update max length.
* add output length.
* fix wording. 
							
						 
						
							2024-04-17 15:34:59 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yina Chen 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								766fe45222 
								
							 
						 
						
							
							
								
								Fix spec error caused by lookup pr ( #10777 )  
							
							 
							
							... 
							
							
							
							* Fix spec error
* remove
* fix style 
							
						 
						
							2024-04-17 11:27:35 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Qiyuan Gong 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								9e5069437f 
								
							 
						 
						
							
							
								
								Fix gradio version in axolotl example ( #10776 )  
							
							 
							
							... 
							
							
							
							* Change to gradio>=4.19.2 
							
						 
						
							2024-04-17 10:23:43 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Qiyuan Gong 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								f2e923b3ca 
								
							 
						 
						
							
							
								
								Axolotl v0.4.0 support  ( #10773 )  
							
							 
							
							... 
							
							
							
							* Add Axolotl 0.4.0, remove legacy 0.3.0 support.
* replace is_torch_bf16_gpu_available
* Add HF_HUB_OFFLINE=1
* Move transformers out of requirement
* Refine readme and qlora.yml 
							
						 
						
							2024-04-17 09:49:11 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Heyang Sun 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								26cae0a39c 
								
							 
						 
						
							
							
								
								Update FLEX in Deepspeed README ( #10774 )  
							
							 
							
							... 
							
							
							
							* Update FLEX in Deepspeed README
* Update README.md 
							
						 
						
							2024-04-17 09:28:24 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Wenjing Margaret Mao 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								c41730e024 
								
							 
						 
						
							
							
								
								edit 'ppl_result does not exist' issue, delete useless code ( #10767 )  
							
							 
							
							... 
							
							
							
							* edit ppl_result not exist issue, delete useless code
* delete nonzero_min function
---------
Co-authored-by: jenniew <jenniewang123@gmail.com> 
							
						 
						
							2024-04-16 18:11:56 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yina Chen 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								899d392e2f 
								
							 
						 
						
							
							
								
								Support prompt lookup in ipex-llm ( #10768 )  
							
							 
							
							... 
							
							
							
							* lookup init
* add lookup
* fix style
* remove redundant code
* change param name
* fix style 
							
						 
						
							2024-04-16 16:52:38 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Qiyuan Gong 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								d30b22a81b 
								
							 
						 
						
							
							
								
								Refine axolotl 0.3.0 documents and links ( #10764 )  
							
							 
							
							... 
							
							
							
							* Refine axolotl 0.3 based on comments
* Rename requirements to requirement-xpu
* Add comments for paged_adamw_32bit
* change lora_r from 8 to 16 
							
						 
						
							2024-04-16 14:47:45 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									ZehuaCao 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								599a88db53 
								
							 
						 
						
							
							
								
								Add deepsped-autoTP-Fastapi serving ( #10748 )  
							
							 
							
							... 
							
							
							
							* add deepsped-autoTP-Fastapi serving
* add readme
* add license
* update
* update
* fix 
							
						 
						
							2024-04-16 14:03:23 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									binbin Deng 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								0a62933d36 
								
							 
						 
						
							
							
								
								LLM: fix qwen AutoTP ( #10766 )  
							
							 
							
							
							
						 
						
							2024-04-16 09:56:17 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Cengguang Zhang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								3e2662c87e 
								
							 
						 
						
							
							
								
								LLM: fix get env KV_CACHE_ALLOC_BLOCK_LENGTH type. ( #10771 )  
							
							 
							
							
							
						 
						
							2024-04-16 09:32:30 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jin Qiao 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								73a67804a4 
								
							 
						 
						
							
							
								
								GPU configuration update for examples (windows pip installer, etc.) ( #10762 )  
							
							 
							
							... 
							
							
							
							* renew chatglm3-6b gpu example readme
fix
fix
fix
* fix for comments
* fix
* fix
* fix
* fix
* fix
* apply on HF-Transformers-AutoModels
* apply on PyTorch-Models
* fix
* fix 
							
						 
						
							2024-04-15 17:42:52 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									yb-peng 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								b5209d3ec1 
								
							 
						 
						
							
							
								
								Update example/GPU/PyTorch-Models/Model/llava/README.md ( #10757 )  
							
							 
							
							... 
							
							
							
							* Update example/GPU/PyTorch-Models/Model/llava/README.md
* Update README.md
fix path in windows installation 
							
						 
						
							2024-04-15 13:01:37 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									binbin Deng 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								3d561b60ac 
								
							 
						 
						
							
							
								
								LLM: add enable_xetla parameter for optimize_model API ( #10753 )  
							
							 
							
							
							
						 
						
							2024-04-15 12:18:25 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jiao Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								a9a6b6b7af 
								
							 
						 
						
							
							
								
								Fix baichuan-13b issue on portable zip under transformers 4.36 ( #10746 )  
							
							 
							
							... 
							
							
							
							* fix baichuan-13b issue
* update
* update 
							
						 
						
							2024-04-12 16:27:01 -07:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jiao Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								9e668a5bf0 
								
							 
						 
						
							
							
								
								fix_internlm-chat-7b-8k repo name in examples ( #10747 )  
							
							 
							
							
							
						 
						
							2024-04-12 10:15:48 -07:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									binbin Deng 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								c3fc8f4b90 
								
							 
						 
						
							
							
								
								LLM: add bs limitation for llama softmax upcast to fp32 ( #10752 )  
							
							 
							
							
							
						 
						
							2024-04-12 15:40:25 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									hxsz1997 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								0d518aab8d 
								
							 
						 
						
							
							
								
								Merge pull request  #10697  from MargarettMao/ceval  
							
							 
							
							... 
							
							
							
							combine english and chinese, remove nan 
							
						 
						
							2024-04-12 14:37:47 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									jenniew 
								
							 
						 
						
							
							
							
							
								
							
							
								dd0d2df5af 
								
							 
						 
						
							
							
								
								Change fp16.csv mistral-7b-v0.1 into Mistral-7B-v0.1  
							
							 
							
							
							
						 
						
							2024-04-12 14:28:46 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									jenniew 
								
							 
						 
						
							
							
							
							
								
							
							
								7309f1ddf9 
								
							 
						 
						
							
							
								
								Mofidy Typos  
							
							 
							
							
							
						 
						
							2024-04-12 14:23:13 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									jenniew 
								
							 
						 
						
							
							
							
							
								
							
							
								cb594e1fc5 
								
							 
						 
						
							
							
								
								Mofidy Typos  
							
							 
							
							
							
						 
						
							2024-04-12 14:22:09 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									jenniew 
								
							 
						 
						
							
							
							
							
								
							
							
								382c18e600 
								
							 
						 
						
							
							
								
								Mofidy Typos  
							
							 
							
							
							
						 
						
							2024-04-12 14:15:48 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									jenniew 
								
							 
						 
						
							
							
							
							
								
							
							
								1a360823ce 
								
							 
						 
						
							
							
								
								Mofidy Typos  
							
							 
							
							
							
						 
						
							2024-04-12 14:13:21 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									jenniew 
								
							 
						 
						
							
							
							
							
								
							
							
								cdbb1de972 
								
							 
						 
						
							
							
								
								Mark Color Modification  
							
							 
							
							
							
						 
						
							2024-04-12 14:00:50 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									jenniew 
								
							 
						 
						
							
							
							
							
								
							
							
								9bbfcaf736 
								
							 
						 
						
							
							
								
								Mark Color Modification  
							
							 
							
							
							
						 
						
							2024-04-12 13:30:16 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									jenniew 
								
							 
						 
						
							
							
							
							
								
							
							
								bb34c6e325 
								
							 
						 
						
							
							
								
								Mark Color Modification  
							
							 
							
							
							
						 
						
							2024-04-12 13:26:36 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								8086554d33 
								
							 
						 
						
							
							
								
								use new fp16 sdp in llama and mistral ( #10734 )  
							
							 
							
							
							
						 
						
							2024-04-12 10:49:02 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yang Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								019293e1b9 
								
							 
						 
						
							
							
								
								Fuse MOE indexes computation ( #10716 )  
							
							 
							
							... 
							
							
							
							* try moe
* use c++ cpu to compute indexes
* fix style 
							
						 
						
							2024-04-11 10:12:55 -07:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									jenniew 
								
							 
						 
						
							
							
							
							
								
							
							
								b151a9b672 
								
							 
						 
						
							
							
								
								edit csv_to_html to combine en & zh  
							
							 
							
							
							
						 
						
							2024-04-11 17:35:36 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									binbin Deng 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								70ed9397f9 
								
							 
						 
						
							
							
								
								LLM: fix AttributeError of FP16Linear ( #10740 )  
							
							 
							
							
							
						 
						
							2024-04-11 17:03:56 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Keyan (Kyrie) Zhang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								1256a2cc4e 
								
							 
						 
						
							
							
								
								Add chatglm3 long input example ( #10739 )  
							
							 
							
							... 
							
							
							
							* Add long context input example for chatglm3
* Small fix
* Small fix
* Small fix 
							
						 
						
							2024-04-11 16:33:43 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									hxsz1997 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								fd473ddb1b 
								
							 
						 
						
							
							
								
								Merge pull request  #10730  from MargarettMao/MargarettMao-parent_folder  
							
							 
							
							... 
							
							
							
							Edit ppl update_HTML_parent_folder 
							
						 
						
							2024-04-11 15:45:24 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Qiyuan Gong 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								2d64630757 
								
							 
						 
						
							
							
								
								Remove transformers version in axolotl example ( #10736 )  
							
							 
							
							... 
							
							
							
							* Remove transformers version in axolotl requirements.txt 
							
						 
						
							2024-04-11 14:02:31 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									yb-peng 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								2685c41318 
								
							 
						 
						
							
							
								
								Modify all-in-one benchmark ( #10726 )  
							
							 
							
							... 
							
							
							
							* Update 8192 prompt in all-in-one
* Add cpu_embedding param for linux api
* Update run.py
* Update README.md 
							
						 
						
							2024-04-11 13:38:50 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Xiangyu Tian 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								301504aa8d 
								
							 
						 
						
							
							
								
								Fix transformers version warning ( #10732 )  
							
							 
							
							
							
						 
						
							2024-04-11 13:12:49 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Wenjing Margaret Mao 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								9bec233e4d 
								
							 
						 
						
							
							
								
								Delete python/llm/test/benchmark/perplexity/update_html_in_parent_folder.py  
							
							 
							
							... 
							
							
							
							Delete due to repetition 
							
						 
						
							2024-04-11 07:21:12 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Cengguang Zhang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								4b024b7aac 
								
							 
						 
						
							
							
								
								LLM: optimize chatglm2 8k input. ( #10723 )  
							
							 
							
							... 
							
							
							
							* LLM: optimize chatglm2 8k input.
* rename. 
							
						 
						
							2024-04-10 16:59:06 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yuxuan Xia 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								cd22cb8257 
								
							 
						 
						
							
							
								
								Update Env check Script ( #10709 )  
							
							 
							
							... 
							
							
							
							* Update env check bash file
* Update env-check 
							
						 
						
							2024-04-10 15:06:00 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Shaojun Liu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								29bf28bd6f 
								
							 
						 
						
							
							
								
								Upgrade python to 3.11 in Docker Image ( #10718 )  
							
							 
							
							... 
							
							
							
							* install python 3.11 for cpu-inference docker image
* update xpu-inference dockerfile
* update cpu-serving image
* update qlora image
* update lora image
* update document 
							
						 
						
							2024-04-10 14:41:27 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Qiyuan Gong 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								b727767f00 
								
							 
						 
						
							
							
								
								Add axolotl v0.3.0 with ipex-llm on Intel GPU ( #10717 )  
							
							 
							
							... 
							
							
							
							* Add axolotl v0.3.0 support on Intel GPU.
* Add finetune example on llama-2-7B with Alpaca dataset. 
							
						 
						
							2024-04-10 14:38:29 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Wang, Jian4 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								c9e6d42ad1 
								
							 
						 
						
							
							
								
								LLM: Fix chatglm3-6b-32k error ( #10719 )  
							
							 
							
							... 
							
							
							
							* fix chatglm3-6b-32k
* update style 
							
						 
						
							2024-04-10 11:24:06 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Keyan (Kyrie) Zhang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								585c174e92 
								
							 
						 
						
							
							
								
								Read the value of KV_CACHE_ALLOC_BLOCK_LENGTH from the environment variables ( #10707 )  
							
							 
							
							... 
							
							
							
							* Read the value of KV_CACHE_ALLOC_BLOCK_LENGTH from the environment variables.
* Fix style 
							
						 
						
							2024-04-10 10:48:46 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jiao Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								d1eaea509f 
								
							 
						 
						
							
							
								
								update chatglm readme ( #10659 )  
							
							 
							
							
							
						 
						
							2024-04-09 14:24:46 -07:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jiao Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								878a97077b 
								
							 
						 
						
							
							
								
								Fix llava example to support transformerds 4.36 ( #10614 )  
							
							 
							
							... 
							
							
							
							* fix llava example
* update 
							
						 
						
							2024-04-09 13:47:07 -07:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jiao Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								1e817926ba 
								
							 
						 
						
							
							
								
								Fix low memory generation example issue in transformers 4.36 ( #10702 )  
							
							 
							
							... 
							
							
							
							* update cache in low memory generate
* update 
							
						 
						
							2024-04-09 09:56:52 -07:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yuwen Hu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								97db2492c8 
								
							 
						 
						
							
							
								
								Update setup.py for bigdl-core-xe-esimd-21 on Windows ( #10705 )  
							
							 
							
							... 
							
							
							
							* Support bigdl-core-xe-esimd-21 for windows in setup.py
* Update setup-llm-env accordingly 
							
						 
						
							2024-04-09 18:21:21 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Zhicun 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								b4147a97bb 
								
							 
						 
						
							
							
								
								Fix dtype mismatch error ( #10609 )  
							
							 
							
							... 
							
							
							
							* fix llama
* fix
* fix code style
* add torch type in model.py
---------
Co-authored-by: arda <arda@arda-arc19.sh.intel.com> 
							
						 
						
							2024-04-09 17:50:33 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Shaojun Liu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								f37a1f2a81 
								
							 
						 
						
							
							
								
								Upgrade to python 3.11 ( #10711 )  
							
							 
							
							... 
							
							
							
							* create conda env with python 3.11
* recommend to use Python 3.11
* update 
							
						 
						
							2024-04-09 17:41:17 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								8f45e22072 
								
							 
						 
						
							
							
								
								fix llama2 ( #10710 )  
							
							 
							
							
							
						 
						
							2024-04-09 17:28:37 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								e438f941f2 
								
							 
						 
						
							
							
								
								disable rwkv5 fp16 ( #10699 )  
							
							 
							
							
							
						 
						
							2024-04-09 16:42:11 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Cengguang Zhang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								6a32216269 
								
							 
						 
						
							
							
								
								LLM: add llama2 8k input example. ( #10696 )  
							
							 
							
							... 
							
							
							
							* LLM: add llama2-32K example.
* refactor name.
* fix comments.
* add IPEX_LLM_LOW_MEM notes and update sample output. 
							
						 
						
							2024-04-09 16:02:37 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Wenjing Margaret Mao 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								289cc99cd6 
								
							 
						 
						
							
							
								
								Update README.md ( #10700 )  
							
							 
							
							... 
							
							
							
							Edit "summarize the results" 
							
						 
						
							2024-04-09 16:01:12 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Wenjing Margaret Mao 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								d3116de0db 
								
							 
						 
						
							
							
								
								Update README.md ( #10701 )  
							
							 
							
							... 
							
							
							
							edit "summarize the results" 
							
						 
						
							2024-04-09 15:50:25 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Chen, Zhentao 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								d59e0cce5c 
								
							 
						 
						
							
							
								
								Migrate harness to ipexllm ( #10703 )  
							
							 
							
							... 
							
							
							
							* migrate to ipexlm
* fix workflow
* fix run_multi
* fix precision map
* rename ipexlm to ipexllm
* rename bigdl to ipex  in comments 
							
						 
						
							2024-04-09 15:48:53 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Keyan (Kyrie) Zhang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								1e27e08322 
								
							 
						 
						
							
							
								
								Modify example from fp32 to fp16 ( #10528 )  
							
							 
							
							... 
							
							
							
							* Modify example from fp32 to fp16
* Remove Falcon from fp16 example for now
* Remove MPT from fp16 example 
							
						 
						
							2024-04-09 15:45:49 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									binbin Deng 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								44922bb5c2 
								
							 
						 
						
							
							
								
								LLM: support baichuan2-13b using AutoTP ( #10691 )  
							
							 
							
							
							
						 
						
							2024-04-09 14:06:01 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yina Chen 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								c7422712fc 
								
							 
						 
						
							
							
								
								mistral 4.36 use fp16 sdp ( #10704 )  
							
							 
							
							
							
						 
						
							2024-04-09 13:50:33 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ovo233 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								dcb2038aad 
								
							 
						 
						
							
							
								
								Enable optimization for sentence_transformers ( #10679 )  
							
							 
							
							... 
							
							
							
							* enable optimization for sentence_transformers
* fix python style check failure 
							
						 
						
							2024-04-09 12:33:46 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yang Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								5a1f446d3c 
								
							 
						 
						
							
							
								
								support fp8 in xetla ( #10555 )  
							
							 
							
							... 
							
							
							
							* support fp8 in xetla
* change name
* adjust model file
* support convert back to cpu
* factor
* fix bug
* fix style 
							
						 
						
							2024-04-08 13:22:09 -07:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									jenniew 
								
							 
						 
						
							
							
							
							
								
							
							
								591bae092c 
								
							 
						 
						
							
							
								
								combine english and chinese, remove nan  
							
							 
							
							
							
						 
						
							2024-04-08 19:37:51 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Cengguang Zhang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								7c43ac0164 
								
							 
						 
						
							
							
								
								LLM: optimize llama natvie sdp for split qkv tensor ( #10693 )  
							
							 
							
							... 
							
							
							
							* LLM: optimize llama natvie sdp for split qkv tensor.
* fix block real size.
* fix comment.
* fix style.
* refactor. 
							
						 
						
							2024-04-08 17:48:11 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Xin Qiu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								1274cba79b 
								
							 
						 
						
							
							
								
								stablelm fp8 kv cache ( #10672 )  
							
							 
							
							... 
							
							
							
							* stablelm fp8 kvcache
* update
* fix
* change to fp8 matmul
* fix style
* fix
* fix
* meet code review
* add comment 
							
						 
						
							2024-04-08 15:16:46 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								65127622aa 
								
							 
						 
						
							
							
								
								fix UT threshold ( #10689 )  
							
							 
							
							
							
						 
						
							2024-04-08 14:58:20 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Cengguang Zhang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								c0cd238e40 
								
							 
						 
						
							
							
								
								LLM: support llama2 8k input with w4a16. ( #10677 )  
							
							 
							
							... 
							
							
							
							* LLM: support llama2 8k input with w4a16.
* fix comment and style.
* fix style.
* fix comments and split tensor to quantized attention forward.
* fix style.
* refactor name.
* fix style.
* fix style.
* fix style.
* refactor checker name.
* refactor native sdp split qkv tensor name.
* fix style.
* fix comment rename variables.
* fix co-exist of intermedia results. 
							
						 
						
							2024-04-08 11:43:15 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Zhicun 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								321bc69307 
								
							 
						 
						
							
							
								
								Fix llamaindex ut ( #10673 )  
							
							 
							
							... 
							
							
							
							* fix llamaindex ut
* add GPU ut 
							
						 
						
							2024-04-08 09:47:51 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									yb-peng 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								2d88bb9b4b 
								
							 
						 
						
							
							
								
								add test api transformer_int4_fp16_gpu ( #10627 )  
							
							 
							
							... 
							
							
							
							* add test api transformer_int4_fp16_gpu
* update config.yaml and README.md in all-in-one
* modify run.py in all-in-one
* re-order test-api
* re-order test-api in config
* modify README.md in all-in-one
* modify README.md in all-in-one
* modify config.yaml
---------
Co-authored-by: pengyb2001 <arda@arda-arc21.sh.intel.com>
Co-authored-by: ivy-lv11 <zhicunlv@gmail.com> 
							
						 
						
							2024-04-07 15:47:17 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Wang, Jian4 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								47cabe8fcc 
								
							 
						 
						
							
							
								
								LLM: Fix no return_last_logit running bigdl_ipex chatglm3 ( #10678 )  
							
							 
							
							... 
							
							
							
							* fix no return_last_logits
* update only for chatglm 
							
						 
						
							2024-04-07 15:27:58 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Wang, Jian4 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								9ad4b29697 
								
							 
						 
						
							
							
								
								LLM: CPU benchmark using tcmalloc ( #10675 )  
							
							 
							
							
							
						 
						
							2024-04-07 14:17:01 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									binbin Deng 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								d9a1153b4e 
								
							 
						 
						
							
							
								
								LLM: upgrade deepspeed in AutoTP on GPU ( #10647 )  
							
							 
							
							
							
						 
						
							2024-04-07 14:05:19 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jin Qiao 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								56dfcb2ade 
								
							 
						 
						
							
							
								
								Migrate portable zip to ipex-llm ( #10617 )  
							
							 
							
							... 
							
							
							
							* change portable zip prompt to ipex-llm
* fix chat with ui
* add no proxy 
							
						 
						
							2024-04-07 13:58:58 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Zhicun 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								9d8ba64c0d 
								
							 
						 
						
							
							
								
								Llamaindex: add tokenizer_id and support chat ( #10590 )  
							
							 
							
							... 
							
							
							
							* add tokenizer_id
* fix
* modify
* add from_model_id and from_mode_id_low_bit
* fix typo and add comment
* fix python code style
---------
Co-authored-by: pengyb2001 <284261055@qq.com> 
							
						 
						
							2024-04-07 13:51:34 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jin Qiao 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								10ee786920 
								
							 
						 
						
							
							
								
								Replace with IPEX-LLM in example comments ( #10671 )  
							
							 
							
							... 
							
							
							
							* Replace with IPEX-LLM in example comments
* More replacement
* revert some changes 
							
						 
						
							2024-04-07 13:29:51 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Xiangyu Tian 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								08018a18df 
								
							 
						 
						
							
							
								
								Remove not-imported MistralConfig ( #10670 )  
							
							 
							
							
							
						 
						
							2024-04-07 10:32:05 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Cengguang Zhang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								1a9b8204a4 
								
							 
						 
						
							
							
								
								LLM: support int4 fp16 chatglm2-6b 8k input. ( #10648 )  
							
							 
							
							
							
						 
						
							2024-04-07 09:39:21 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jiao Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								69bdbf5806 
								
							 
						 
						
							
							
								
								Fix vllm print error message issue ( #10664 )  
							
							 
							
							... 
							
							
							
							* update chatglm readme
* Add condition to invalidInputError
* update
* update
* style 
							
						 
						
							2024-04-05 15:08:13 -07:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jason Dai 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								29d97e4678 
								
							 
						 
						
							
							
								
								Update readme ( #10665 )  
							
							 
							
							
							
						 
						
							2024-04-05 18:01:57 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Xin Qiu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								4c3e493b2d 
								
							 
						 
						
							
							
								
								fix stablelm2 1.6b ( #10656 )  
							
							 
							
							... 
							
							
							
							* fix stablelm2 1.6b
* meet code review 
							
						 
						
							2024-04-03 22:15:32 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jin Qiao 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								cc8b3be11c 
								
							 
						 
						
							
							
								
								Add GPU and CPU example for stablelm-zephyr-3b ( #10643 )  
							
							 
							
							... 
							
							
							
							* Add example for StableLM
* fix
* add to readme 
							
						 
						
							2024-04-03 16:28:31 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Heyang Sun 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								6000241b10 
								
							 
						 
						
							
							
								
								Add Deepspeed Example of FLEX Mistral ( #10640 )  
							
							 
							
							
							
						 
						
							2024-04-03 16:04:17 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Shaojun Liu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								d18dbfb097 
								
							 
						 
						
							
							
								
								update spr perf test ( #10644 )  
							
							 
							
							
							
						 
						
							2024-04-03 15:53:55 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								702e686901 
								
							 
						 
						
							
							
								
								optimize starcoder normal kv cache ( #10642 )  
							
							 
							
							
							
						 
						
							2024-04-03 15:27:02 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Xin Qiu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								3a9ab8f1ae 
								
							 
						 
						
							
							
								
								fix stablelm logits diff ( #10636 )  
							
							 
							
							... 
							
							
							
							* fix logits diff
* Small fixes
---------
Co-authored-by: Yuwen Hu <yuwen.hu@intel.com> 
							
						 
						
							2024-04-03 15:08:12 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Zhicun 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								b827f534d5 
								
							 
						 
						
							
							
								
								Add tokenizer_id in Langchain ( #10588 )  
							
							 
							
							... 
							
							
							
							* fix low-bit
* fix
* fix style
---------
Co-authored-by: arda <arda@arda-arc12.sh.intel.com> 
							
						 
						
							2024-04-03 14:25:35 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Zhicun 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								f6fef09933 
								
							 
						 
						
							
							
								
								fix prompt format for llama-2 in langchain ( #10637 )  
							
							 
							
							
							
						 
						
							2024-04-03 14:17:34 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jiao Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								330d4b4f4b 
								
							 
						 
						
							
							
								
								update readme ( #10631 )  
							
							 
							
							
							
						 
						
							2024-04-02 23:08:02 -07:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Kai Huang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								c875b3c858 
								
							 
						 
						
							
							
								
								Add seq len check for llama softmax upcast to fp32 ( #10629 )  
							
							 
							
							
							
						 
						
							2024-04-03 12:05:13 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jiao Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								4431134ec5 
								
							 
						 
						
							
							
								
								update readme ( #10632 )  
							
							 
							
							
							
						 
						
							2024-04-02 19:54:30 -07:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jiao Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								23e33a0ca1 
								
							 
						 
						
							
							
								
								Fix qwen-vl style ( #10633 )  
							
							 
							
							... 
							
							
							
							* update
* update 
							
						 
						
							2024-04-02 18:41:38 -07:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									binbin Deng 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								2bbd8a1548 
								
							 
						 
						
							
							
								
								LLM: fix llama2 FP16 & bs>1 & autotp on PVC and ARC ( #10611 )  
							
							 
							
							
							
						 
						
							2024-04-03 09:28:04 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jiao Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								654dc5ba57 
								
							 
						 
						
							
							
								
								Fix Qwen-VL example problem ( #10582 )  
							
							 
							
							... 
							
							
							
							* update
* update
* update
* update 
							
						 
						
							2024-04-02 12:17:30 -07:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yuwen Hu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								fd384ddfb8 
								
							 
						 
						
							
							
								
								Optimize StableLM ( #10619 )  
							
							 
							
							... 
							
							
							
							* Initial commit for stablelm optimizations
* Small style fix
* add dependency
* Add mlp optimizations
* Small fix
* add attention forward
* Remove quantize kv for now as head_dim=80
* Add merged qkv
* fix lisence
* Python style fix
---------
Co-authored-by: qiuxin2012 <qiuxin2012cs@gmail.com> 
							
						 
						
							2024-04-02 18:58:38 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									binbin Deng 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								27be448920 
								
							 
						 
						
							
							
								
								LLM: add cpu_embedding and peak memory record for deepspeed autotp script ( #10621 )  
							
							 
							
							
							
						 
						
							2024-04-02 17:32:50 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								ba8cc6bd68 
								
							 
						 
						
							
							
								
								optimize starcoder2-3b ( #10625 )  
							
							 
							
							
							
						 
						
							2024-04-02 17:16:29 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Shaojun Liu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								a10f5a1b8d 
								
							 
						 
						
							
							
								
								add python style check ( #10620 )  
							
							 
							
							... 
							
							
							
							* add python style check
* fix style checks
* update runner
* add ipex-llm-finetune-qlora-cpu-k8s to manually_build workflow
* update tag to 2.1.0-SNAPSHOT 
							
						 
						
							2024-04-02 16:17:56 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Cengguang Zhang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								58b57177e3 
								
							 
						 
						
							
							
								
								LLM: support bigdl quantize kv cache env and add warning. ( #10623 )  
							
							 
							
							... 
							
							
							
							* LLM: support bigdl quantize kv cache env and add warnning.
* fix style.
* fix comments. 
							
						 
						
							2024-04-02 15:41:08 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Kai Huang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								0a95c556a1 
								
							 
						 
						
							
							
								
								Fix starcoder first token perf ( #10612 )  
							
							 
							
							... 
							
							
							
							* add bias check
* update 
							
						 
						
							2024-04-02 09:21:38 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Cengguang Zhang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								e567956121 
								
							 
						 
						
							
							
								
								LLM: add memory optimization for llama. ( #10592 )  
							
							 
							
							... 
							
							
							
							* add initial memory optimization.
* fix logic.
* fix logic,
* remove env var check in mlp split. 
							
						 
						
							2024-04-02 09:07:50 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Keyan (Kyrie) Zhang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								01f491757a 
								
							 
						 
						
							
							
								
								Modify the link in Langchain-upstream ut ( #10608 )  
							
							 
							
							... 
							
							
							
							* Modify the link in Langchain-upstream ut
* fix langchain-upstream ut 
							
						 
						
							2024-04-01 17:03:40 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								bfc1caa5e5 
								
							 
						 
						
							
							
								
								LLM: support iq1s for llama2-70b-hf ( #10596 )  
							
							 
							
							
							
						 
						
							2024-04-01 13:13:13 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								d6af4877dd 
								
							 
						 
						
							
							
								
								LLM: remove ipex.optimize for gpt-j ( #10606 )  
							
							 
							
							... 
							
							
							
							* remove ipex.optimize
* fix
* fix 
							
						 
						
							2024-04-01 12:21:49 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								437a349dd6 
								
							 
						 
						
							
							
								
								fix rwkv with pip installer ( #10591 )  
							
							 
							
							
							
						 
						
							2024-03-29 17:56:45 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									WeiguangHan 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								9a83f21b86 
								
							 
						 
						
							
							
								
								LLM: check user env ( #10580 )  
							
							 
							
							... 
							
							
							
							* LLM: check user env
* small fix
* small fix
* small fix 
							
						 
						
							2024-03-29 17:19:34 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Keyan (Kyrie) Zhang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								848fa04dd6 
								
							 
						 
						
							
							
								
								Fix typo in Baichuan2 example ( #10589 )  
							
							 
							
							
							
						 
						
							2024-03-29 13:31:47 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								0136fad1d4 
								
							 
						 
						
							
							
								
								LLM: support iq1_s ( #10564 )  
							
							 
							
							... 
							
							
							
							* init version
* update utils
* remove unsed code 
							
						 
						
							2024-03-29 09:43:55 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Qiyuan Gong 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								f4537798c1 
								
							 
						 
						
							
							
								
								Enable kv cache quantization by default for flex when 1 < batch <= 8 ( #10584 )  
							
							 
							
							... 
							
							
							
							* Enable kv cache quantization by default for flex when 1 < batch <= 8.
* Change up bound from <8 to <=8. 
							
						 
						
							2024-03-29 09:43:42 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Cengguang Zhang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								b44f7adbad 
								
							 
						 
						
							
							
								
								LLM: Disable esimd sdp for PVC GPU when batch size>1 ( #10579 )  
							
							 
							
							... 
							
							
							
							* llm: disable esimd sdp for pvc bz>1.
* fix logic.
* fix: avoid call get device name twice. 
							
						 
						
							2024-03-28 22:55:48 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Xin Qiu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								5963239b46 
								
							 
						 
						
							
							
								
								Fix qwen's position_ids no enough ( #10572 )  
							
							 
							
							... 
							
							
							
							* fix position_ids
* fix position_ids 
							
						 
						
							2024-03-28 17:05:49 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									ZehuaCao 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								52a2135d83 
								
							 
						 
						
							
							
								
								Replace ipex with ipex-llm ( #10554 )  
							
							 
							
							... 
							
							
							
							* fix ipex with ipex_llm
* fix ipex with ipex_llm
* update
* update
* update
* update
* update
* update
* update
* update 
							
						 
						
							2024-03-28 13:54:40 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Cheen Hau, 俊豪 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								1c5eb14128 
								
							 
						 
						
							
							
								
								Update pip install to use --extra-index-url for ipex package ( #10557 )  
							
							 
							
							... 
							
							
							
							* Change to 'pip install .. --extra-index-url' for readthedocs
* Change to 'pip install .. --extra-index-url' for examples
* Change to 'pip install .. --extra-index-url' for remaining files
* Fix URL for ipex
* Add links for ipex US and CN servers
* Update ipex cpu url
* remove readme
* Update for github actions
* Update for dockerfiles 
							
						 
						
							2024-03-28 09:56:23 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									binbin Deng 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								92dfed77be 
								
							 
						 
						
							
							
								
								LLM: fix abnormal output of fp16 deepspeed autotp ( #10558 )  
							
							 
							
							
							
						 
						
							2024-03-28 09:35:48 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jason Dai 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								c450c85489 
								
							 
						 
						
							
							
								
								Delete llm/readme.md ( #10569 )  
							
							 
							
							
							
						 
						
							2024-03-27 20:06:40 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Xiangyu Tian 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								51d34ca68e 
								
							 
						 
						
							
							
								
								Fix wrong import in speculative ( #10562 )  
							
							 
							
							
							
						 
						
							2024-03-27 18:21:07 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Cheen Hau, 俊豪 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								f239bc329b 
								
							 
						 
						
							
							
								
								Specify oneAPI minor version in documentation ( #10561 )  
							
							 
							
							
							
						 
						
							2024-03-27 17:58:57 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									WeiguangHan 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								fbeb10c796 
								
							 
						 
						
							
							
								
								LLM: Set different env based on different Linux kernels ( #10566 )  
							
							 
							
							
							
						 
						
							2024-03-27 17:56:33 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									hxsz1997 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								d86477f14d 
								
							 
						 
						
							
							
								
								Remove native_int4 in LangChain examples ( #10510 )  
							
							 
							
							... 
							
							
							
							* rebase the modify to ipex-llm
* modify the typo 
							
						 
						
							2024-03-27 17:48:16 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Guancheng Fu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								04baac5a2e 
								
							 
						 
						
							
							
								
								Fix fastchat top_k ( #10560 )  
							
							 
							
							... 
							
							
							
							* fix -1 top_k
* fix
* done 
							
						 
						
							2024-03-27 16:01:58 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									binbin Deng 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								fc8c7904f0 
								
							 
						 
						
							
							
								
								LLM: fix torch_dtype setting of apply fp16 optimization through optimize_model ( #10556 )  
							
							 
							
							
							
						 
						
							2024-03-27 14:18:45 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								ea4bc450c4 
								
							 
						 
						
							
							
								
								LLM: add esimd sdp for pvc ( #10543 )  
							
							 
							
							... 
							
							
							
							* add esimd sdp for pvc
* update
* fix
* fix batch 
							
						 
						
							2024-03-26 19:04:40 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jin Qiao 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								b78289a595 
								
							 
						 
						
							
							
								
								Remove ipex-llm dependency in readme ( #10544 )  
							
							 
							
							
							
						 
						
							2024-03-26 18:25:14 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Xiangyu Tian 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								11550d3f25 
								
							 
						 
						
							
							
								
								LLM: Add length check for IPEX-CPU speculative decoding  ( #10529 )  
							
							 
							
							... 
							
							
							
							Add length check for IPEX-CPU speculative decoding. 
							
						 
						
							2024-03-26 17:47:10 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Guancheng Fu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								a3b007f3b1 
								
							 
						 
						
							
							
								
								[Serving] Fix fastchat breaks ( #10548 )  
							
							 
							
							... 
							
							
							
							* fix fastchat
* fix doc 
							
						 
						
							2024-03-26 17:03:52 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								69a28d6b4c 
								
							 
						 
						
							
							
								
								fix chatglm ( #10540 )  
							
							 
							
							
							
						 
						
							2024-03-26 16:01:00 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Shaojun Liu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								c563b41491 
								
							 
						 
						
							
							
								
								add nightly_build workflow ( #10533 )  
							
							 
							
							... 
							
							
							
							* add nightly_build workflow
* add create-job-status-badge action
* update
* update
* update
* update setup.py
* release
* revert 
							
						 
						
							2024-03-26 12:47:38 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									binbin Deng 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								0a3e4e788f 
								
							 
						 
						
							
							
								
								LLM: fix mistral hidden_size setting for deepspeed autotp ( #10527 )  
							
							 
							
							
							
						 
						
							2024-03-26 10:55:44 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Xin Qiu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								1dd40b429c 
								
							 
						 
						
							
							
								
								enable fp4 fused mlp and qkv ( #10531 )  
							
							 
							
							... 
							
							
							
							* enable fp4 fused mlp and qkv
* update qwen
* update qwen2 
							
						 
						
							2024-03-26 08:34:00 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Wang, Jian4 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								16b2ef49c6 
								
							 
						 
						
							
							
								
								Update_document by heyang ( #30 )  
							
							 
							
							
							
						 
						
							2024-03-25 10:06:02 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Wang, Jian4 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								a1048ca7f6 
								
							 
						 
						
							
							
								
								Update setup.py and add new actions and add compatible mode ( #25 )  
							
							 
							
							... 
							
							
							
							* update setup.py
* add new action
* add compatible mode 
							
						 
						
							2024-03-22 15:44:59 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Wang, Jian4 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								9df70d95eb 
								
							 
						 
						
							
							
								
								Refactor bigdl.llm to  ipex_llm ( #24 )  
							
							 
							
							... 
							
							
							
							* Rename bigdl/llm to ipex_llm
* rm python/llm/src/bigdl
* from bigdl.llm to from ipex_llm 
							
						 
						
							2024-03-22 15:41:21 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jin Qiao 
								
							 
						 
						
							
							
							
							
								
							
							
								cc5806f4bc 
								
							 
						 
						
							
							
								
								LLM: add save/load example for hf-transformers ( #10432 )  
							
							 
							
							
							
						 
						
							2024-03-22 13:57:47 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Wang, Jian4 
								
							 
						 
						
							
							
							
							
								
							
							
								34d0a9328c 
								
							 
						 
						
							
							
								
								LLM: Speed-up mixtral in pipeline parallel inference ( #10472 )  
							
							 
							
							... 
							
							
							
							* speed-up mixtral
* fix style 
							
						 
						
							2024-03-22 11:06:28 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Cengguang Zhang 
								
							 
						 
						
							
							
							
							
								
							
							
								b9d4280892 
								
							 
						 
						
							
							
								
								LLM: fix baichuan7b quantize kv abnormal output. ( #10504 )  
							
							 
							
							... 
							
							
							
							* fix abnormal output.
* fix style.
* fix style. 
							
						 
						
							2024-03-22 10:00:08 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
							
							
								
							
							
								f0f317b6cf 
								
							 
						 
						
							
							
								
								fix a typo in yuan ( #10503 )  
							
							 
							
							
							
						 
						
							2024-03-22 09:40:04 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Guancheng Fu 
								
							 
						 
						
							
							
							
							
								
							
							
								3a3756b51d 
								
							 
						 
						
							
							
								
								Add FastChat bigdl_worker ( #10493 )  
							
							 
							
							... 
							
							
							
							* done
* fix format
* add licence
* done
* fix doc
* refactor folder
* add license 
							
						 
						
							2024-03-21 18:35:05 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Xin Qiu 
								
							 
						 
						
							
							
							
							
								
							
							
								dba7ddaab3 
								
							 
						 
						
							
							
								
								add sdp fp8 for qwen llama436 baichuan mistral baichuan2 ( #10485 )  
							
							 
							
							... 
							
							
							
							* add sdp fp8
* fix style
* fix qwen
* fix baichuan 13
* revert baichuan 13b and baichuan2-13b
* fix style
* update 
							
						 
						
							2024-03-21 17:23:05 +08:00