Yang Wang
								
							 
						 | 
						
							
							
								
								
							
							
							
								
							
							
								952e517db9
								
							
						 | 
						
							
							
								
								use config rope_theta (#10787)
							
							
							
							
							
							
							
							* use config rope_theta
* fix style 
							
						 | 
						
							2024-04-17 20:39:11 -07:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									Guancheng Fu
								
							 
						 | 
						
							
							
								
								
							
							
							
								
							
							
								31ea2f9a9f
								
							
						 | 
						
							
							
								
								Fix wrong output for Llama models on CPU (#10742)
							
							
							
							
							
						 | 
						
							2024-04-18 11:07:27 +08:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									Xin Qiu
								
							 
						 | 
						
							
							
								
								
							
							
							
								
							
							
								e764f9b1b1
								
							
						 | 
						
							
							
								
								Disable fast fused rope on UHD  (#10780)
							
							
							
							
							
							
							
							* use decoding fast path
* update
* update
* cleanup 
							
						 | 
						
							2024-04-18 10:03:53 +08:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									Yina Chen
								
							 
						 | 
						
							
							
								
								
							
							
							
								
							
							
								ea5b373a97
								
							
						 | 
						
							
							
								
								Add lookahead GPU example (#10785)
							
							
							
							
							
							
							
							* Add lookahead example
* fix style & attn mask
* fix typo
* address comments 
							
						 | 
						
							2024-04-17 17:41:55 +08:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									Wang, Jian4
								
							 
						 | 
						
							
							
								
								
							
							
							
								
							
							
								a20271ffe4
								
							
						 | 
						
							
							
								
								LLM: Fix yi-6b fp16 error on pvc (#10781)
							
							
							
							
							
							
							
							* update for yi fp16
* update
* update 
							
						 | 
						
							2024-04-17 16:49:59 +08:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									ZehuaCao
								
							 
						 | 
						
							
							
								
								
							
							
							
								
							
							
								0646e2c062
								
							
						 | 
						
							
							
								
								Fix short prompt for IPEX_CPU speculative decoding cause no_attr error (#10783)
							
							
							
							
							
						 | 
						
							2024-04-17 16:19:57 +08:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									Cengguang Zhang
								
							 
						 | 
						
							
							
								
								
							
							
							
								
							
							
								7ec82c6042
								
							
						 | 
						
							
							
								
								LLM: add README.md for Long-Context examples. (#10765)
							
							
							
							
							
							
							
							* LLM: add readme to long-context examples.
* add precision.
* update wording.
* add GPU type.
* add Long-Context example to GPU examples.
* fix comments.
* update max input length.
* update max length.
* add output length.
* fix wording. 
							
						 | 
						
							2024-04-17 15:34:59 +08:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									Yina Chen
								
							 
						 | 
						
							
							
								
								
							
							
							
								
							
							
								766fe45222
								
							
						 | 
						
							
							
								
								Fix spec error caused by lookup pr (#10777)
							
							
							
							
							
							
							
							* Fix spec error
* remove
* fix style 
							
						 | 
						
							2024-04-17 11:27:35 +08:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									Qiyuan Gong
								
							 
						 | 
						
							
							
								
								
							
							
							
								
							
							
								9e5069437f
								
							
						 | 
						
							
							
								
								Fix gradio version in axolotl example (#10776)
							
							
							
							
							
							
							
							* Change to gradio>=4.19.2 
							
						 | 
						
							2024-04-17 10:23:43 +08:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									Qiyuan Gong
								
							 
						 | 
						
							
							
								
								
							
							
							
								
							
							
								f2e923b3ca
								
							
						 | 
						
							
							
								
								Axolotl v0.4.0 support  (#10773)
							
							
							
							
							
							
							
							* Add Axolotl 0.4.0, remove legacy 0.3.0 support.
* replace is_torch_bf16_gpu_available
* Add HF_HUB_OFFLINE=1
* Move transformers out of requirement
* Refine readme and qlora.yml 
							
						 | 
						
							2024-04-17 09:49:11 +08:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									Heyang Sun
								
							 
						 | 
						
							
							
								
								
							
							
							
								
							
							
								26cae0a39c
								
							
						 | 
						
							
							
								
								Update FLEX in Deepspeed README (#10774)
							
							
							
							
							
							
							
							* Update FLEX in Deepspeed README
* Update README.md 
							
						 | 
						
							2024-04-17 09:28:24 +08:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									Wenjing Margaret Mao
								
							 
						 | 
						
							
							
								
								
							
							
							
								
							
							
								c41730e024
								
							
						 | 
						
							
							
								
								edit 'ppl_result does not exist' issue, delete useless code (#10767)
							
							
							
							
							
							
							
							* edit ppl_result not exist issue, delete useless code
* delete nonzero_min function
---------
Co-authored-by: jenniew <jenniewang123@gmail.com> 
							
						 | 
						
							2024-04-16 18:11:56 +08:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									Yina Chen
								
							 
						 | 
						
							
							
								
								
							
							
							
								
							
							
								899d392e2f
								
							
						 | 
						
							
							
								
								Support prompt lookup in ipex-llm (#10768)
							
							
							
							
							
							
							
							* lookup init
* add lookup
* fix style
* remove redundant code
* change param name
* fix style 
							
						 | 
						
							2024-04-16 16:52:38 +08:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									Qiyuan Gong
								
							 
						 | 
						
							
							
								
								
							
							
							
								
							
							
								d30b22a81b
								
							
						 | 
						
							
							
								
								Refine axolotl 0.3.0 documents and links (#10764)
							
							
							
							
							
							
							
							* Refine axolotl 0.3 based on comments
* Rename requirements to requirement-xpu
* Add comments for paged_adamw_32bit
* change lora_r from 8 to 16 
							
						 | 
						
							2024-04-16 14:47:45 +08:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									ZehuaCao
								
							 
						 | 
						
							
							
								
								
							
							
							
								
							
							
								599a88db53
								
							
						 | 
						
							
							
								
								Add deepspeed-autoTP-Fastapi serving (#10748)
							
							
							
							
							
							
							
							* add deepspeed-autoTP-Fastapi serving
* add readme
* add license
* update
* update
* fix 
							
						 | 
						
							2024-04-16 14:03:23 +08:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									binbin Deng
								
							 
						 | 
						
							
							
								
								
							
							
							
								
							
							
								0a62933d36
								
							
						 | 
						
							
							
								
								LLM: fix qwen AutoTP (#10766)
							
							
							
							
							
						 | 
						
							2024-04-16 09:56:17 +08:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									Cengguang Zhang
								
							 
						 | 
						
							
							
								
								
							
							
							
								
							
							
								3e2662c87e
								
							
						 | 
						
							
							
								
								LLM: fix get env KV_CACHE_ALLOC_BLOCK_LENGTH type. (#10771)
							
							
							
							
							
						 | 
						
							2024-04-16 09:32:30 +08:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									Jin Qiao
								
							 
						 | 
						
							
							
								
								
							
							
							
								
							
							
								73a67804a4
								
							
						 | 
						
							
							
								
								GPU configuration update for examples (windows pip installer, etc.) (#10762)
							
							
							
							
							
							
							
							* renew chatglm3-6b gpu example readme
fix
fix
fix
* fix for comments
* fix
* fix
* fix
* fix
* fix
* apply on HF-Transformers-AutoModels
* apply on PyTorch-Models
* fix
* fix 
							
						 | 
						
							2024-04-15 17:42:52 +08:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									yb-peng
								
							 
						 | 
						
							
							
								
								
							
							
							
								
							
							
								b5209d3ec1
								
							
						 | 
						
							
							
								
								Update example/GPU/PyTorch-Models/Model/llava/README.md (#10757)
							
							
							
							
							
							
							
							* Update example/GPU/PyTorch-Models/Model/llava/README.md
* Update README.md
fix path in windows installation 
							
						 | 
						
							2024-04-15 13:01:37 +08:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									binbin Deng
								
							 
						 | 
						
							
							
								
								
							
							
							
								
							
							
								3d561b60ac
								
							
						 | 
						
							
							
								
								LLM: add enable_xetla parameter for optimize_model API (#10753)
							
							
							
							
							
						 | 
						
							2024-04-15 12:18:25 +08:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									Jiao Wang
								
							 
						 | 
						
							
							
								
								
							
							
							
								
							
							
								a9a6b6b7af
								
							
						 | 
						
							
							
								
								Fix baichuan-13b issue on portable zip under transformers 4.36 (#10746)
							
							
							
							
							
							
							
							* fix baichuan-13b issue
* update
* update 
							
						 | 
						
							2024-04-12 16:27:01 -07:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									Jiao Wang
								
							 
						 | 
						
							
							
								
								
							
							
							
								
							
							
								9e668a5bf0
								
							
						 | 
						
							
							
								
								fix_internlm-chat-7b-8k repo name in examples (#10747)
							
							
							
							
							
						 | 
						
							2024-04-12 10:15:48 -07:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									binbin Deng
								
							 
						 | 
						
							
							
								
								
							
							
							
								
							
							
								c3fc8f4b90
								
							
						 | 
						
							
							
								
								LLM: add bs limitation for llama softmax upcast to fp32 (#10752)
							
							
							
							
							
						 | 
						
							2024-04-12 15:40:25 +08:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									hxsz1997
								
							 
						 | 
						
							
							
								
								
							
							
							
								
							
							
								0d518aab8d
								
							
						 | 
						
							
							
								
								Merge pull request #10697 from MargarettMao/ceval
							
							
							
							
							
							
							
							combine english and chinese, remove nan 
							
						 | 
						
							2024-04-12 14:37:47 +08:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									jenniew
								
							 
						 | 
						
							
							
							
							
								
							
							
								dd0d2df5af
								
							
						 | 
						
							
							
								
								Change fp16.csv mistral-7b-v0.1 into Mistral-7B-v0.1
							
							
							
							
							
						 | 
						
							2024-04-12 14:28:46 +08:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									jenniew
								
							 
						 | 
						
							
							
							
							
								
							
							
								7309f1ddf9
								
							
						 | 
						
							
							
								
								Modify Typos
							
							
							
							
							
						 | 
						
							2024-04-12 14:23:13 +08:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									jenniew
								
							 
						 | 
						
							
							
							
							
								
							
							
								cb594e1fc5
								
							
						 | 
						
							
							
								
								Modify Typos
							
							
							
							
							
						 | 
						
							2024-04-12 14:22:09 +08:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									jenniew
								
							 
						 | 
						
							
							
							
							
								
							
							
								382c18e600
								
							
						 | 
						
							
							
								
								Modify Typos
							
							
							
							
							
						 | 
						
							2024-04-12 14:15:48 +08:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									jenniew
								
							 
						 | 
						
							
							
							
							
								
							
							
								1a360823ce
								
							
						 | 
						
							
							
								
								Modify Typos
							
							
							
							
							
						 | 
						
							2024-04-12 14:13:21 +08:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									jenniew
								
							 
						 | 
						
							
							
							
							
								
							
							
								cdbb1de972
								
							
						 | 
						
							
							
								
								Mark Color Modification
							
							
							
							
							
						 | 
						
							2024-04-12 14:00:50 +08:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									jenniew
								
							 
						 | 
						
							
							
							
							
								
							
							
								9bbfcaf736
								
							
						 | 
						
							
							
								
								Mark Color Modification
							
							
							
							
							
						 | 
						
							2024-04-12 13:30:16 +08:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									jenniew
								
							 
						 | 
						
							
							
							
							
								
							
							
								bb34c6e325
								
							
						 | 
						
							
							
								
								Mark Color Modification
							
							
							
							
							
						 | 
						
							2024-04-12 13:26:36 +08:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									Yishuo Wang
								
							 
						 | 
						
							
							
								
								
							
							
							
								
							
							
								8086554d33
								
							
						 | 
						
							
							
								
								use new fp16 sdp in llama and mistral (#10734)
							
							
							
							
							
						 | 
						
							2024-04-12 10:49:02 +08:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									Yang Wang
								
							 
						 | 
						
							
							
								
								
							
							
							
								
							
							
								019293e1b9
								
							
						 | 
						
							
							
								
								Fuse MOE indexes computation (#10716)
							
							
							
							
							
							
							
							* try moe
* use c++ cpu to compute indexes
* fix style 
							
						 | 
						
							2024-04-11 10:12:55 -07:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									jenniew
								
							 
						 | 
						
							
							
							
							
								
							
							
								b151a9b672
								
							
						 | 
						
							
							
								
								edit csv_to_html to combine en & zh
							
							
							
							
							
						 | 
						
							2024-04-11 17:35:36 +08:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									binbin Deng
								
							 
						 | 
						
							
							
								
								
							
							
							
								
							
							
								70ed9397f9
								
							
						 | 
						
							
							
								
								LLM: fix AttributeError of FP16Linear (#10740)
							
							
							
							
							
						 | 
						
							2024-04-11 17:03:56 +08:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									Keyan (Kyrie) Zhang
								
							 
						 | 
						
							
							
								
								
							
							
							
								
							
							
								1256a2cc4e
								
							
						 | 
						
							
							
								
								Add chatglm3 long input example (#10739)
							
							
							
							
							
							
							
							* Add long context input example for chatglm3
* Small fix
* Small fix
* Small fix 
							
						 | 
						
							2024-04-11 16:33:43 +08:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									hxsz1997
								
							 
						 | 
						
							
							
								
								
							
							
							
								
							
							
								fd473ddb1b
								
							
						 | 
						
							
							
								
								Merge pull request #10730 from MargarettMao/MargarettMao-parent_folder
							
							
							
							
							
							
							
							Edit ppl update_HTML_parent_folder 
							
						 | 
						
							2024-04-11 15:45:24 +08:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									Qiyuan Gong
								
							 
						 | 
						
							
							
								
								
							
							
							
								
							
							
								2d64630757
								
							
						 | 
						
							
							
								
								Remove transformers version in axolotl example (#10736)
							
							
							
							
							
							
							
							* Remove transformers version in axolotl requirements.txt 
							
						 | 
						
							2024-04-11 14:02:31 +08:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									yb-peng
								
							 
						 | 
						
							
							
								
								
							
							
							
								
							
							
								2685c41318
								
							
						 | 
						
							
							
								
								Modify all-in-one benchmark (#10726)
							
							
							
							
							
							
							
							* Update 8192 prompt in all-in-one
* Add cpu_embedding param for linux api
* Update run.py
* Update README.md 
							
						 | 
						
							2024-04-11 13:38:50 +08:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									Xiangyu Tian
								
							 
						 | 
						
							
							
								
								
							
							
							
								
							
							
								301504aa8d
								
							
						 | 
						
							
							
								
								Fix transformers version warning (#10732)
							
							
							
							
							
						 | 
						
							2024-04-11 13:12:49 +08:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									Wenjing Margaret Mao
								
							 
						 | 
						
							
							
								
								
							
							
							
								
							
							
								9bec233e4d
								
							
						 | 
						
							
							
								
								Delete python/llm/test/benchmark/perplexity/update_html_in_parent_folder.py
							
							
							
							
							
							
							
							Delete due to repetition 
							
						 | 
						
							2024-04-11 07:21:12 +08:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									Cengguang Zhang
								
							 
						 | 
						
							
							
								
								
							
							
							
								
							
							
								4b024b7aac
								
							
						 | 
						
							
							
								
								LLM: optimize chatglm2 8k input. (#10723)
							
							
							
							
							
							
							
							* LLM: optimize chatglm2 8k input.
* rename. 
							
						 | 
						
							2024-04-10 16:59:06 +08:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									Yuxuan Xia
								
							 
						 | 
						
							
							
								
								
							
							
							
								
							
							
								cd22cb8257
								
							
						 | 
						
							
							
								
								Update Env check Script (#10709)
							
							
							
							
							
							
							
							* Update env check bash file
* Update env-check 
							
						 | 
						
							2024-04-10 15:06:00 +08:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									Shaojun Liu
								
							 
						 | 
						
							
							
								
								
							
							
							
								
							
							
								29bf28bd6f
								
							
						 | 
						
							
							
								
								Upgrade python to 3.11 in Docker Image (#10718)
							
							
							
							
							
							
							
							* install python 3.11 for cpu-inference docker image
* update xpu-inference dockerfile
* update cpu-serving image
* update qlora image
* update lora image
* update document 
							
						 | 
						
							2024-04-10 14:41:27 +08:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									Qiyuan Gong
								
							 
						 | 
						
							
							
								
								
							
							
							
								
							
							
								b727767f00
								
							
						 | 
						
							
							
								
								Add axolotl v0.3.0 with ipex-llm on Intel GPU (#10717)
							
							
							
							
							
							
							
							* Add axolotl v0.3.0 support on Intel GPU.
* Add finetune example on llama-2-7B with Alpaca dataset. 
							
						 | 
						
							2024-04-10 14:38:29 +08:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									Wang, Jian4
								
							 
						 | 
						
							
							
								
								
							
							
							
								
							
							
								c9e6d42ad1
								
							
						 | 
						
							
							
								
								LLM: Fix chatglm3-6b-32k error (#10719)
							
							
							
							
							
							
							
							* fix chatglm3-6b-32k
* update style 
							
						 | 
						
							2024-04-10 11:24:06 +08:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									Keyan (Kyrie) Zhang
								
							 
						 | 
						
							
							
								
								
							
							
							
								
							
							
								585c174e92
								
							
						 | 
						
							
							
								
								Read the value of KV_CACHE_ALLOC_BLOCK_LENGTH from the environment variables (#10707)
							
							
							
							
							
							
							
							* Read the value of KV_CACHE_ALLOC_BLOCK_LENGTH from the environment variables.
* Fix style 
							
						 | 
						
							2024-04-10 10:48:46 +08:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									Jiao Wang
								
							 
						 | 
						
							
							
								
								
							
							
							
								
							
							
								d1eaea509f
								
							
						 | 
						
							
							
								
								update chatglm readme (#10659)
							
							
							
							
							
						 | 
						
							2024-04-09 14:24:46 -07:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									Jiao Wang
								
							 
						 | 
						
							
							
								
								
							
							
							
								
							
							
								878a97077b
								
							
						 | 
						
							
							
								
								Fix llava example to support transformers 4.36 (#10614)
							
							
							
							
							
							
							
							* fix llava example
* update 
							
						 | 
						
							2024-04-09 13:47:07 -07:00 | 
						
						
							
							
							
								
							
							
						 |