Yishuo Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								2d210817ff 
								
							 
						 
						
							
							
								
								add phi3 optimization ( #10871 )  
							
							 
							
							
							
						 
						
							2024-04-24 15:17:40 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Cengguang Zhang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								eb39c61607 
								
							 
						 
						
							
							
								
								LLM: add min new token to perf test. ( #10869 )  
							
							 
							
							
							
						 
						
							2024-04-24 14:32:02 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yuwen Hu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								fb2a160af3 
								
							 
						 
						
							
							
								
								Add phi-2 to 2048-256 test for fixes ( #10867 )  
							
							 
							
							
							
						 
						
							2024-04-24 10:00:25 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									binbin Deng 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								fabf54e052 
								
							 
						 
						
							
							
								
								LLM: make pipeline parallel inference example more common ( #10786 )  
							
							 
							
							
							
						 
						
							2024-04-24 09:28:52 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									hxsz1997 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								328b1a1de9 
								
							 
						 
						
							
							
								
								Fix the not stop issue of llama3 examples ( #10860 )  
							
							 
							
							
							
							
							* fix not stop issue in GPU/HF-Transformers-AutoModels
* fix not stop issue in GPU/PyTorch-Models/Model/llama3
* fix not stop issue in CPU/HF-Transformers-AutoModels/Model/llama3
* fix not stop issue in CPU/PyTorch-Models/Model/llama3
* update the output in readme
* update format
* add reference
* update prompt format
* update output format in readme
* update example output in readme 
							
						 
						
							2024-04-23 19:10:09 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yuwen Hu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								5c9eb5d0f5 
								
							 
						 
						
							
							
								
								Support llama-index install option for upstreaming purposes ( #10866 )  
							
							 
							
							
							
							
							* Support llama-index install option for upstreaming purposes
* Small fix
* Small fix 
							
						 
						
							2024-04-23 19:08:29 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yuwen Hu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								21bb8bd164 
								
							 
						 
						
							
							
								
								Add phi-2 to igpu performance test ( #10865 )  
							
							 
							
							
							
						 
						
							2024-04-23 18:13:14 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									ZehuaCao 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								36eb8b2e96 
								
							 
						 
						
							
							
								
								Add llama3 speculative example ( #10856 )  
							
							 
							
							
							
							
							* Initial llama3 speculative example
* update README
* update README
* update README 
							
						 
						
							2024-04-23 17:03:54 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Zhicun 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								a017bf2981 
								
							 
						 
						
							
							
								
								add quick start for dify ( #10813 )  
							
							 
							
							
							
							
							* add quick start
* modify
* modify
* add
* add
* resize
* add mp4
* add video
* add video
* video
* add 
							
						 
						
							2024-04-23 16:32:22 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Cengguang Zhang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								763413b7e1 
								
							 
						 
						
							
							
								
								LLM: support llama split tensor for long context in transformers>=4.36. ( #10844 )  
							
							 
							
							
							
							
							* LLM: support llama split tensor for long context in transformers>=4.36.
* fix dtype.
* fix style.
* fix style.
* fix style.
* fix style.
* fix dtype.
* fix style. 
							
						 
						
							2024-04-23 16:13:25 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Qiyuan Gong 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								bce99a5b00 
								
							 
						 
						
							
							
								
								Minor fix for quick start ( #10857 )  
							
							 
							
							
							
							
							* Fix typo and space in quick start. 
							
						 
						
							2024-04-23 15:22:01 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Qiyuan Gong 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								5eee1976ac 
								
							 
						 
						
							
							
								
								Add Axolotl v0.4.0 quickstart ( #10840 )  
							
							 
							
							
							
							
							* Add Axolotl v0.4.0 quickstart 
							
						 
						
							2024-04-23 14:57:34 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									ZehuaCao 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								92ea54b512 
								
							 
						 
						
							
							
								
								Fix speculative decoding bug ( #10855 )  
							
							 
							
							
							
						 
						
							2024-04-23 14:28:31 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									yb-peng 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								c9dee6cd0e 
								
							 
						 
						
							
							
								
								Update 8192.txt ( #10824 )  
							
							 
							
							
							
							
							* Update 8192.txt
* Update 8192.txt with original text 
							
						 
						
							2024-04-23 14:02:09 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Wang, Jian4 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								18c032652d 
								
							 
						 
						
							
							
								
								LLM: Add mixtral speculative CPU example ( #10830 )  
							
							 
							
							
							
							
							* init mixtral sp example
* use different prompt_format
* update output
* update 
							
						 
						
							2024-04-23 10:05:51 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Qiyuan Gong 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								5494aa55f6 
								
							 
						 
						
							
							
								
								Downgrade datasets in axolotl example ( #10849 )  
							
							 
							
							
							
							
							* Downgrade datasets to 2.15.0 to address axolotl prepare issue https://github.com/OpenAccess-AI-Collective/axolotl/issues/1544 
Thanks to @kwaa for providing the solution in https://github.com/intel-analytics/ipex-llm/issues/10821#issuecomment-2068861571  
							
						 
						
							2024-04-23 09:41:58 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								2ec45c49d3 
								
							 
						 
						
							
							
								
								fix ollama quickstart ( #10846 )  
							
							 
							
							
							
						 
						
							2024-04-22 22:04:49 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								fe5a082b84 
								
							 
						 
						
							
							
								
								add phi-2 optimization ( #10843 )  
							
							 
							
							
							
						 
						
							2024-04-22 18:56:47 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Guancheng Fu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								47bd5f504c 
								
							 
						 
						
							
							
								
								[vLLM]Remove vllm-v1, refactor v2 ( #10842 )  
							
							 
							
							
							
							
							* remove vllm-v1
* fix format 
							
						 
						
							2024-04-22 17:51:32 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Wang, Jian4 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								23c6a52fb0 
								
							 
						 
						
							
							
								
								LLM: Fix ipex torchscript=True error ( #10832 )  
							
							 
							
							
							
							
							* remove
* update
* remove torchscript 
							
						 
						
							2024-04-22 15:53:09 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Heyang Sun 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								fc33aa3721 
								
							 
						 
						
							
							
								
								fix missing import ( #10839 )  
							
							 
							
							
							
						 
						
							2024-04-22 14:34:52 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Guancheng Fu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								3b82834aaf 
								
							 
						 
						
							
							
								
								Update README.md ( #10838 )  
							
							 
							
							
							
						 
						
							2024-04-22 14:18:51 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yina Chen 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								3daad242b8 
								
							 
						 
						
							
							
								
								Fix No module named 'transformers.cache_utils' with transformers < 4.36 ( #10835 )  
							
							 
							
							
							
							
							* update sdp condition
* update
* fix
* fix 431 error
* revert sdp & style fix
* fix
* meet comments 
							
						 
						
							2024-04-22 14:05:50 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								c6e868f7ad 
								
							 
						 
						
							
							
								
								update oneapi usage in cpp quickstart ( #10836 )  
							
							 
							
							
							
							
							* update oneapi usage
* update
* small fix 
							
						 
						
							2024-04-22 11:48:05 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Guancheng Fu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								ae3b577537 
								
							 
						 
						
							
							
								
								Update README.md ( #10833 )  
							
							 
							
							
							
						 
						
							2024-04-22 11:07:10 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Wang, Jian4 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								5f95054f97 
								
							 
						 
						
							
							
								
								LLM: Add qwen moe example libs md ( #10828 )  
							
							 
							
							
							
						 
						
							2024-04-22 10:03:19 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								1edb19c1dd 
								
							 
						 
						
							
							
								
								small fix of cpp quickstart ( #10829 )  
							
							 
							
							
							
						 
						
							2024-04-22 09:44:08 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Guancheng Fu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								61c67af386 
								
							 
						 
						
							
							
								
								Fix vLLM-v2 install instructions ( #10822 )  
							
							 
							
							
							
						 
						
							2024-04-22 09:02:48 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jason Dai 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								3cd21d5105 
								
							 
						 
						
							
							
								
								Update readme ( #10817 )  
							
							 
							
							
							
						 
						
							2024-04-19 22:16:17 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									SONG Ge 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								197f8dece9 
								
							 
						 
						
							
							
								
								Add open-webui windows document ( #10775 )  
							
							 
							
							
							
							
							* add windows document
* update
* fix document
* build fix
* update some description
* reorg document structure
* update doc
* re-update to better view
* add reminder for running model on gpus
* update
* remove useless part 
							
						 
						
							2024-04-19 18:06:40 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								a8df429985 
								
							 
						 
						
							
							
								
								QuickStart: Run Llama 3 on Intel GPU using llama.cpp and ollama with IPEX-LLM ( #10809 )  
							
							 
							
							
							
							
							* initial commit
* update llama.cpp
* add demo video at first
* fix ollama link in readme
* meet review
* update
* small fix 
							
						 
						
							2024-04-19 17:44:59 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Guancheng Fu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								caf75beef8 
								
							 
						 
						
							
							
								
								Disable sdpa ( #10814 )  
							
							 
							
							
							
						 
						
							2024-04-19 17:33:18 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								57edf2033c 
								
							 
						 
						
							
							
								
								fix lookahead with transformers >= 4.36 ( #10808 )  
							
							 
							
							
							
						 
						
							2024-04-19 16:24:56 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yuwen Hu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								34ff07b689 
								
							 
						 
						
							
							
								
								Add CPU related info to langchain-chatchat quickstart ( #10812 )  
							
							 
							
							
							
						 
						
							2024-04-19 15:59:51 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ovo233 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								1a885020ee 
								
							 
						 
						
							
							
								
								Updated importing of top_k_top_p_filtering for transformers>=4.39.0 ( #10794 )  
							
							 
							
							
							
							
							* In transformers>=4.39.0, the top_k_top_p_filtering function has been deprecated and moved to the Hugging Face package trl. For versions >= 4.39.0, this function is therefore imported from trl. 
							
						 
						
							2024-04-19 15:34:39 +08:00  
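The import switch described in 1a885020ee ( #10794 ) can be sketched roughly as below. This is a minimal illustration, not the code from the commit: the helper names (`version_tuple`, `filtering_module`, `load_top_k_top_p_filtering`) are hypothetical, and the module paths `trl.core` and `transformers` are assumptions about where the function is exposed on each side of the 4.39.0 boundary.

```python
import importlib
import re

# Version boundary taken from the commit message above.
MOVED_IN = (4, 39, 0)


def version_tuple(version):
    """Turn '4.39.0' or '4.40.0.dev0' into a (major, minor, patch) tuple.

    Only the leading digits of the first three components are used, so
    pre-release suffixes are ignored (a simplification for this sketch).
    """
    parts = []
    for piece in version.split(".")[:3]:
        match = re.match(r"\d+", piece)
        parts.append(int(match.group()) if match else 0)
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def filtering_module(transformers_version):
    """Name the module expected to provide top_k_top_p_filtering.

    Assumption: 'trl.core' for transformers >= 4.39.0, 'transformers' otherwise.
    """
    if version_tuple(transformers_version) >= MOVED_IN:
        return "trl.core"
    return "transformers"


def load_top_k_top_p_filtering(transformers_version):
    """Import the function from whichever package is expected to provide it."""
    module = importlib.import_module(filtering_module(transformers_version))
    return module.top_k_top_p_filtering
```

In practice the version string would come from `transformers.__version__`; the selection logic is kept separate from the import so it can be checked without either package installed.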
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yuwen Hu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								07e8b045a9 
								
							 
						 
						
							
							
								
								Add Meta-llama-3-8B-Instruct and Yi-6B-Chat to igpu nightly perf ( #10810 )  
							
							 
							
							
							
						 
						
							2024-04-19 15:09:58 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									SONG Ge 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								fbd1743b5e 
								
							 
						 
						
							
							
								
								Ollama quickstart update ( #10806 )  
							
							 
							
							
							
							
							* add ollama doc for OLLAMA_NUM_GPU
* remove useless params
* revert unexpected changes back
* move env setting to server part
* update 
							
						 
						
							2024-04-19 15:00:25 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								08458b4f74 
								
							 
						 
						
							
							
								
								remove rms norm copy ( #10793 )  
							
							 
							
							
							
						 
						
							2024-04-19 13:57:48 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yuwen Hu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								c7235e34a8 
								
							 
						 
						
							
							
								
								Small update to ut ( #10804 )  
							
							 
							
							
							
						 
						
							2024-04-19 10:59:00 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jason Dai 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								995c01367d 
								
							 
						 
						
							
							
								
								Update readme ( #10802 )  
							
							 
							
							
							
						 
						
							2024-04-19 06:52:57 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yang Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								8153c3008e 
								
							 
						 
						
							
							
								
								Initial llama3 example ( #10799 )  
							
							 
							
							
							
							
							* Add initial hf huggingface GPU example
* Small fix
* Add llama3 gpu pytorch model example
* Add llama 3 hf transformers CPU example
* Add llama 3 pytorch model CPU example
* Fixes
* Small fix
* Small fixes
* Small fix
* Small fix
* Add links
* update repo id
* change prompt tuning url
* remove system header if there is no system prompt
---------
Co-authored-by: Yuwen Hu <yuwen.hu@intel.com>
Co-authored-by: Yuwen Hu <54161268+Oscilloscope98@users.noreply.github.com> 
							
						 
						
							2024-04-18 11:01:33 -07:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
							
							
								
							
							
								754b0ffecf 
								
							 
						 
						
							
							
								
								Fix pvc llama ( #10798 )  
							
							 
							
							
							
							
							* fix
* update 
							
						 
						
							2024-04-18 10:44:57 -07:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								439c834ed3 
								
							 
						 
						
							
							
								
								LLM: add mixed precision for lm_head ( #10795 )  
							
							 
							
							
							
							
							* add mixed_quantization
* meet code review
* update
* fix style
* meet review 
							
						 
						
							2024-04-18 19:11:31 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yina Chen 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								8796401b08 
								
							 
						 
						
							
							
								
								Support q4k in ipex-llm ( #10796 )  
							
							 
							
							
							
							
							* support q4k
* update 
							
						 
						
							2024-04-18 18:55:28 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Zhicun 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								88463cbf47 
								
							 
						 
						
							
							
								
								fix transformer version ( #10788 )  
							
							 
							
							
							
							
							* fix transformer version
* uninstall sentence transformer
* uninstall
* uninstall 
							
						 
						
							2024-04-18 17:37:21 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								0e8aac19e3 
								
							 
						 
						
							
							
								
								add q6k precision in ipex-llm ( #10792 )  
							
							 
							
							
							
							
							* add q6k
* add initial 16k
* update
* fix style 
							
						 
						
							2024-04-18 16:52:09 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Qiyuan Gong 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								e90e31719f 
								
							 
						 
						
							
							
								
								axolotl lora example ( #10789 )  
							
							 
							
							
							
							
							* Add axolotl lora example
* Modify readme
* Add comments in yml 
							
						 
						
							2024-04-18 16:38:32 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Wang, Jian4 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								14ca42a048 
								
							 
						 
						
							
							
								
								LLM: Fix moe indices error on cpu ( #10791 )  
							
							 
							
							
							
						 
						
							2024-04-18 15:56:52 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Guancheng Fu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								cbe7b5753f 
								
							 
						 
						
							
							
								
								Add vLLM[xpu] related code ( #10779 )  
							
							 
							
							
							
							
							* Add ipex-llm side change
* add runnable offline_inference
* refactor to call vllm2
* Verified async server
* add new v2 example
* add README
* fix
* change dir
* refactor readme.md
* add experimental
* fix 
							
						 
						
							2024-04-18 15:29:20 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Kai Huang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								053ec30737 
								
							 
						 
						
							
							
								
								Transformers ppl evaluation on wikitext ( #10784 )  
							
							 
							
							
							
							
							* transformers code
* cache 
							
						 
						
							2024-04-18 15:27:18 +08:00