ipex-llm

Author	SHA1	Message	Date
Xiangyu Tian	51b41faad7	vLLM: update vLLM XPU to 0.8.3 version (#13118 ) vLLM: update vLLM XPU to 0.8.3 version	2025-04-30 14:40:53 +08:00
Guancheng Fu	d222eaffd7	Update README.md (#13113 )	2025-04-27 17:13:18 +08:00
Guancheng Fu	0cfdd399e7	Update README.md (#13104 )	2025-04-24 10:21:17 +08:00
Guancheng Fu	14cd613fe1	Update vLLM docs with some new features (#13092 ) * done * fix * done * Update README.md	2025-04-22 14:39:28 +08:00
Yuwen Hu	0801d27a6f	Remove PyTorch 2.3 support for Intel GPU (#13097 ) * Remove PyTorch 2.3 installation option for GPU * Remove xpu_lnl option in installation guides for docs * Update BMG quickstart * Remove PyTorch 2.3 dependencies for GPU examples * Update the graphmode example to use stable version 2.2.0 * Fix based on comments	2025-04-22 10:26:16 +08:00
Ruonan Wang	27d669210f	remove fschat in EAGLE example (#13005 ) * update fschat version * fix	2025-03-25 15:48:48 +08:00
Heyang Sun	cd109bb061	Gemma QLoRA example (#12969 ) * Gemma QLoRA example * Update README.md * Update README.md --------- Co-authored-by: sgwhat <ge.song@intel.com>	2025-03-14 14:27:51 +08:00
Yuwen Hu	7c0c77cce3	Tiny fixes (#12936 )	2025-03-05 14:55:26 +08:00
Yuwen Hu	68a770745b	Add moonlight GPU example (#12929 ) * Add moonlight GPU example and update table * Small fix * Fix based on comments * Small fix	2025-03-05 11:31:14 +08:00
Yuwen Hu	443cb5d4e0	Update Janus-Pro GPU example (#12906 )	2025-02-28 15:39:03 +08:00
Xin Qiu	e946127613	glm 4v 1st sdp for vision (#12904 ) * glm4v 1st sdp * update glm4v example * meet code review * fix style	2025-02-28 13:23:27 +08:00
Guancheng Fu	02ec313eab	Update README.md (#12877 )	2025-02-24 09:59:17 +08:00
Xu, Shuo	1e00bed001	Add GPU example for Janus-Pro (#12869 ) * Add example for Janus-Pro * Update model link * Fixes * Fixes --------- Co-authored-by: ATMxsp01 <shou.xu@intel.com> Co-authored-by: Yuwen Hu <yuwen.hu@intel.com>	2025-02-21 18:36:50 +08:00
Guancheng Fu	4eed0c7d99	initial implementation for low_bit_loader vLLM (#12838 ) * initial * add logic for handling tensor parallel models * fix * Add some comments * add doc * fix done	2025-02-19 19:45:34 +08:00
Xiangyu Tian	b26409d53f	R1 Hybrid: Add Benchmark for DeepSeek R1 transformers example (#12854 ) * init * fix * update * update * fix * fix	2025-02-19 18:33:21 +08:00
Xiangyu Tian	93c10be762	LLM: Support hybrid convert for DeepSeek V3/R1 (#12834 ) LLM: Support hybrid convert for DeepSeek V3/R1	2025-02-19 11:31:19 +08:00
Xiangyu Tian	09150b6058	Initiate CPU-XPU Hybrid Inference for DeepSeek-R1 (#12832 ) Initiate CPU-XPU Hybrid Inference for DeepSeek-R1 with DeepseekV3Attention and DeepseekV3MLP to XPU	2025-02-18 13:34:14 +08:00
Yishuo Wang	8aea5319bb	update more lora example (#12785 )	2025-02-08 09:46:48 +08:00
Yishuo Wang	d0d9c9d636	remove load_in_8bit usage as it is not supported a long time ago (#12779 )	2025-02-07 11:21:29 +08:00
Yishuo Wang	b4c9e23f73	fix galore and peft finetune example (#12776 )	2025-02-06 16:36:13 +08:00
Yishuo Wang	c0d6b282b8	fix lisa finetune example (#12775 )	2025-02-06 16:35:43 +08:00
Yishuo Wang	2e5f2e5dda	fix dpo finetune (#12774 )	2025-02-06 16:35:21 +08:00
Yishuo Wang	9697197f3e	fix qlora finetune example (#12769 )	2025-02-06 11:18:28 +08:00
Yuwen Hu	184adb2653	Small fix to MiniCPM-o-2_6 GPU example (#12766 )	2025-02-05 11:32:26 +08:00
Yuwen Hu	d11f257ee7	Add GPU example for MiniCPM-o-2_6 (#12735 ) * Add init example for omni mode * Small fix * Small fix * Add chat example * Remove lagecy link * Further update link * Add readme * Small fix * Update main readme link * Update based on comments * Small fix * Small fix * Small fix	2025-01-23 16:10:19 +08:00
Yuwen Hu	c52bdff76b	Update Deepseek coder GPU example (#12712 ) * Update Deepseek coder GPU example * Fix based on comment	2025-01-16 14:05:31 +08:00
Xu, Shuo	350fae285d	Add Qwen2-VL HF GPU example with ModelScope Support (#12606 ) * Add qwen2-vl example * complete generate.py & readme * improve lint style * update 1-6 * update main readme * Format and other small fixes --------- Co-authored-by: Yuwen Hu <yuwen.hu@intel.com>	2025-01-13 15:42:04 +08:00
Xu, Shuo	62318964fa	Update llama example information (#12640 ) Co-authored-by: ATMxsp01 <shou.xu@intel.com>	2025-01-02 13:48:39 +08:00
Yishuo Wang	c72a5db757	remove unused code again (#12624 )	2024-12-27 14:17:11 +08:00
Xu, Shuo	55ce091242	Add GLM4-Edge-V GPU example (#12596 ) * Add GLM4-Edge-V examples * polish readme * revert wrong changes * polish readme * polish readme * little polish in reference info and indent * Small fix and sample output updates * Update main readme --------- Co-authored-by: Yuwen Hu <yuwen.hu@intel.com>	2024-12-27 09:40:29 +08:00
Xu, Shuo	ef585d3360	Polish Readme for ModelScope-related examples (#12603 )	2024-12-26 10:52:47 +08:00
Xu, Shuo	b0338c5529	Add --modelscope option for glm-v4 MiniCPM-V-2_6 glm-edge and internvl2 (#12583 ) * Add --modelscope option for glm-v4 and MiniCPM-V-2_6 * glm-edge * minicpm-v-2_6:don't use model_hub=modelscope when use lowbit; internvl2 --------- Co-authored-by: ATMxsp01 <shou.xu@intel.com>	2024-12-20 13:54:17 +08:00
Xu, Shuo	47da3c999f	Add `--modelscope` in GPU examples for minicpm, minicpm3, baichuan2 (#12564 ) * Add --modelscope for more models * minicpm --------- Co-authored-by: ATMxsp01 <shou.xu@intel.com>	2024-12-19 17:25:46 +08:00
Xu, Shuo	47e90a362f	Add `--modelscope` in GPU examples for glm4, codegeex2, qwen2 and qwen2.5 (#12561 ) * Add --modelscope for more models * imporve readme --------- Co-authored-by: ATMxsp01 <shou.xu@intel.com>	2024-12-19 10:00:39 +08:00
Xu, Shuo	ccc18eefb5	Add Modelscope option for chatglm3 on GPU (#12545 ) * Add Modelscope option for GPU model chatglm3 * Update readme * Update readme * Update readme * Update readme * format update --------- Co-authored-by: ATMxsp01 <shou.xu@intel.com>	2024-12-16 20:00:37 +08:00
Chu,Youcheng	a86487c539	Add GLM-Edge GPU example (#12483 ) * feat: initial commit * generate.py and README updates * Update link for main readme * Update based on comments * Small fix --------- Co-authored-by: Yuwen Hu <yuwen.hu@intel.com>	2024-12-16 14:39:19 +08:00
Jun Wang	0b953e61ef	[REFINE] graphmode code (#12540 )	2024-12-16 09:17:01 +08:00
Heyang Sun	fa261b8af1	torch 2.3 inference docker (#12517 ) * torch 2.3 inference docker * Update README.md * add convert code * rename image * remove 2.1 and add graph example * Update README.md	2024-12-13 10:47:04 +08:00
Chu,Youcheng	ce6fcaa9ba	update transformers version in example of glm4 (#12453 ) * fix: update transformers version in example of glm4 * fix: textual adjustments * fix: texual adjustment	2024-11-27 15:02:25 +08:00
Yuwen Hu	effb9bb41c	Small update to LangChain examples readme (#12452 )	2024-11-27 14:02:25 +08:00
Chu,Youcheng	acd77d9e87	Remove env variable `BIGDL_LLM_XMX_DISABLED` in documentation (#12445 ) * fix: remove BIGDL_LLM_XMX_DISABLED in mddocs * fix: remove set SYCL_CACHE_PERSISTENT=1 in example * fix: remove BIGDL_LLM_XMX_DISABLED in workflows * fix: merge igpu and A-series Graphics * fix: remove set BIGDL_LLM_XMX_DISABLED=1 in example * fix: remove BIGDL_LLM_XMX_DISABLED in workflows * fix: merge igpu and A-series Graphics * fix: textual adjustment * fix: textual adjustment * fix: textual adjustment	2024-11-27 11:16:36 +08:00
Jin, Qiao	c2efa264d9	Update LangChain examples to use upstream (#12388 ) * Update LangChain examples to use upstream * Update README and fix links * Update LangChain CPU examples to use upstream * Update LangChain CPU voice_assistant example * Update CPU README * Update GPU README * Remove GPU Langchain vLLM example and fix comments * Change langchain -> LangChain * Add reference for both upstream llms and embeddings * Fix comments * Fix comments * Fix comments * Fix comments * Fix comment	2024-11-26 16:43:15 +08:00
Jinhe	66bd7abae4	add sdxl and lora-lcm optimization (#12444 ) * add sdxl and lora-lcm optimization * fix openjourney speed drop	2024-11-26 11:38:09 +08:00
Jinhe	7e0a840f74	add optimization to openjourney (#12423 ) * add optimization to openjourney * add optimization to openjourney	2024-11-21 15:23:51 +08:00
Jinhe	d2a37b6ab2	add Stable diffusion examples (#12418 ) * add openjourney example * add timing * add stable diffusion to model page * 4.1 fix * small fix	2024-11-20 17:18:36 +08:00
Qiyuan Gong	7e50ff113c	Add padding_token=eos_token for GPU trl QLora example (#12398 ) * Avoid tokenizer doesn't have a padding token error.	2024-11-14 10:51:30 +08:00
Guancheng Fu	0ee54fc55f	Upgrade to vllm 0.6.2 (#12338 ) * Initial updates for vllm 0.6.2 * fix * Change Dockerfile to support v062 * Fix * fix examples * Fix * done * fix * Update engine.py * Fix Dockerfile to original path * fix * add option * fix * fix * fix * fix --------- Co-authored-by: xiangyuT <xiangyu.tian@intel.com>	2024-11-12 20:35:34 +08:00
Qiyuan Gong	2dfcc36825	Fix trl version and padding in trl qlora example (#12368 ) * Change trl to 0.9.6 * Enable padding to avoid padding related errors.	2024-11-08 16:05:17 +08:00
Jin, Qiao	82a61b5cf3	Limit trl version in example (#12332 ) * Limit trl version in example * Limit trl version in example	2024-11-05 14:50:10 +08:00
Zijie Li	cd5e22cee5	Update Llava GPU Example (#12311 ) * update-llava-example * add warmup * small fix on llava example * remove space& extra print prompt * renew example * small fix --------- Co-authored-by: Jinhe Tang <jin.tang1337@gmail.com>	2024-11-01 17:06:00 +08:00

404 commits