ipex-llm

Author	SHA1	Message	Date
Yishuo Wang	5faba06409	simple optimization for moonlight moe decoding forward (#12891 )	2025-02-25 16:18:27 +08:00
Xiangyu Tian	ae9f5320da	vLLM CPU: Fix Triton Version to Resolve Related Error(#12893 )	2025-02-25 15:00:41 +08:00
Yishuo Wang	ab3fc66eb7	optimize attention part of moonlight-14B-A3B (#12886 )	2025-02-25 09:38:13 +08:00
Shaojun Liu	dd30d12cb6	Fix serving-cpu image: setuptools-scm requires setuptools>=61 (#12876 ) * setuptools-scm requires setuptools>=61 * Update Dockerfile * Update Dockerfile * Update Dockerfile	2025-02-25 09:10:14 +08:00
Yuwen Hu	06694ba61a	Further fix portable zip file link (#12885 )	2025-02-24 18:06:57 +08:00
Yuwen Hu	671ddfd847	Update wrong file name for portable zip quickstart (#12883 )	2025-02-24 17:52:09 +08:00
Yuwen Hu	a9c8e73a77	Update llama.cpp Prerequisites guide regarding oneAPI 2025.0 (#12881 ) * Update llama.cpp Prerequisites guide regarding oneAPI 2025.0 * Update based on comments * Small fix * Small fix	2025-02-24 16:32:23 +08:00
Wang, Jian4	4f2f92afa3	Update inference-cpp docker (#12882 ) * remove nouse run.py * add WORKDIR /llm	2025-02-24 14:32:44 +08:00
Yishuo Wang	3f6ecce508	support using xgrammar to get json output (#12870 )	2025-02-24 14:10:58 +08:00
Shaojun Liu	afad979168	Add Apache 2.0 License Information in Dockerfile to Comply with OSPDT Requirements (#12878 ) * ospdt: add Header for Dockerfile * OSPDT: add Header for Dockerfile * OSPDT: add Header for Dockerfile * OSPDT: add Header for Dockerfile	2025-02-24 14:00:46 +08:00
Guancheng Fu	02ec313eab	Update README.md (#12877 )	2025-02-24 09:59:17 +08:00
Shaojun Liu	10400abfb7	Fix CodeQL workflow (#12875 ) * Update codeql.yml * Update codeql.yml	2025-02-24 09:16:54 +08:00
Xu, Shuo	1e00bed001	Add GPU example for Janus-Pro (#12869 ) * Add example for Janus-Pro * Update model link * Fixes * Fixes --------- Co-authored-by: ATMxsp01 <shou.xu@intel.com> Co-authored-by: Yuwen Hu <yuwen.hu@intel.com>	2025-02-21 18:36:50 +08:00
Yuwen Hu	21d6a78be0	Update Ollama portable zip QuickStart to fit new version (#12871 ) * Update ollama portable zip quickstart * Update demo images	2025-02-21 17:54:14 +08:00
Wang, Jian4	3ea5389a99	Fix vllm api_server v1/models error (#12867 )	2025-02-21 11:08:29 +08:00
binbin Deng	8077850452	[NPU GGUF] Add simple example (#12853 )	2025-02-21 09:58:00 +08:00
Wang, Jian4	348dc8056d	Fix vllm gptq awq error (#12863 ) * fix gptq awq error * fix python style	2025-02-20 16:27:23 +08:00
Yuwen Hu	a488981f3f	Ollama portable zip QuickStart tiny fix (#12862 ) * Tiny fix to ollama portable zip quickstart * Tiny fix	2025-02-20 14:11:12 +08:00
Yuwen Hu	0f2706be42	Update CN Ollama portable zip QuickStart for troubleshooting & tips (#12860 ) * Small fix for english version * Update CN ollama portable zip quickstart for troubleshooting & tips * Small fix	2025-02-20 11:32:06 +08:00
Jason Dai	38a682adb1	Update Readme (#12855 )	2025-02-19 19:55:29 +08:00
Guancheng Fu	4eed0c7d99	initial implementation for low_bit_loader vLLM (#12838 ) * initial * add logic for handling tensor parallel models * fix * Add some comments * add doc * fix done	2025-02-19 19:45:34 +08:00
Xin Qiu	c81b7fc003	Add Portable zip Linux QuickStart (#12849 ) * linux doc * update * Update ollama_portablze_zip_quickstart.md * Update ollama_portablze_zip_quickstart.md * Update ollama_portablze_zip_quickstart.zh-CN.md * Update ollama_portablze_zip_quickstart.md * meet code review * update * Add tips & troubleshooting sections for both Linux & Windows * Rebase * Fix based on comments * Small fix * Fix img * Update table for linux * Small fix --------- Co-authored-by: Yuwen Hu <yuwen.hu@intel.com>	2025-02-19 19:13:55 +08:00
Xiangyu Tian	b26409d53f	R1 Hybrid: Add Benchmark for DeepSeek R1 transformers example (#12854 ) * init * fix * update * update * fix * fix	2025-02-19 18:33:21 +08:00
SONG Ge	5d041f9ebf	Add latest models list in ollama quickstart (#12850 ) * Add latest models llist on ollama quickstart * update oneapi version describe * move models list to ollama_portable_zip doc * update CN readme	2025-02-19 18:29:43 +08:00
Yishuo Wang	aee2db30f9	update sdp support (#12847 )	2025-02-19 12:07:00 +08:00
Xiangyu Tian	93c10be762	LLM: Support hybrid convert for DeepSeek V3/R1 (#12834 ) LLM: Support hybrid convert for DeepSeek V3/R1	2025-02-19 11:31:19 +08:00
Yuwen Hu	637543e135	Update Ollama portable zip QuickStart with troubleshooting (#12846 ) * Update ollama portable zip quickstart with runtime configurations * Small fix * Update based on comments * Small fix * Small fix	2025-02-19 11:04:03 +08:00
binbin Deng	bde8acc303	[NPU] Update doc of gguf support (#12837 )	2025-02-19 10:46:35 +08:00
Wang, Jian4	e1809a6295	Update multimodal on vllm 0.6.6 (#12816 ) * add glm4v and minicpmv example * fix	2025-02-19 10:04:42 +08:00
Xiangyu Tian	09150b6058	Initiate CPU-XPU Hybrid Inference for DeepSeek-R1 (#12832 ) Initiate CPU-XPU Hybrid Inference for DeepSeek-R1 with DeepseekV3Attention and DeepseekV3MLP to XPU	2025-02-18 13:34:14 +08:00
Xiangyu Tian	09ed96082b	Add DeepSeek V3/R1 CPU example (#12836 ) Add DeepSeek V3/R1 CPU example for bf16 model	2025-02-18 12:45:49 +08:00
Yishuo Wang	8418450300	optimize minicpm-o's tts part (#12833 )	2025-02-17 14:53:37 +08:00
Shaojun Liu	f7b5a093a7	Merge CPU & XPU Dockerfiles with Serving Images and Refactor (#12815 ) * Update Dockerfile * Update Dockerfile * Ensure scripts are executable * Update Dockerfile * Update Dockerfile * Update Dockerfile * Update Dockerfile * Update Dockerfile * Update Dockerfile * update * Update Dockerfile * remove inference-cpu and inference-xpu * update README	2025-02-17 14:23:22 +08:00
Jason Dai	eaec64baca	Update README.md (#12826 )	2025-02-14 21:20:57 +08:00
joan726	59e8e1e91e	Added ollama_portablze_zip_quickstart.zh-CN.md (#12822 )	2025-02-14 18:54:12 +08:00
Jason Dai	a09552e59a	Update ollama quickstart (#12823 )	2025-02-14 09:55:48 +08:00
Yuwen Hu	f67986021c	Update download link for Ollama portable zip QuickStart (#12821 ) * Update download link for Ollama portable zip quickstart * Update based on comments	2025-02-13 17:48:02 +08:00
Jason Dai	16e63cbc18	Update readme (#12820 )	2025-02-13 14:26:04 +08:00
Yuwen Hu	68414afcb9	Add initial QuickStart for Ollama portable zip (#12817 ) * Add initial quickstart for Ollama portable zip * Small fix * Fixed based on comments * Small fix * Add demo image for run ollama * Update download link	2025-02-13 13:18:14 +08:00
Wang, Jian4	1083fe5508	Reenable pp and lightweight-serving serving on 0.6.6 (#12814 ) * reenable pp ang lightweight serving on 066 * update readme * updat * update tag	2025-02-13 10:16:00 +08:00
Guancheng Fu	af693425f1	Upgrade to vLLM 0.6.6 (#12796 ) * init * update engine init * fix serving load_in_low_bit problem * temp * temp * temp * temp * temp * fix * fixed * done * fix * fix all arguments * fix * fix throughput script * fix * fix * use official ipex-llm * Fix readme * fix --------- Co-authored-by: hzjane <a1015616934@qq.com>	2025-02-12 16:47:51 +08:00
Yishuo Wang	f8ab833f74	support and optimize janus pro (#12813 )	2025-02-12 15:07:24 +08:00
Shaojun Liu	bd815a4d96	Update the base image of inference-cpp image to oneapi 2025.0.2 (#12802 ) * Update Dockerfile * Update Dockerfile * Update Dockerfile * Update Dockerfile * Update Dockerfile * Update Dockerfile * Update Dockerfile * Update Dockerfile * Update Dockerfile * Update Dockerfile	2025-02-12 14:15:08 +08:00
Yishuo Wang	73cfe293fa	add basic support for Baichuan-M1-14B-Instruct (#12808 )	2025-02-11 17:27:42 +08:00
binbin Deng	d093b75aa0	[NPU] Update driver installation in QuickStart (#12807 )	2025-02-11 15:49:21 +08:00
Xiangyu Tian	b70ad902b4	Fix ipex-llm CPU linear dtype not match (#12805 )	2025-02-11 10:34:44 +08:00
Shaojun Liu	2701a9d1e3	Remove Migrated Workflows to Avoid Duplication and Confusion (#12801 ) * Delete .github/actions/llm directory * Delete .github/workflows/release-ipex-llm.yaml * Delete .github/workflows/llm-nightly-test.yml * Delete .github/workflows/llm_unit_tests.yml * Delete .github/workflows/llm-binary-build.yml * Delete .github/workflows/llm_example_tests.yml * Delete .github/workflows/llm_performance_tests.yml * Delete .github/workflows/manually_build.yml * Delete .github/workflows/manually_build_for_testing.yml * Delete .github/workflows/release-pypi.yml	2025-02-10 14:58:08 +08:00
Yina Chen	eb2df5ed70	common.h -> npu/npu_common.h (#12800 )	2025-02-10 14:38:22 +08:00
Yishuo Wang	e4ceb722b6	fix qwen2 vl (#12798 )	2025-02-10 13:25:53 +08:00
binbin Deng	3fee838b14	[NPU] Fix of c++ convert example (#12797 )	2025-02-10 11:17:58 +08:00

3974 commits