Commit graph

  • 25e1709050
    Avoid errors caused by a too-new Transformers version (#13291) main Shaojun Liu 2025-08-14 14:52:47 +0800
  • cac90a9238
    update patches (#13290) Shaojun Liu 2025-08-14 10:15:48 +0800
  • 9cfdf143a2
    delete the deprecated llm win test (#13275) Yina Chen 2025-08-01 11:27:46 +0800
  • 891e1f511b
    [Doc] Add note about avoiding sourcing oneAPI for flashmoe and llama.cpp portable zip (#13274) Qiyuan Gong 2025-07-30 13:58:52 +0800
  • 951c23739d
    update quickstart md related to llama.cpp/ollama (#13265) SheldonChen 2025-07-21 16:20:20 +0800
  • 68c5103a0a
    [NPU] Update quickstart reference (#13262) Emmanuel Ferdman 2025-07-21 04:55:40 +0300
  • b229e5ad60
    Update README.md (#13258) Jason Dai 2025-07-18 07:27:01 +0800
  • f0b600da77
    update llama.cpp version (#13251) Yina Chen 2025-07-09 17:30:27 +0800
  • 28f72123bd
    update ollama version (#13244) Ruonan Wang 2025-07-01 09:20:46 +0800
  • 6ba3138d7c
    Fix ambiguous boolean evaluation in bert.py (#13236) zxue2 2025-06-30 14:14:01 +0800
  • 3f6d407be4
    Fix engine.py (#13215) Guancheng Fu 2025-06-09 09:03:17 +0800
  • 5a629ae470
    update vllm patch (#13211) Shaojun Liu 2025-06-06 17:20:45 +0800
  • ac04992278
    Update engine.py (#13209) Guancheng Fu 2025-06-06 15:47:33 +0800
  • dd49368e0c
    only install onednn for windows when torch 2.6 (#13207) Ruonan Wang 2025-06-05 17:28:21 +0800
  • 5a1c1297e1
    Fix internvl fp16 error (#13205) Wang, Jian4 2025-06-05 11:17:44 +0800
  • 45864790f7
    Enable phi-4 with vision and audio (#13203) Wang, Jian4 2025-06-05 10:15:20 +0800
  • e032156518
    Support torch_fp8 (#13196) Yina Chen 2025-06-04 20:08:01 +0800
  • 3accc31b86
    Update 1ccl_for_multi_arc.patch (#13199) Guancheng Fu 2025-05-30 17:13:59 +0800
  • bb50cd0881
    Update api_server.py (#13198) Guancheng Fu 2025-05-30 09:26:53 +0800
  • 9df610f80d
    fix trl import when not running speculative (#13187) Ruonan Wang 2025-05-26 13:21:54 +0800
  • c5d919b151
    update vllm patch (#13185) Shaojun Liu 2025-05-23 15:02:50 +0800
  • 531bef2810
    vLLM: Fix convert_to_half condition (#13177) Xiangyu Tian 2025-05-22 15:44:10 +0800
  • e3130a06ed
    Fix multimodal errors (#13178) Wang, Jian4 2025-05-22 15:39:27 +0800
  • 154af7d7f7
    vLLM: set convert_to_half to False by default (#13172) Xiangyu Tian 2025-05-21 18:41:28 +0800
  • 1576347892
    Update Dockerfile (#13168) Shaojun Liu 2025-05-20 16:41:13 +0800
  • 66eb054988
    Update vllm patch (#13164) Wang, Jian4 2025-05-19 16:54:21 +0800
  • d83e5068d2
    Enable whisper (#13162) Wang, Jian4 2025-05-19 14:07:51 +0800
  • 8ba57b41cd
    Add merge quantized qkv (#13160) Yina Chen 2025-05-16 15:46:47 +0800
  • 1e4e1353a0
    Resolve messages formatting issues (#13095) Emmanuel Ferdman 2025-05-15 11:46:52 +0300
  • 35b49e4d91
    Add trl version in error message (#13049) Kai Huang 2025-05-15 09:16:27 +0800
  • bd45bf7584
    Update llama_cpp_quickstart.md (#13145) Pranav Singh 2025-05-15 06:10:53 +0530
  • bd71739e64
    Update docs and scripts to align with new Docker image release (#13156) Shaojun Liu 2025-05-13 17:06:29 +0800
  • f6441b4e3d
    Add moe_softmax_topk (#13157) Yina Chen 2025-05-13 14:50:59 +0800
  • aa12f69bbf
    Update Ollama portable zip QuickStart regarding saving VRAM (#13155) Yuwen Hu 2025-05-13 13:25:22 +0800
  • 086a8b3ab9
    Update flashmoe_quickstart (#13154) Jason Dai 2025-05-13 07:56:09 +0800
  • 886c7632b2
    Add IPEX_LLM_FORCE_BATCH_FORWARD for vLLM docker image (#13151) Xiangyu Tian 2025-05-12 13:44:33 +0800
  • 5df03ced2c
    Update vllm patch to fix telechat2 and baichuan2 errors (#13150) Wang, Jian4 2025-05-12 10:54:22 +0800
  • 9da1c56fa8
    Create flashmoe quickstart (#13147) Jason Dai 2025-05-12 10:11:22 +0800
  • da08c9ca60
    Update Dockerfile (#13148) Guancheng Fu 2025-05-12 09:19:18 +0800
  • 0438e39f3e
    Add PyTorch 2.6 support in Latest Update (#13144) Yuwen Hu 2025-05-09 13:26:49 +0800
  • 45f7bf6688
    Refactor vLLM Documentation: Centralize Benchmarking and Improve Readability (#13141) Shaojun Liu 2025-05-09 10:19:42 +0800
  • f5d9c49a2a
    add rotary_half_with_cache_inplaced to ipex_llm.transformers.models.common (#13143) Ruonan Wang 2025-05-09 09:20:44 +0800
  • f2598b119e
    update for bge-m3 (#13138) Wang, Jian4 2025-05-07 16:59:52 +0800
  • e88a2aa65b
    Modify ollama num_ctx related doc (#13139) SONG Ge 2025-05-07 16:44:58 +0800
  • 3a28b69202
    Add qwen3 support (#13137) Yishuo Wang 2025-05-07 14:03:16 +0800
  • be76918b61
    Update 083 multimodal benchmark (#13135) Wang, Jian4 2025-05-07 09:35:09 +0800
  • 01bc7e9eb9
    Fix 083 lm_head error (#13132) Wang, Jian4 2025-05-06 15:47:20 +0800
  • 685a749adb
    Update ollama-release doc into v0.6.2 (#13094) SONG Ge 2025-04-30 16:22:42 +0800
  • 51b41faad7
    vLLM: update vLLM XPU to 0.8.3 version (#13118) Xiangyu Tian 2025-04-30 14:40:53 +0800
  • f66eee1d1d
    Update BMG troubleshooting guides regarding PPA installation (#13119) Yuwen Hu 2025-04-28 15:48:17 +0800
  • ad741503a9
    Update bmg_quickstart.md (#13117) Jason Dai 2025-04-27 22:03:14 +0800
  • 6b033f8982
    Update readme (#13116) Jason Dai 2025-04-27 18:18:19 +0800
  • d222eaffd7
    Update README.md (#13113) Guancheng Fu 2025-04-27 17:13:18 +0800
  • 16fa778e65
    enable glm4v and gemma-3 on vllm 083 (#13114) Wang, Jian4 2025-04-27 17:10:56 +0800
  • cf97d8f1d7
    Update start-vllm-service.sh (#13109) Guancheng Fu 2025-04-25 15:42:15 +0800
  • 9808fb1ac2
    update doc about flash-moe (#13103) Ruonan Wang 2025-04-24 17:53:14 +0800
  • 0cfdd399e7
    Update README.md (#13104) Guancheng Fu 2025-04-24 10:21:17 +0800
  • 908fdb982e
    small refactor and fix (#13101) Yishuo Wang 2025-04-22 14:45:31 +0800
  • 14cd613fe1
    Update vLLM docs with some new features (#13092) Guancheng Fu 2025-04-22 14:39:28 +0800
  • 0801d27a6f
    Remove PyTorch 2.3 support for Intel GPU (#13097) Yuwen Hu 2025-04-22 10:26:16 +0800
  • a2a35fdfad
    Update portable zip link (#13098) Yina Chen 2025-04-21 17:25:35 +0800
  • 2f78afcd2a
    Refactor some functions to ipex_llm.transformers.models.common (#13091) Ruonan Wang 2025-04-18 11:15:43 +0800
  • 73198d5b80
    Update to b17 image (#13085) Shaojun Liu 2025-04-17 16:18:22 +0800
  • db5edba786
    Update Dockerfile (#13081) Shaojun Liu 2025-04-16 09:18:46 +0800
  • fa56212bb3
    Update vLLM patch (#13079) Shaojun Liu 2025-04-15 16:55:29 +0800
  • f5aaa83649
    Update serving-xpu Dockerfile (#13077) Shaojun Liu 2025-04-15 13:34:14 +0800
  • cfadf3f2f7
    upgrade linux-libc-dev to fix CVEs (#13076) Shaojun Liu 2025-04-15 11:43:53 +0800
  • e08c6bd018
    Fix several models based on sdp api change (#13075) Ruonan Wang 2025-04-15 11:13:12 +0800
  • 7826152f5a
    update vllm patch (#13072) Shaojun Liu 2025-04-14 14:56:10 +0800
  • 10c30cdba9
    set woq_int4 as default int4 (#13021) Yishuo Wang 2025-04-14 14:10:59 +0800
  • 6693e8ab04
    Deepseek kv / sdp support (#13068) Ruonan Wang 2025-04-11 11:26:15 +0800
  • 3ee6dec0f8
    update vllm patch (#13064) Guancheng Fu 2025-04-10 15:03:37 +0800
  • 1d7f4a83ac
    Update documentation to build Docker image from Dockerfile instead of pulling from registry (#13057) Shaojun Liu 2025-04-09 16:40:20 +0800
  • cd0d4857b8
    ipex-llm 2.2.0 post-release update (#13053) Yuwen Hu 2025-04-07 17:41:22 +0800
  • ef852dcb4a
    add audio optimization for qwen2.5-omni (#13037) Yishuo Wang 2025-04-07 17:20:26 +0800
  • 7548c12b2c
    Update portable zip QuickStart regarding signature verification (#13050) Yuwen Hu 2025-04-07 13:34:00 +0800
  • 33ae52d083
    Small doc fix (#13045) Yuwen Hu 2025-04-03 17:35:22 +0800
  • 3cb718d715
    Small updates to Ollama portable zip quickstart (#13043) Yuwen Hu 2025-04-03 17:18:22 +0800
  • b73728c7ce
    Small updates to Ollama portable zip Quickstart (#13040) Yuwen Hu 2025-04-02 18:44:36 +0800
  • 4427012672
    Link updates to pytorch 2.6 quickstart (#13032) Yuwen Hu 2025-04-01 10:35:22 +0800
  • 633d1c72e7
    Add PyTorch 2.6 QuickStart for Intel GPU (#13024) Yuwen Hu 2025-04-01 10:21:38 +0800
  • 34b1b14225
    vLLM: Fix vLLM CPU dockerfile to resolve cmake deprecated issue (#13026) Xiangyu Tian 2025-03-31 16:09:25 +0800
  • 300eb01d98
    Add basic optimization for Qwen2.5 omni (#13022) Yishuo Wang 2025-03-28 17:21:52 +0800
  • 61c2e9c271
    Refactor docker image by applying patch method (#13011) Guancheng Fu 2025-03-28 08:13:50 +0800
  • 7809ca9864
    Reuse --privileged (#13015) Wang, Jian4 2025-03-27 10:00:50 +0800
  • f437b36678
    Fix vllm glm edge model (#13007) Guancheng Fu 2025-03-26 09:25:32 +0800
  • 374747b492
    Update bert optimization to fit higher transformers/torch version (#13006) Yuwen Hu 2025-03-25 16:12:03 +0800
  • 27d669210f
    remove fschat in EAGLE example (#13005) Ruonan Wang 2025-03-25 15:48:48 +0800
  • 08f96a5139
    Rename LICENSE-Intel®-OpenMP*-Runtime-Library.txt to LICENSE-Intel®-OpenMP-Runtime-Library.txt (#13002) Shaojun Liu 2025-03-25 10:07:55 +0800
  • 0e0786a63c
    update llama.cpp related quickstart with rebased llama.cpp (#12996) Ruonan Wang 2025-03-25 09:49:39 +0800
  • 7a86dd0569
    Remove unused Gradio (#12995) Shaojun Liu 2025-03-24 10:51:06 +0800
  • 46a4f53967
    OSPDT: add tpp licenses for release 2.2.0 (#12840) Shaojun Liu 2025-03-21 15:52:22 +0800
  • 5bdf57327d
    Remove ipex import in fastchat loader (#12984) Yuwen Hu 2025-03-20 18:29:00 +0800
  • 6f634b41da
    Update model support list regarding Gemma3 for Ollama portable zip QuickStart (#12979) Yuwen Hu 2025-03-19 11:16:45 +0800
  • dd026db50b
    Add SNC to llama.cpp portable zip quick start (#12972) Qiyuan Gong 2025-03-17 10:58:06 +0800
  • b0d56273a8
    Fix Docker build failure due to outdated ipex-llm pip index URL (#12977) Shaojun Liu 2025-03-17 10:46:01 +0800
  • 760abc47aa
    Fix Docker build failure due to outdated ipex-llm pip index URL (#12976) Shaojun Liu 2025-03-17 09:50:09 +0800
  • 03c9024209
    Update README (#12973) Jason Dai 2025-03-14 19:04:10 +0800
  • 6a7819f1ac
    Update portable zip related quickstart regarding recommended driver (#12970) Yuwen Hu 2025-03-14 16:34:24 +0800
  • c9ecb7a113
    Fix qwen nan value issue on vllm (#12971) Wang, Jian4 2025-03-14 14:43:54 +0800