ipex-llm/python/llm/example/CPU/HF-Transformers-AutoModels/Model
Keyan (Kyrie) Zhang 59861f73e5 Add Deepseek-6.7B (#9991)
* Add new example Deepseek

* modify deepseek

* Add verified model in README

* Turn cpu_embedding=True in Deepseek example

---------

Co-authored-by: Shengsheng Huang <shengsheng.huang@intel.com>
2024-02-28 11:36:39 +08:00
aquila
aquila2
baichuan
baichuan2
bluelm
chatglm Using bigdl-llm-init instead of bigdl-nano-init (#9558) 2023-11-30 10:10:29 +08:00
chatglm2 Using bigdl-llm-init instead of bigdl-nano-init (#9558) 2023-11-30 10:10:29 +08:00
chatglm3
codellama LLM: fix installation of codellama (#9813) 2024-01-02 14:32:50 +08:00
codeshell Using bigdl-llm-init instead of bigdl-nano-init (#9558) 2023-11-30 10:10:29 +08:00
deciLM-7b Add CPU and GPU examples for DeciLM-7B (#9867) 2024-02-27 13:15:49 +08:00
deepseek Add Deepseek-6.7B (#9991) 2024-02-28 11:36:39 +08:00
deepseek-moe Add DeepSeek-MoE-16B-Chat (#10155) 2024-02-28 10:12:09 +08:00
distil-whisper
dolly_v1
dolly_v2
falcon falcon for transformers 4.36 (#9960) 2024-02-22 17:04:40 -08:00
flan-t5
fuyu
gemma update Gemma readme (#10229) 2024-02-23 16:57:08 +08:00
internlm Using bigdl-llm-init instead of bigdl-nano-init (#9558) 2023-11-30 10:10:29 +08:00
internlm-xcomposer
internlm2 Add HF and PyTorch example InternLM2 (#10061) 2024-02-04 10:25:55 +08:00
llama2 [LLM] Correct prompt format of Yi, Llama2 and Qwen in generate.py (#9786) 2023-12-26 16:57:55 +08:00
mistral
mixtral
moss Using bigdl-llm-init instead of bigdl-nano-init (#9558) 2023-11-30 10:10:29 +08:00
mpt Using bigdl-llm-init instead of bigdl-nano-init (#9558) 2023-11-30 10:10:29 +08:00
phi-1_5 Using bigdl-llm-init instead of bigdl-nano-init (#9558) 2023-11-30 10:10:29 +08:00
phi-2 Add CPU and GPU examples of phi-2 (#10014) 2024-02-23 14:05:53 +08:00
phixtral add phixtral and optimize phi-moe (#10052) 2024-02-05 11:12:47 +08:00
phoenix Using bigdl-llm-init instead of bigdl-nano-init (#9558) 2023-11-30 10:10:29 +08:00
qwen [LLM] Correct prompt format of Qwen in generate.py (#9678) 2023-12-14 14:01:30 +08:00
qwen-vl
qwen1.5 Add Qwen1.5-7B-Chat (#10113) 2024-02-21 13:29:29 +08:00
redpajama
replit Using bigdl-llm-init instead of bigdl-nano-init (#9558) 2023-11-30 10:10:29 +08:00
skywork Using bigdl-llm-init instead of bigdl-nano-init (#9558) 2023-11-30 10:10:29 +08:00
solar Fix README.md for solar (#9957) 2024-01-24 15:50:54 +08:00
starcoder Using bigdl-llm-init instead of bigdl-nano-init (#9558) 2023-11-30 10:10:29 +08:00
vicuna Using bigdl-llm-init instead of bigdl-nano-init (#9558) 2023-11-30 10:10:29 +08:00
whisper
wizardcoder-python
yi [LLM] Correct prompt format of Yi, Llama2 and Qwen in generate.py (#9786) 2023-12-26 16:57:55 +08:00
yuan2 Add CPU and GPU examples for Yuan2-2B-hf (#9946) 2024-02-23 14:09:30 +08:00
ziya Add ziya CPU example (#10114) 2024-02-20 13:59:52 +08:00
README.md

BigDL-LLM Transformers INT4 Optimization for Large Language Models

You can use BigDL-LLM to run any Hugging Face Transformers model with INT4 optimizations on either servers or laptops. This directory contains example scripts to help you get started quickly with some popular open-source models. Each model has its own folder, where you can find detailed instructions on how to install and run it.
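
As a rough illustration of what the per-model scripts do, the sketch below loads a checkpoint through the `bigdl.llm.transformers` drop-in `AutoModelForCausalLM` class with `load_in_4bit=True` and generates a short completion. The function name `generate_with_int4`, its parameters, and the default token budget are hypothetical choices made for this sketch. Each model folder ships its own script and prompt format, which should be preferred over this generic version.

```python
def generate_with_int4(model_path, prompt, max_new_tokens=32):
    """Load a causal LM with INT4 optimization and return generated text.

    `model_path` is a placeholder: a Hugging Face repo id or a local
    checkpoint directory chosen by the caller.
    """
    # Imports are local so the function can be defined without the
    # packages installed; they are required once it is called.
    import torch
    from bigdl.llm.transformers import AutoModelForCausalLM
    from transformers import AutoTokenizer

    # load_in_4bit=True asks BigDL-LLM to convert the weights to INT4
    # while loading.
    model = AutoModelForCausalLM.from_pretrained(
        model_path,
        load_in_4bit=True,
        trust_remote_code=True,
    )
    tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

    with torch.inference_mode():
        input_ids = tokenizer.encode(prompt, return_tensors="pt")
        output = model.generate(input_ids, max_new_tokens=max_new_tokens)

    return tokenizer.decode(output[0], skip_special_tokens=True)
```

A call would look like `generate_with_int4("path/to/model", "What is AI?")`, where the path is whatever checkpoint you have downloaded. Chat-tuned models usually expect a model-specific prompt template, which this sketch does not apply.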

To run the examples, we recommend Intel® Xeon® processors (server) or 12th Gen or later Intel® Core™ processors (client).

Supported operating systems are Ubuntu 20.04 or later (glibc>=2.17), CentOS 7 or later (glibc>=2.17), and Windows 10/11.
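
On Linux, the glibc requirement can be checked before installing. The following is a minimal sketch using only the Python standard library. The helper names `parse_version` and `glibc_ok` are made up for this example, and the check is not part of BigDL-LLM.

```python
import platform
import re

MIN_GLIBC = (2, 17)  # minimum glibc version stated in the requirements above


def parse_version(text):
    """Turn a version string such as '2.31' into a tuple of ints, e.g. (2, 31)."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first non-numeric component
        parts.append(int(match.group()))
    return tuple(parts)


def glibc_ok(version_text, minimum=MIN_GLIBC):
    """Return True if version_text parses to a version >= minimum."""
    parsed = parse_version(version_text)
    return bool(parsed) and parsed >= minimum


# platform.libc_ver() inspects the running Python executable; on systems
# without glibc (for example Windows) it typically returns empty strings.
lib, version = platform.libc_ver()
if lib == "glibc":
    status = "meets" if glibc_ok(version) else "is below"
    print(f"glibc {version} {status} the 2.17 minimum")
else:
    print("glibc not detected; the glibc requirement applies to Linux only")
```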

Best Known Configuration on Linux

For better performance on Linux, we recommend setting the relevant environment variables with the bigdl-llm-init script that is installed with BigDL-LLM:

pip install bigdl-llm
source bigdl-llm-init
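
To see what the sourced script changed, you can start Python from the same shell and inspect the environment. The sketch below is only an assumption-laden helper: the variable names in `CANDIDATE_VARS` are commonly tuned threading and allocator settings chosen for illustration, and this README does not state which variables `bigdl-llm-init` actually sets.

```python
import os

# Illustrative guesses only; the README does not list the variables
# that bigdl-llm-init exports.
CANDIDATE_VARS = ["OMP_NUM_THREADS", "KMP_AFFINITY", "LD_PRELOAD"]


def report_env(names, environ=None):
    """Map each name to its value in environ, or None if it is unset."""
    if environ is None:
        environ = os.environ
    return {name: environ.get(name) for name in names}


for name, value in report_env(CANDIDATE_VARS).items():
    print(f"{name} = {value if value is not None else '<unset>'}")
```

Because `source` only affects the current shell, the script has to be sourced again in every new terminal session before running an example.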