ipex-llm

intel/ipex-llm - Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, DeepSeek, Mixtral, Gemma, Phi, MiniCPM, Qwen-VL, MiniCPM-V, etc.) on Intel XPU (e.g., local PC with iGPU and NPU, discrete GPU such as Arc, Flex and Max); seamlessly integrate with llama.cpp, Ollama, HuggingFace, LangChain, LlamaIndex, vLLM, DeepSpeed, Axolotl, etc.

Find a file

Shengsheng Huang f2e4c40cee change the readthedocs theme and reorg the sections (#6056 ) * refactor toc * refactor toc * Change to pydata-sphinx-theme and update packages requirement list for ReadtheDocs * Remove customized css for old theme * Add index page to each top bar section and limit dropdown maximum to be 4 * Use js to change 'More' to 'Libraries' * Add custom.css to conf.py for further css changes * Add BigDL logo and search bar * refactor toc * refactor toc and add overview * refactor toc and add overview * refactor toc and add overview * refactor get started * add paper and video section * add videos * add grid columns in landing page * add document roadmap to index * reapply search bar and github icon commit * reorg orca and chronos sections * Test: weaken ads by js * update: change left attrbute * update: add comments * update: change opacity to 0.7 * Remove useless theme template override for old theme * Add sidebar releases component in the home page * Remove sidebar search and restore top nav search button * Add BigDL handouts * Add back to homepage button to pages except from the home page * Update releases contents & styles in left sidebar * Add version badge to the top bar * Test: weaken ads by js * update: add comments * remove landing page contents * rfix chronos install * refactor install * refactor chronos section titles * refactor nano index * change chronos landing * revise chronos landing page * add document navigator to nano landing page * revise install landing page * Improve css of versions in sidebar * Make handouts image pointing to a page in new tab * add win guide to install * add dliib installation * revise title bar * rename index files * add index page for user guide * add dllib and orca API * update user guide landing page * refactor side bar * Remove extra style configuration of card components & make different card usage consistent * Remove extra styles for Nano how-to guides * Remove extra styles for Chronos how-to guides * Remove dark mode for now * Update index page description * Add decision tree for choosing BigDL libraries in index page * add dllib models api, revise core layers formats * Change primary & info color in light mode * Restyle card components * Restructure Chronos landing page * Update card style * Update BigDL library selection decision tree * Fix failed Chronos tutorials filter * refactor PPML documents * refactor and add friesian documents * add friesian arch diagram * update landing pages and fill key features guide index page * Restyle link card component * Style video frames in PPML sections * Adjust Nano landing page * put api docs to the last in index for convinience * Make badge horizontal padding smaller & small changes * Change the second letter of all header titles to be small capitalizd * Small changes on Chronos index page * Revise decision tree to make it smaller * Update: try to change the position of ads. * Bugfix: deleted nonexist file config * Update: update ad JS/CSS/config * Update: change ad. * Update: delete my template and change files. * Update: change chronos installation table color. * Update: change table font color to --pst-color-primary-text * Remove old contents in landing page sidebar * Restyle badge for usage in card footer again * Add quicklinks template on landing page sidebar * add quick links * Add scala logo * move tf, pytorch out of the link * change orca key features cards * fix typo * fix a mistake in wording * Restyle badge for card footer * Update decision tree * Remove useless html templates * add more api docs and update tutorials in dllib * update chronos install using new style * merge changes in nano doc from master * fix quickstart links in sidebar quicklinks * Make tables responsive * Fix overflow in api doc * Fix list indents problems in [User guide] section * Further fixes to nested bullets contents in [User Guide] section * Fix strange title in Nano 5-min doc * Fix list indent problems in [DLlib] section * Fix misnumbered list problems and other small fixes for [Chronos] section * Fix list indent problems and other small fixes for [Friesian] section * Fix list indent problem and other small fixes for [PPML] section * Fix list indent problem for developer guide * Fix list indent problem for [Cluster Serving] section * fix dllib links * Fix wrong relative link in section landing page Co-authored-by: Yuwen Hu <yuwen.hu@intel.com> Co-authored-by: Juntao Luo <1072087358@qq.com>		2022-10-18 15:35:31 +08:00
.github	[PPML]build gramine image with all log (#6123 )	2022-10-14 10:51:34 +08:00
docs/readthedocs	change the readthedocs theme and reorg the sections (#6056 )	2022-10-18 15:35:31 +08:00
.gitignore	Update multi-task example (#5787 )	2022-09-16 09:45:43 +08:00
.readthedocs.yml	update readthedocs config (#4322 )	2022-03-29 18:19:38 +08:00
LICENSE	Add LICENSE and SECURITY files via upload (#3722 )	2021-12-14 14:49:30 +08:00
pyproject.toml	update readthedocs conf (#3158 )	2021-10-20 18:28:57 +08:00
README.md	Update Project Readme (#6086 )	2022-10-11 16:49:30 +08:00
SECURITY.md	Add LICENSE and SECURITY files via upload (#3722 )	2021-12-14 14:49:30 +08:00

README.md

Fast, Distributed, Secure AI for Big Data

BigDL seamlessly scales your data analytics & AI applications from laptop to cloud, with the following libraries:

Orca: Distributed Big Data & AI (TF & PyTorch) Pipeline on Spark and Ray
Nano: Transparent Acceleration of Tensorflow & PyTorch Programs
DLlib: “Equivalent of Spark MLlib” for Deep Learning
Chronos: Scalable Time Series Analysis using AutoML
Friesian: End-to-End Recommendation Systems
PPML (experimental): Secure Big Data and AI (with SGX Hardware Security)

For more information, you may read the docs.

Choosing the right BigDL library

flowchart TD;
    Feature1{{HW Secured Big Data & AI?}};
    Feature1-- No -->Feature2{{Python vs. Scala/Java?}};
    Feature1-- "Yes"  -->ReferPPML([<em><strong>PPML</strong></em>]);
    Feature2-- Python -->Feature3{{What type of application?}};
    Feature2-- Scala/Java -->ReferDLlib([<em><strong>DLlib</strong></em>]);
    Feature3-- "Distributed Big Data + AI (TF/PyTorch)" -->ReferOrca([<em><strong>Orca</strong></em>]);
    Feature3-- Accelerating TensorFlow / PyTorch -->ReferNano([<em><strong>Nano</strong></em>]);
    Feature3-- DL for Spark MLlib -->ReferDLlib2([<em><strong>DLlib</strong></em>]);
    Feature3-- High Level App Framework -->Feature4{{Domain?}};
    Feature4-- Time Series -->ReferChronos([<em><strong>Chronos</strong></em>]);
    Feature4-- Recommendation System -->ReferFriesian([<em><strong>Friesian</strong></em>]);
    
    click ReferNano "https://github.com/intel-analytics/bigdl#nano"
    click ReferOrca "https://github.com/intel-analytics/bigdl#orca"
    click ReferDLlib "https://github.com/intel-analytics/bigdl#dllib"
    click ReferDLlib2 "https://github.com/intel-analytics/bigdl#dllib"
    click ReferChronos "https://github.com/intel-analytics/bigdl#chronos"
    click ReferFriesian "https://github.com/intel-analytics/bigdl#friesian"
    click ReferPPML "https://github.com/intel-analytics/bigdl#ppml"
    
    classDef ReferStyle1 fill:#5099ce,stroke:#5099ce;
    classDef Feature fill:#FFF,stroke:#08409c,stroke-width:1px;
    class ReferNano,ReferOrca,ReferDLlib,ReferDLlib2,ReferChronos,ReferFriesian,ReferPPML ReferStyle1;
    class Feature1,Feature2,Feature3,Feature4,Feature5,Feature6,Feature7 Feature;

Installing

To install BigDL, we recommend using conda environment:
```
conda create -n my_env 
conda activate my_env
pip install bigdl
```
To install latest nightly build, use pip install --pre --upgrade bigdl; see Python and Scala user guide for more details.
To install each individual library, such as Chronos, use pip install bigdl-chronos; see the document website for more details.

Getting Started

Orca

The Orca library seamlessly scales out your single node TensorFlow, PyTorch or OpenVINO programs across large clusters (so as to process distributed Big Data).

Show Orca example

You can build end-to-end, distributed data processing & AI programs using Orca in 4 simple steps:

# 1. Initilize Orca Context (to run your program on K8s, YARN or local laptop)
from bigdl.orca import init_orca_context, OrcaContext
sc = init_orca_context(cluster_mode="k8s", cores=4, memory="10g", num_nodes=2) 

# 2. Perform distribtued data processing (supporting Spark DataFrames,
# TensorFlow Dataset, PyTorch DataLoader, Ray Dataset, Pandas, Pillow, etc.)
spark = OrcaContext.get_spark_session()
df = spark.read.parquet(file_path)
df = df.withColumn('label', df.label-1)
...

# 3. Build deep learning models using standard framework APIs
# (supporting TensorFlow, PyTorch, Keras, OpenVino, etc.)
from tensorflow import keras
...
model = keras.models.Model(inputs=[user, item], outputs=predictions)  
model.compile(...)

# 4. Use Orca Estimator for distributed training/inference
from bigdl.orca.learn.tf.estimator import Estimator
est = Estimator.from_keras(keras_model=model)  
est.fit(data=df,
        feature_cols=['user', 'item'],
        label_cols=['label'],
        ...)

See Orca user guide, as well as TensorFlow and PyTorch quickstart, for more details.

In addition, you can also run standard Ray programs on Spark cluster using RayOnSpark in Orca.

Show RayOnSpark example

You can not only run Ray program on Spark cluster, but also write Ray code inline with Spark code (so as to process the in-memory Spark RDDs or DataFrames) using RayOnSpark in Orca.

# 1. Initilize Orca Context (to run your program on K8s, YARN or local laptop)
from bigdl.orca import init_orca_context, OrcaContext
sc = init_orca_context(cluster_mode="yarn", cores=4, memory="10g", num_nodes=2, init_ray_on_spark=True) 

# 2. Distribtued data processing using Spark
spark = OrcaContext.get_spark_session()
df = spark.read.parquet(file_path).withColumn(...)

# 3. Convert Spark DataFrame to Ray Dataset
from bigdl.orca.data import spark_df_to_ray_dataset
dataset = spark_df_to_ray_dataset(df)

# 4. Use Ray to operate on Ray Datasets
import ray

@ray.remote
def consume(data) -> int:
   num_batches = 0
   for batch in data.iter_batches(batch_size=10):
       num_batches += 1
   return num_batches

print(ray.get(consume.remote(dataset)))

See RayOnSpark user guide and quickstart for more details.

Nano

You can transparently accelerate your TensorFlow or PyTorch programs on your laptop or server using Nano. With minimum code changes, Nano automatically applies modern CPU optimizations (e.g., SIMD, multiprocessing, low precision, etc.) to standard TensorFlow and PyTorch code, with up-to 10x speedup.

Show Nano inference example

You can automatically optimize a trained PyTorch model for inference or deployment using Nano:

model = ResNet18().load_state_dict(...)
train_dataloader = ...
val_dataloader = ...
def accuracy (pred, target):
  ... 

from bigdl.nano.pytorch import InferenceOptimizer
optimizer = InferenceOptimizer()
optimizer.optimize(model,
                   training_data=train_dataloader,
                   validation_data=val_dataloader,
                   metric=accuracy)
new_model, config = optimizer.get_best_model()

optimizer.summary()

The output of optimizer.summary() will be something like:

 -------------------------------- ---------------------- -------------- ----------------------
|             method             |        status        | latency(ms)  |       accuracy       |
 -------------------------------- ---------------------- -------------- ----------------------
|            original            |      successful      |    43.688    |        0.969         |
|           fp32_ipex            |      successful      |    33.383    |    not recomputed    |
|              bf16              |   fail to forward    |     None     |         None         |
|           bf16_ipex            |    early stopped     |   203.897    |         None         |
|              int8              |      successful      |    10.74     |        0.969         |
|            jit_fp32            |      successful      |    38.732    |    not recomputed    |
|         jit_fp32_ipex          |      successful      |    35.205    |    not recomputed    |
|  jit_fp32_ipex_channels_last   |      successful      |    19.327    |    not recomputed    |
|         openvino_fp32          |      successful      |    10.215    |    not recomputed    |
|         openvino_int8          |      successful      |    8.192     |        0.969         |
|        onnxruntime_fp32        |      successful      |    20.931    |    not recomputed    |
|    onnxruntime_int8_qlinear    |      successful      |    8.274     |        0.969         |
|    onnxruntime_int8_integer    |   fail to convert    |     None     |         None         |
 -------------------------------- ---------------------- -------------- ----------------------

Optimization cost 64.3s in total.

Show Nano Training example

You may easily accelerate PyTorch training (e.g., IPEX, BF16, Multi-Instance Training, etc.) using Nano:

model = ResNet18()
optimizer = torch.optim.SGD(...)
train_loader = ...
val_loader = ...

from bigdl.nano.pytorch import TorchNano

# Define your training loop inside `TorchNano.train`
class Trainer(TorchNano):
	def train(self):
	# call `setup` to prepare for model, optimizer(s) and dataloader(s) for accelerated training
	model, optimizer, (train_loader, val_loader) = self.setup(model, optimizer,
  train_loader, val_loader)
  
    for epoch in range(num_epochs):  
      model.train()  
      for data, target in train_loader:  
        optimizer.zero_grad()  
        output = model(data)  
        # replace the loss.backward() with self.backward(loss)  
        loss = loss_fuc(output, target)  
        self.backward(loss)  
        optimizer.step()   

# Accelerated training (IPEX, BF16 and Multi-Instance Training)
Trainer(use_ipex=True, precision='bf16', num_processes=2).train()

See Nano user guide and tutotial for more details.

DLlib

With DLlib, you can write distributed deep learning applications as standard (Scala or Python) Spark programs, using the same Spark DataFrames and ML Pipeline APIs.

Show DLlib Scala example

You can build distributed deep learning applications for Spark using DLlib Scala APIs in 3 simple steps:

// 1. Call `initNNContext` at the beginning of the code: 
import com.intel.analytics.bigdl.dllib.NNContext
val sc = NNContext.initNNContext()

// 2. Define the deep learning model using Keras-style API in DLlib:
import com.intel.analytics.bigdl.dllib.keras.layers._
import com.intel.analytics.bigdl.dllib.keras.Model
val input = Input[Float](inputShape = Shape(10))  
val dense = Dense[Float](12).inputs(input)  
val output = Activation[Float]("softmax").inputs(dense)  
val model = Model(input, output)

// 3. Use `NNEstimator` to train/predict/evaluate the model using Spark DataFrame and ML pipeline APIs
import org.apache.spark.sql.SparkSession
import org.apache.spark.ml.feature.MinMaxScaler
import org.apache.spark.ml.Pipeline
import com.intel.analytics.bigdl.dllib.nnframes.NNEstimator
import com.intel.analytics.bigdl.dllib.nn.CrossEntropyCriterion
import com.intel.analytics.bigdl.dllib.optim.Adam
val spark = SparkSession.builder().getOrCreate()
val trainDF = spark.read.parquet("train_data")
val validationDF = spark.read.parquet("val_data")
val scaler = new MinMaxScaler().setInputCol("in").setOutputCol("value")
val estimator = NNEstimator(model, CrossEntropyCriterion())  
        .setBatchSize(128).setOptimMethod(new Adam()).setMaxEpoch(5)
val pipeline = new Pipeline().setStages(Array(scaler, estimator))

val pipelineModel = pipeline.fit(trainDF)  
val predictions = pipelineModel.transform(validationDF)

Show DLlib Python example

You can build distributed deep learning applications for Spark using DLlib Python APIs in 3 simple steps:

# 1. Call `init_nncontext` at the beginning of the code:
from bigdl.dllib.nncontext import init_nncontext
sc = init_nncontext()

# 2. Define the deep learning model using Keras-style API in DLlib:
from bigdl.dllib.keras.layers import Input, Dense, Activation
from bigdl.dllib.keras.models import Model
input = Input(shape=(10,))
dense = Dense(12)(input)
output = Activation("softmax")(dense)
model = Model(input, output)

# 3. Use `NNEstimator` to train/predict/evaluate the model using Spark DataFrame and ML pipeline APIs
from pyspark.sql import SparkSession
from pyspark.ml.feature import MinMaxScaler
from pyspark.ml import Pipeline
from bigdl.dllib.nnframes import NNEstimator
from bigdl.dllib.nn.criterion import CrossEntropyCriterion
from bigdl.dllib.optim.optimizer import Adam
spark = SparkSession.builder.getOrCreate()
train_df = spark.read.parquet("train_data")
validation_df = spark.read.parquet("val_data")
scaler = MinMaxScaler().setInputCol("in").setOutputCol("value")
estimator = NNEstimator(model, CrossEntropyCriterion())\
    .setBatchSize(128)\
    .setOptimMethod(Adam())\
    .setMaxEpoch(5)
pipeline = Pipeline(stages=[scaler, estimator])

pipelineModel = pipeline.fit(train_df)
predictions = pipelineModel.transform(validation_df)

See DLlib NNFrames and Keras API user guides for more details.

Chronos

The Chronos library makes it easy to build fast, accurate and scalable time series analysis applications (with AutoML).

Show Chronos example

You can train a time series forecaster using Chronos in 3 simple steps:

from bigdl.chronos.forecaster import TCNForecaster 
from bigdl.chronos.data.repo_dataset import get_public_dataset

# 1. Process time series data using `TSDataset`
tsdata_train, tsdata_val, tsdata_test = get_public_dataset(name='nyc_taxi')
for tsdata in [tsdata_train, tsdata_val, tsdata_test]:
    data.roll(lookback=100, horizon=1)

# 2. Create a `TCNForecaster` (automatically configured based on train_data)
forecaster = TCNForecaster.from_tsdataset(train_data)

# 3. Train the forecaster for prediction
forecaster.fit(train_data)

pred = forecaster.predict(test_data)

To apply AutoML, use AutoTSEstimator instead of normal forecasters.

# Create and fit an `AutoTSEstimator`
from bigdl.chronos.autots import AutoTSEstimator
autotsest = AutoTSEstimator(model="tcn", future_seq_len=10)

tsppl = autotsest.fit(data=tsdata_train, validation_data=tsdata_val)
pred = tsppl.predict(tsdata_test)

See Chronos user guide and quick start for more details.

Friesian

The Chronos library makes it easy to build end-to-end, large-scale recommedation system (including offline feature transformation and traning, near-line feature and model update, and online serving pipeline).

See Freisian readme for more details.

PPML

BigDL PPML provides a hardware (Intel SGX) protected Trusted Cluster Environment for running distributed Big Data & AI applications (in a secure fashion on private or public cloud).

See PPML tutorial and user guide for more details.

Getting Support

Citation

If you've found BigDL useful for your project, you may cite the paper as follows:

@INPROCEEDINGS{9880257,
  author={Dai, Jason Jinquan and Ding, Ding and Shi, Dongjie and Huang, Shengsheng and Wang, Jiao and Qiu, Xin and Huang, Kai and Song, Guoqiong and Wang, Yang and Gong, Qiyuan and Song, Jiaming and Yu, Shan and Zheng, Le and Chen, Yina and Deng, Junwei and Song, Ge},
  booktitle={2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)}, 
  title={BigDL 2.0: Seamless Scaling of AI Pipelines from Laptops to Distributed Cluster}, 
  year={2022},
  volume={},
  number={},
  pages={21407-21414},
  doi={10.1109/CVPR52688.2022.02076}}