add serving index to readme (#3648)

This commit is contained in:
Song Jiaming 2021-12-02 16:57:17 +08:00 committed by GitHub
parent 1951149792
commit a83ab371d2
11 changed files with 171 additions and 8 deletions

View file

@ -18,6 +18,8 @@ BigDL makes it easy for data scientists and data engineers to build end-to-end,
* [Chronos](#getting-started-with-chronos): scalable time series analysis using AutoML
* [PPML](#ppml-privacy-preserving-machine-learning): privacy preserving big data analysis and machine learning (*experimental*)
* [Serving](#getting-started-with-serving): Distributed and Automated Model Inference on Big Data Streaming Frameworks
For more information, you may [read the docs](https://bigdl.readthedocs.io/).
@ -199,6 +201,13 @@ See the Chronos [user guide](https://bigdl.readthedocs.io/en/latest/doc/Chronos/
See the [PPML user guide](https://bigdl.readthedocs.io/en/latest/doc/PPML/Overview/ppml.html) for more details.
## Getting Started with Serving
BigDL Cluster Serving is an end-to-end pipeline for scaling out local applications. We recommend starting with the following example:
[Migrate Keras application to Serving](https://bigdl.readthedocs.io/en/latest/doc/Serving/Example/keras-to-cluster-serving-example.ipynb)
See the Serving [user guide](https://bigdl.readthedocs.io/en/latest/doc/Serving/Overview/serving.html) and [quickstart](https://bigdl.readthedocs.io/en/latest/doc/Serving/QuickStart/serving-quickstart.html) for more details.
## More information
- [Document Website](https://bigdl.readthedocs.io/)

View file

@ -25,6 +25,8 @@ sys.path.insert(0, os.path.abspath("../../../python/friesian/src/"))
sys.path.insert(0, os.path.abspath("../../../python/chronos/src/"))
sys.path.insert(0, os.path.abspath("../../../python/dllib/src/"))
sys.path.insert(0, os.path.abspath("../../../python/orca/src/"))
sys.path.insert(0, os.path.abspath("../../../python/serving/src/"))
# -- Project information -----------------------------------------------------

View file

@ -1,4 +1,4 @@
# BigDL Cluster Serving Example
# Cluster Serving Example
Some examples are provided for new users and existing TensorFlow users.
## End-to-end Example

View file

@ -0,0 +1,118 @@
# Contribute to Cluster Serving
This is the guide for contributing your code to Cluster Serving.
Cluster Serving builds on the Analytics Zoo core and integrates deep learning frameworks such as TensorFlow, OpenVINO, and PyTorch; it implements the inference logic on top of them and parallelizes the computation with Flink and Redis by default. To contribute more features to Cluster Serving, refer to the following sections.
## Dev Environment
### Get Code and Prepare Branch
Go to the Analytics Zoo main repo at https://github.com/intel-analytics/analytics-zoo, click Fork to fork it to your GitHub account, and git clone the forked repo to your local machine. Use `git checkout -b your_branch_name` to create a new branch; you can write your code on this branch and open a pull request to Analytics Zoo from it.
### Environment Set up
You can refer to the [Analytics Zoo Scala Developer Guide](https://analytics-zoo.readthedocs.io/en/latest/doc/UserGuide/develop.html#scala) to set up the development environment. Cluster Serving is an Analytics Zoo Scala module.
### Debug in IDE
Cluster Serving depends on Flink and Redis. To install Redis and start the Redis server:
```
$ export REDIS_VERSION=5.0.5
$ wget http://download.redis.io/releases/redis-${REDIS_VERSION}.tar.gz && \
tar xzf redis-${REDIS_VERSION}.tar.gz && \
rm redis-${REDIS_VERSION}.tar.gz && \
cd redis-${REDIS_VERSION} && \
make
$ ./src/redis-server
```
In the IDE, an embedded Flink is used, so no extra Flink dependency is needed.
Once set up, copy `/path/to/analytics-zoo/scripts/cluster-serving/config.yaml` to `/path/to/analytics-zoo/config.yaml`, and run `zoo/src/main/scala/com/intel/analytics/zoo/serving/ClusterServing.scala` in the IDE. Since the IDE treats `/path/to/analytics-zoo/` as the current directory, it will read the config file from there.
Run `zoo/src/main/scala/com/intel/analytics/zoo/serving/http/Frontend2.scala` if you use the HTTP frontend.
Once started, you can run the Python client code to complete an end-to-end test, just as when running Cluster Serving following the [Programming Guide](https://github.com/intel-analytics/analytics-zoo/blob/master/docs/docs/ClusterServingGuide/ProgrammingGuide.md#4-model-inference).
### Test Package
Once you have written the code and completed testing in the IDE, you can package the jar and test it.
To package:
```
cd /path/to/analytics-zoo/zoo
./make-dist.sh
```
Then, from the `target` folder, copy `analytics-zoo-xxx-flink-udf.jar` to your test directory and rename it to `zoo.jar`; also copy `config.yaml` to your test directory.
You can copy `/path/to/analytics-zoo/scripts/cluster-serving/cluster-serving-start` to start Cluster Serving; this script starts the Redis server for you and submits the Flink job. If you prefer to manage Redis yourself, you can use the command from that script, `${FLINK_HOME}/bin/flink run -c com.intel.analytics.zoo.serving.ClusterServing zoo.jar`, to start Cluster Serving.
To run the frontend, call `java -cp zoo.jar com.intel.analytics.zoo.serving.http.Frontend2`.
The rest is the same as testing in the IDE.
## Add Features
### Data Connector
The data connector is the producer of Cluster Serving; remote clients put data into the data pipeline through it.
#### Scala code (The Server)
To define a new data connector, e.g. to Kafka, Redis, or another database, you first have to define a Flink Source.
You could refer to `com/intel/analytics/zoo/serving/engine/FlinkRedisSource.scala` as an example.
```
class FlinkRedisSource(params: ClusterServingHelper)
  extends RichParallelSourceFunction[List[(String, String)]] {
  @volatile var isRunning = true

  override def open(parameters: Configuration): Unit = {
    // initialize the connector
  }

  override def run(sourceContext: SourceFunction
    .SourceContext[List[(String, String)]]): Unit = while (isRunning) {
    // get data from the data pipeline
  }

  override def cancel(): Unit = {
    // close the connector
  }
}
```
Then you can refer to `com/intel/analytics/zoo/serving/engine/FlinkInference.scala` for the inference method used with your new connector. Usually it can be used directly without a new implementation, but you can still define your own method if needed.
Finally, you have to define a Flink Sink to write data back to the data pipeline.
You can refer to `com/intel/analytics/zoo/serving/engine/FlinkRedisSink.scala` as an example.
```
class FlinkRedisSink(params: ClusterServingHelper)
  extends RichSinkFunction[List[(String, String)]] {
  override def open(parameters: Configuration): Unit = {
    // initialize the connector
  }

  override def close(): Unit = {
    // close the connector
  }

  override def invoke(value: List[(String, String)], context: SinkFunction.Context[_]): Unit = {
    // write data to data pipeline
  }
}
```
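For reference, the Source, the inference function, and the Sink are typically composed into a single Flink streaming job. Below is a minimal sketch of that wiring; the object name is illustrative, and it assumes `FlinkInference` acts as a `MapFunction` over the same `List[(String, String)]` batches. See `ClusterServing.scala` for the actual job definition.
```
// Hypothetical wiring sketch; class names follow the examples above,
// but the actual job in ClusterServing.scala may differ.
import org.apache.flink.streaming.api.scala._

object ServingJobSketch {
  def run(params: ClusterServingHelper): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment
    env
      .addSource(new FlinkRedisSource(params)) // read batches from the data pipeline
      .map(new FlinkInference(params))         // run model inference on each batch
      .addSink(new FlinkRedisSink(params))     // write results back to the data pipeline
    env.execute("Cluster Serving")
  }
}
```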
Please note that you should normally handle space (memory or disk) control of your data pipeline in your own code.
Please place the Flink Source and Flink Sink code in `com/intel/analytics/zoo/serving/engine/`.
If you have methods that need to be wrapped as a class, you can place them in `com/intel/analytics/zoo/serving/pipeline/`.
#### Python Code (The Client)
You can refer to `pyzoo/zoo/serving/client.py` to define the client code for your data connector.
Please place this part of the code in `pyzoo/zoo/serving/data_pipeline_name/`, e.g. `pyzoo/zoo/serving/kafka/` if you create a Kafka connector.
##### put to data pipeline
It is recommended to refer to the `InputQueue.enqueue()` and `InputQueue.predict()` methods. These methods first call `self.data_to_b64` to encode the data and then add it to the data pipeline. You can define a similar enqueue method to work with your data connector.
##### get from data pipeline
It is recommended to refer to the `OutputQueue.query()` and `OutputQueue.dequeue()` methods. These methods get results from the data pipeline and call `self.get_ndarray_from_b64` to decode them. You can define a similar dequeue method to work with your data connector.
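As a hypothetical illustration, a Kafka connector's client could mirror this structure as sketched below. The class names, topic names, and payload format are assumptions for illustration only; the encoding loosely mirrors what `data_to_b64` and `get_ndarray_from_b64` do for the Redis client.
```
# Hypothetical sketch of a Kafka client; names and payload format are illustrative.
import base64
import json

import numpy as np
from kafka import KafkaConsumer, KafkaProducer


class KafkaInputQueue:
    def __init__(self, bootstrap_servers="localhost:9092", topic="serving_stream"):
        self.topic = topic
        self.producer = KafkaProducer(bootstrap_servers=bootstrap_servers)

    def enqueue(self, uri, data):
        # Base64-encode the ndarray, similar to data_to_b64 in the Redis client.
        payload = {
            "uri": uri,
            "data": base64.b64encode(data.astype(np.float32).tobytes()).decode("utf-8"),
            "shape": list(data.shape),
        }
        self.producer.send(self.topic, json.dumps(payload).encode("utf-8"))
        self.producer.flush()


class KafkaOutputQueue:
    def __init__(self, bootstrap_servers="localhost:9092", topic="serving_result"):
        self.consumer = KafkaConsumer(topic, bootstrap_servers=bootstrap_servers)

    def dequeue(self):
        # Decode each result back to an ndarray, similar to get_ndarray_from_b64.
        for record in self.consumer:
            result = json.loads(record.value.decode("utf-8"))
            raw = base64.b64decode(result["data"])
            yield result["uri"], np.frombuffer(raw, dtype=np.float32).reshape(result["shape"])
```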
## Benchmark Test
You can use `zoo/src/main/scala/com/intel/analytics/zoo/serving/engine/Operations.scala` to test the inference time of your model.
The script takes two arguments; run it with `-m modelPath` and `-j jsonPath` to indicate the path to the model and the path to the prepared JSON-format operation template of the model.
It will output inference time statistics for the preprocessing, prediction, and postprocessing stages, which vary with the preprocessing/postprocessing time and the number of threads.

View file

@ -1,4 +1,4 @@
# BigDL Cluster Serving FAQ
# Cluster Serving FAQ
## General Debug Guide
You can use the following guide to debug if Serving is not working properly.

View file

@ -1,4 +1,4 @@
# BigDL Cluster Serving Overview
# Cluster Serving User Guide
BigDL Cluster Serving is a lightweight, distributed, real-time serving solution that supports a wide range of deep learning models (such as TensorFlow, PyTorch, Caffe, BigDL and OpenVINO models). It provides a simple pub/sub API, so that users can easily send their inference requests to the input queue (using a simple Python API); Cluster Serving will then automatically manage the scale-out and real-time model inference across a large cluster (using distributed streaming frameworks such as Apache Spark Streaming and Apache Flink).
The overall architecture of the BigDL Cluster Serving solution is illustrated below:
@ -26,3 +26,24 @@ You can launch the Cluster Serving service by running the startup script on the
Cluster Serving provides a simple pub/sub API, so that you can easily send inference requests to an input queue (currently Redis Streams is used) using a simple Python API.
Cluster Serving will then read the requests from the Redis stream, run the distributed real-time inference across the cluster (using Flink), and return the results through Redis, so that you can retrieve the inference results using the same simple Python API.
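For illustration, a minimal client interaction might look like the sketch below, using the `InputQueue` and `OutputQueue` classes from the Python client; the module path, input name, and keyword argument are assumptions and may differ between versions.
```
# Illustrative sketch of the pub/sub client API; input name and keyword are assumptions.
import numpy as np
from zoo.serving.client import InputQueue, OutputQueue

# Publish an inference request to the input queue (Redis Streams by default).
input_api = InputQueue()
input_api.enqueue("my-input", t=np.random.rand(3, 224, 224))

# Retrieve the result once Cluster Serving has processed the request.
output_api = OutputQueue()
prediction = output_api.query("my-input")
```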
## Next Steps
### Deploy Cluster Serving
To deploy Cluster Serving, follow the steps below:
[1. Install Cluster Serving](https://bigdl.readthedocs.io/en/latest/doc/Serving/ProgrammingGuide/serving-installation.html)
[2. Start Cluster Serving](https://bigdl.readthedocs.io/en/latest/doc/Serving/ProgrammingGuide/serving-start.html)
[3. Inference by Cluster Serving](https://bigdl.readthedocs.io/en/latest/doc/Serving/ProgrammingGuide/serving-inference.html)
### Examples
You can find end-to-end examples of how to build a serving application from scratch or how to migrate an existing local application to Serving.
[Example link](https://bigdl.readthedocs.io/en/latest/doc/Serving/Example/example.html)
### Troubleshooting
Some frequently asked questions are answered in the [FAQ](https://bigdl.readthedocs.io/en/latest/doc/Serving/FAQ/faq.html).
### Contribute Guide
For contributors, check the [Contribute Guide](https://bigdl.readthedocs.io/en/latest/doc/Serving/FAQ/contribute-guide.html).

View file

@ -1,4 +1,4 @@
# BigDL Cluster Serving Programming Guide
# Inference by Cluster Serving
## Model Inference
Once you have finished the installation and launched the service, you can run inference using the Cluster Serving client API.

View file

@ -1,4 +1,4 @@
# BigDL Cluster Serving Programming Guide
# Install Cluster Serving
## Installation
It is recommended to install Cluster Serving by pulling the pre-built Docker image to your local node, which has all the required dependencies packaged. Alternatively, you may manually install Cluster Serving (through either pip or direct download) and Redis on the local node.

View file

@ -1,4 +1,4 @@
# BigDL Cluster Serving Programming Guide
# Start Cluster Serving
## Launching the Serving Service

View file

@ -1,4 +1,4 @@
# BigDL Cluster Serving Quick Start
# Cluster Serving Quick Start
This section provides a quick start example for running BigDL Cluster Serving. To simplify the example, we use Docker to run Cluster Serving. If you do not have Docker installed, [install docker](https://docs.docker.com/install/) first. The quick start example contains all the necessary components, so first-time users can get it up and running within minutes:

View file

@ -10,7 +10,7 @@ BigDL Documentation
* `RayOnSpark <doc/Ray/Overview/ray.html>`_: run Ray programs directly on Big Data clusters
* `Chronos <doc/Chronos/Overview/chronos.html>`_: scalable time series analysis using AutoML
* `PPML <doc/PPML/Overview/ppml.html>`_: privacy preserving big data analysis and machine learning (*experimental*)
* `Serving <doc/Serving/Overview/serving.html>`_: distributed and automated model inference on Big Data streaming frameworks
-------
@ -85,6 +85,19 @@ BigDL Documentation
doc/PPML/QuickStart/deploy_intel_sgx_device_plugin_for_kubernetes.md
doc/PPML/QuickStart/trusted-serving-on-k8s-guide.md
.. toctree::
:maxdepth: 1
:caption: Serving Overview
doc/Serving/Overview/serving.md
doc/Serving/QuickStart/serving-quickstart.md
doc/Serving/ProgrammingGuide/serving-installation.md
doc/Serving/ProgrammingGuide/serving-start.md
doc/Serving/ProgrammingGuide/serving-inference.md
doc/Serving/Example/example.md
doc/Serving/FAQ/faq.md
doc/Serving/FAQ/contribute-guide.md
.. toctree::
:maxdepth: 1
:caption: Common Use Case