PPML FL document (#6017)

This commit is contained in:
Song Jiaming 2022-10-08 10:22:32 +08:00 committed by GitHub
parent 936f8d2adb
commit b7be0c8c44
3 changed files with 42 additions and 13 deletions

Vertical Federated Learning (VFL) is a federated machine learning case where multiple data sets share the same sample ID space but differ in feature space.
VFL is supported in BigDL PPML. It allows users to train a federated machine learning model where data features are held by different parties. In BigDL PPML, the following VFL scenarios are supported.
* **Private Set Intersection**: To get the data intersection of different VFL parties.
* **Neural Network Model**: To train a common neural network model with a PyTorch or TensorFlow backend across VFL parties.
* **FGBoost Model**: To train a gradient boosted decision tree (GBDT) model across multiple VFL parties.
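The vertical split behind these scenarios can be sketched in plain Python: two parties hold different features for the same sample IDs, each computes a partial result on its own features, and only the partial results are combined. This is an illustration of the data layout only; the names below are illustrative and are not the BigDL PPML API.

```python
# Minimal sketch of vertical FL: the sample ID space is shared,
# the feature space is split across parties.

def partial_score(features, weights):
    """Each party computes a partial dot product on its own features."""
    return sum(f * w for f, w in zip(features, weights))

# Same sample "id_1", features split across two parties.
party_a = {"id_1": [0.5, 1.0]}   # party A holds features 0-1
party_b = {"id_1": [2.0, -1.0]}  # party B holds features 2-3

w_a, w_b = [0.2, 0.4], [0.1, 0.3]  # each party keeps its own weights

# Only partial scores are combined; raw features never leave a party.
score = partial_score(party_a["id_1"], w_a) + partial_score(party_b["id_1"], w_b)
print(score)
```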
## Quick Start Examples
For each scenario, a quick start example is available at the following links.
* [Private Set Intersection](https://github.com/intel-analytics/BigDL/blob/main/python/ppml/example/psi/psi-tutorial.md): A PSI example of getting the intersection of two parties' data
* [Pytorch Neural Network Model](https://github.com/intel-analytics/BigDL/blob/main/python/ppml/example/pytorch_nn_lr/pytorch-nn-lr-tutorial.md): A PyTorch-based Logistic Regression application with two parties
* [FGBoost Model](https://github.com/intel-analytics/BigDL/blob/main/python/ppml/example/fgboost_regression/fgboost-tutorial.md): A federated Gradient Boosted Regression Tree application with two parties
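As a reference point for the PSI example, the result a PSI run produces is simply the set intersection of the two parties' sample IDs. The sketch below computes that result naively; a real PSI protocol reaches the same output cryptographically, without either party revealing the IDs outside the intersection.

```python
# What PSI computes (illustration only, not a private protocol):
# the common sample IDs of two parties.

party_a_ids = {"u1", "u2", "u3", "u5"}
party_b_ids = {"u2", "u3", "u4"}

intersection = sorted(party_a_ids & party_b_ids)
print(intersection)  # ['u2', 'u3']
```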
## Key Concepts
An **FL Server** is a gRPC server that handles requests from FL Clients. An **FL Client** is a gRPC client that sends requests to the FL Server. These requests include:
* the serialized model to use in training on the FL Server
* model-related instances, e.g. the loss function and optimizer
* the tensors that the FL Server and FL Client exchange, e.g. prediction output, labels, gradients
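To make the three request kinds above concrete, the sketch below models them as plain Python payloads. The field names and request types here are illustrative placeholders, not the actual BigDL PPML gRPC message schema.

```python
import pickle

# Hypothetical payloads for the three kinds of client requests.
upload_model = {
    "type": "UPLOAD_MODEL",
    "model_bytes": pickle.dumps({"layer1": [0.1, 0.2]}),  # serialized model
}
upload_meta = {
    "type": "UPLOAD_META",
    "loss_fn": "binary_cross_entropy",  # model-related instances
    "optimizer": "sgd",
}
upload_tensor = {
    "type": "UPLOAD_TENSOR",
    "name": "predict_output",  # tensors exchanged during training
    "data": [0.73, 0.12],
}

for req in (upload_model, upload_meta, upload_tensor):
    print(req["type"])
```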
## System Architecture
The high-level architecture is shown in the diagram below. This includes the components of the BigDL PPML FL and [SGX](https://www.intel.com/content/www/us/en/developer/tools/software-guard-extensions/overview.html) for Privacy Preservation.
An **FL Context** is a singleton holding an FL Client instance. By default, only one instance is held in an FL application, and the gRPC channel in this singleton instance can be reused across multiple algorithms.
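The singleton-with-a-shared-channel idea can be sketched as follows. The class and method names are illustrative, not the BigDL PPML API; the point is only that repeated initialization returns the same client, so every algorithm reuses one connection.

```python
# Sketch of an FL Context: one process-wide client shared by all algorithms.

class FLContext:
    _client = None  # the single client instance held by the context

    @classmethod
    def init(cls, target="localhost:8980"):
        if cls._client is None:  # create the client only once
            cls._client = {"target": target}
        return cls._client

# Two algorithms asking for a client get the same underlying instance.
c1 = FLContext.init()
c2 = FLContext.init("other:9999")  # ignored: the singleton already exists
assert c1 is c2
print(c1["target"])  # localhost:8980
```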
![](../images/fl_architecture.png)
## Next steps
For detailed usage of BigDL PPML VFL, please see the [User Guide](user_guide.md).
## Lifecycle
## Fault Tolerance

# BigDL PPML VFL User Guide
## Deployment
### SGX
The FL Server is protected by SGX. Please see the [PPML Prerequisite](https://github.com/intel-analytics/BigDL/blob/main/docs/readthedocs/source/doc/PPML/Overview/ppml.md#21-prerequisite) to get the SGX environment ready.
### FL Server
You can set the configuration of the FL Server by editing `ppml-conf.yaml`.
#### Configuration
##### clientNum
An integer, the total number of clients in this FL application.
##### serverPort
An integer, the port used by the FL Server.
##### privateKeyFilePath
A string, the file path of the TLS private key.
##### certChainFilePath
A string, the file path of the TLS certificate chain.
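Putting the four options together, a `ppml-conf.yaml` might look like the fragment below. The keys come from the options documented above; the values, including the key and certificate paths, are placeholders for your own deployment.

```yaml
# Example ppml-conf.yaml (placeholder values)
clientNum: 2
serverPort: 8980
privateKeyFilePath: /ppml/keys/server.pem
certChainFilePath: /ppml/keys/server.crt
```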
#### Start
You can run FL Server in SGX with the following command:
```bash
docker exec -it YOUR_DOCKER bash /ppml/trusted-big-data-ml/work/start-scripts/start-python-fl-server-sgx.sh -p 8980 -c 2
```
You can set the port with `-p` and the client number with `-c`; the defaults are `port=8980` and `client-num=2`.
## Programming Guide
Once the FL Server is deployed, you can write the client code and start your FL application.
See the [examples](overview.md#quick-start-examples) in the overview for basic usage of the APIs.
You can check the [API Doc]() for more details.
