PPML FL document (#6017)

This commit is contained in:
Song Jiaming 2022-10-08 10:22:32 +08:00 committed by GitHub
parent 936f8d2adb
commit b7be0c8c44
3 changed files with 42 additions and 13 deletions

Vertical Federated Learning (VFL) is a federated machine learning case where multiple data sets share the same sample ID space but differ in feature space.
VFL is supported in BigDL PPML. It allows users to train a federated machine learning model where data features are held by different parties. The following VFL scenarios are supported:
* **Private Set Intersection**: To get the data intersection of different VFL parties.
* **Neural Network Model**: To train a common neural network model with a Pytorch or Tensorflow backend across VFL parties.
* **FGBoost Model**: To train a gradient boosted decision tree (GBDT) model across multiple VFL parties.
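The feature-split setting above can be pictured with a small, library-agnostic sketch (the sample IDs and column names below are invented for illustration and are not part of the BigDL API): two parties hold the same sample ID space but disjoint feature columns, and only the overlapping IDs can be jointly trained on.

```python
# Toy illustration of vertically partitioned data.
# Party A holds demographic features; Party B holds behavioral features
# and the label. The dict keys are the shared sample IDs.
party_a = {1: {"age": 34, "income": 72000},
           2: {"age": 51, "income": 48000},
           3: {"age": 29, "income": 91000}}
party_b = {1: {"clicks": 12, "label": 1},
           2: {"clicks": 3,  "label": 0},
           4: {"clicks": 30, "label": 1}}

# Same sample ID space, different feature spaces: the VFL data layout.
# Only the IDs present on both sides can be used for joint training,
# which is what Private Set Intersection computes (privately, in PPML).
shared_ids = sorted(set(party_a) & set(party_b))
print(shared_ids)
```

Note that a real PSI protocol computes this intersection cryptographically, without either party revealing its full ID list to the other.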
## Quick Start Examples
For each scenario, a quick start example is available at the following links.
* [Private Set Intersection](https://github.com/intel-analytics/BigDL/blob/main/python/ppml/example/psi/psi-tutorial.md): A PSI example of getting intersection of two parties
* [Pytorch Neural Network Model](https://github.com/intel-analytics/BigDL/blob/main/python/ppml/example/pytorch_nn_lr/pytorch-nn-lr-tutorial.md): A Pytorch-based Logistic Regression application with two parties
* [FGBoost Model](https://github.com/intel-analytics/BigDL/blob/main/python/ppml/example/fgboost_regression/fgboost-tutorial.md): A federated Gradient Boosted Regression Tree application with two parties
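As a rough, library-agnostic illustration of the split-model idea behind the neural-network example (the weights and feature values below are made up; this is not the BigDL API): each party scores its own feature columns locally, and only the partial scores, never the raw features, are aggregated into the final prediction.

```python
import math

def partial_score(features, weights):
    """Each party's local contribution: a dot product over its own columns."""
    return sum(f * w for f, w in zip(features, weights))

# Hypothetical local computations on each party's private features.
a_score = partial_score([34.0, 72.0], [0.01, -0.002])  # party A's columns
b_score = partial_score([12.0], [0.05])                # party B's columns

# Aggregation of the partial scores (conceptually, on the server side)
# through a logistic-regression head to produce the joint prediction.
logit = a_score + b_score
prob = 1.0 / (1.0 + math.exp(-logit))
print(round(prob, 4))
```

The key property sketched here is that raw features never leave their owning party; only intermediate outputs cross the trust boundary.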
## System Architecture
The high-level architecture is shown in the diagram below. It includes the components of BigDL PPML FL and [SGX](https://www.intel.com/content/www/us/en/developer/tools/software-guard-extensions/overview.html) for privacy preservation.

![](../images/fl_architecture.png)
## Next steps
For detailed usage of BigDL PPML VFL, please see the [User Guide](user_guide.md).
## Lifecycle
## Fault Tolerance

# BigDL PPML VFL User Guide
## Deployment
### SGX
The FL Server is protected by SGX. Please see [PPML Prerequisite](https://github.com/intel-analytics/BigDL/blob/main/docs/readthedocs/source/doc/PPML/Overview/ppml.md#21-prerequisite) to get the SGX environment ready.
### FL Server
You can set the configurations of the FL Server by editing `ppml-conf.yaml`.
#### Configuration
##### clientNum
An integer, the total number of clients in this FL application.
##### serverPort
An integer, the port used by the FL Server.
##### privateKeyFilePath
A string, the file path of the TLS private key.
##### certChainFilePath
A string, the file path of the TLS certificate chain.
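Putting the options above together, a `ppml-conf.yaml` might look like the following sketch (the key and certificate paths are illustrative placeholders, not paths mandated by BigDL):

```yaml
# Example ppml-conf.yaml (values are illustrative)
clientNum: 2
serverPort: 8980
privateKeyFilePath: /ppml/keys/server.pem   # placeholder path
certChainFilePath: /ppml/keys/server.crt    # placeholder path
```

The `clientNum` and `serverPort` values shown here match the defaults used by the start script below.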
#### Start
You can run FL Server in SGX with the following command:
```bash
docker exec -it YOUR_DOCKER bash /ppml/trusted-big-data-ml/work/start-scripts/start-python-fl-server-sgx.sh -p 8980 -c 2
```
The `-p` flag sets the port and `-c` sets the client number; the defaults are `port=8980` and `client-num=2`.
## Programming Guide
Once the FL Server deployment is ready, you can write the client code and start your FL application.
See the [examples](overview.md#quick-start-examples) in the overview for basic usage of the APIs.
Check the [API Doc]() for more details.
