[PPML] Refine readthedoc lines and space (#6509)

* Remove empty lines
* Use same space for indent
* Add empty line at the end
Qiyuan Gong 2022-11-09 13:41:21 +08:00 committed by GitHub
parent 4f3c27bde0
commit 17fb75f8d7
6 changed files with 59 additions and 65 deletions


@@ -58,4 +58,4 @@ to
```
CMD=("${CMD[0]}" -agentlib:jdwp=transport=dt_socket,server=y,address=4000,suspend=n "${CMD[@]:1:$LAST}")
```
Then, in IDEA, create a remote JVM Run Configuration; IDEA will generate the VM arguments automatically.


@@ -77,13 +77,13 @@ In your BigDL PPML container, you can run `/ppml/trusted-big-data-ml/azure/creat
Note: Use the same VNet information as your client to create the AKS cluster, and use a DC-Series VM size (e.g. Standard_DC8ds_v3).
```bash
/ppml/trusted-big-data-ml/azure/create-aks.sh \
    --resource-group myResourceGroup \
    --vnet-resource-group myVnetResourceGroup \
    --vnet-name myVnetName \
    --subnet-name mySubnetName \
    --cluster-name myAKSName \
    --vm-size myAKSNodeVMSize \
    --node-count myAKSInitNodeCount
```
You can check the information by running:
@@ -313,20 +313,20 @@ DATA_KEY_PATH=
secure_password=`az keyvault secret show --name "key-pass" --vault-name $KEY_VAULT_NAME --query "value" | sed -e 's/^"//' -e 's/"$//'`
bash bigdl-ppml-submit.sh \
    --master $RUNTIME_SPARK_MASTER \
    --deploy-mode client \
    --sgx-enabled true \
    --sgx-log-level error \
    --sgx-driver-memory 4g \
    --sgx-driver-jvm-memory 2g \
    --sgx-executor-memory 16g \
    --sgx-executor-jvm-memory 7g \
    --driver-memory 8g \
    --driver-cores 4 \
    --executor-memory 18g \
    --executor-cores 4 \
    --num-executors 2 \
    --conf spark.cores.max=8 \
    --name spark-decrypt-sgx \
    --conf spark.kubernetes.container.image=myContainerRegistry.azurecr.io/intel_corporation/bigdl-ppml-trusted-big-data-ml-python-graphene:$BIGDL_VERSION \
    --conf spark.kubernetes.driver.podTemplateFile=/ppml/trusted-big-data-ml/azure/spark-driver-template-az.yaml \
@@ -343,7 +343,6 @@ bash bigdl-ppml-submit.sh \
    --verbose \
    $SPARK_EXTRA_JAR_PATH \
    $ARGS
```
## 4. Run TPC-H example
@@ -375,32 +374,34 @@ will generate roughly 10GB of input data.
Generate a primary key and a data key, then save them to the file system.
Example code for generating the primary key and data key:
```bash
BIGDL_VERSION=2.1.0
java -cp "/ppml/trusted-big-data-ml/work/bigdl-$BIGDL_VERSION/jars/*:/ppml/trusted-big-data-ml/work/spark-3.1.2/conf/:/ppml/trusted-big-data-ml/work/spark-3.1.2/jars/*" \
    -Xmx10g \
    com.intel.analytics.bigdl.ppml.examples.GenerateKeys \
    --kmsType AzureKeyManagementService \
    --vaultName xxx \
    --primaryKeyPath xxx/keys/primaryKey \
    --dataKeyPath xxx/keys/dataKey
```
### 4.3 Encrypt Data
Encrypt data with the specified BigDL `AzureKeyManagementService`.
Example code for encrypting data:
```bash
BIGDL_VERSION=2.1.0
java -cp "/ppml/trusted-big-data-ml/work/bigdl-$BIGDL_VERSION/jars/*:/ppml/trusted-big-data-ml/work/spark-3.1.2/conf/:/ppml/trusted-big-data-ml/work/spark-3.1.2/jars/*" \
    -Xmx10g \
    com.intel.analytics.bigdl.ppml.examples.tpch.EncryptFiles \
    --kmsType AzureKeyManagementService \
    --vaultName xxx \
    --primaryKeyPath xxx/keys/primaryKey \
    --dataKeyPath xxx/keys/dataKey \
    --inputPath xxx/dbgen \
    --outputPath xxx/dbgen-encrypted
```
After encryption, you may upload the encrypted data to Azure Data Lake Store.
@@ -417,7 +418,7 @@ location of the input data and where the output should be saved.
An example script to run a query:
```bash
export RUNTIME_DRIVER_MEMORY=8g
export RUNTIME_DRIVER_PORT=54321
@@ -435,20 +436,20 @@ INPUT_DIR=xxx/dbgen-encrypted
OUTPUT_DIR=xxx/output
bash bigdl-ppml-submit.sh \
    --master $RUNTIME_SPARK_MASTER \
    --deploy-mode client \
    --sgx-enabled true \
    --sgx-log-level error \
    --sgx-driver-memory 4g \
    --sgx-driver-jvm-memory 2g \
    --sgx-executor-memory 16g \
    --sgx-executor-jvm-memory 7g \
    --driver-memory 8g \
    --driver-cores 4 \
    --executor-memory 18g \
    --executor-cores 4 \
    --num-executors 2 \
    --conf spark.cores.max=8 \
    --name spark-tpch-sgx \
    --conf spark.kubernetes.container.image=myContainerRegistry.azurecr.io/intel_corporation/bigdl-ppml-trusted-big-data-ml-python-graphene:$BIGDL_VERSION \
    --conf spark.kubernetes.driver.podTemplateFile=/ppml/trusted-big-data-ml/azure/spark-driver-template-az.yaml \
@@ -472,11 +473,3 @@ bash bigdl-ppml-submit.sh \
INPUT_DIR is the TPC-H data directory.
OUTPUT_DIR is the directory to write the query results to.
The optional parameter [QUERY] is the number of the query to run, e.g. 1, 2, ..., 22.


@@ -89,4 +89,4 @@ The result should look something like this:
> Pi is roughly 3.146760
<br />


@@ -23,6 +23,7 @@ You can find more details in [Intel SGX Developer Guide](https://download.01.org
```eval_rst
.. mermaid::
    graph LR
        subgraph SGX enclave
            MRENCLAVE(fa:fa-file-signature MRENCLAVE)


@@ -172,4 +172,4 @@ When the job is done, you can decrypt and read result of the job. More details i
## Video Demo
<video src="https://user-images.githubusercontent.com/61072813/184758643-821026c3-40e0-4d4c-bcd3-8a516c55fc01.mp4" width="100%" controls></video>


@@ -202,4 +202,4 @@
The optional parameter [QUERY] is the number of the query to run, e.g. 1, 2, ..., 22.
The result is in OUTPUT_DIR. There should be a file called TIMES.TXT with content formatted like:
>Q01 39.80204010