update (#5362)

parent 19e6099483 · commit c8dd8d6e5e · 1 changed file with 8 additions and 52 deletions

@@ -32,18 +32,20 @@ Create Linux VM through Azure [CLI](https://docs.microsoft.com/en-us/azure/devel
For the VM size, please choose a DC-V3 series VM with more than 4 vCPU cores.
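For reference, a minimal sketch of creating such a VM with the Azure CLI; the resource group, VM name, image alias, and size below are illustrative placeholders (`Standard_DC4s_v3` is one DCsv3-series size with 4 vCPUs):

```bash
# Sketch: create a DCsv3-series VM (all names below are placeholders)
az vm create \
  --resource-group myResourceGroup \
  --name myPPMLClientVM \
  --image Ubuntu2204 \
  --size Standard_DC4s_v3 \
  --admin-username azureuser \
  --generate-ssh-keys
```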
#### 2.2.3 Pull BigDL PPML image and run on Linux client

* Go to Azure Marketplace, search for "BigDL PPML" and find the `BigDL PPML: Secure Big Data AI on Intel SGX` product. Click the "Create" button, which will lead you to the `Subscribe` page.
On the `Subscribe` page, input your subscription, your Azure container registry, your resource group, and your location. Then click `Subscribe` to subscribe BigDL PPML to your container registry.

* Go to your Azure container registry, check `Repositories`, and find `intel_corporation/bigdl-ppml-trusted-big-data-ml-python-graphene`
* Log in to the created VM. Then log in to your Azure container registry and pull the BigDL PPML image with the following command:

```bash
docker pull myContainerRegistry/intel_corporation/bigdl-ppml-trusted-big-data-ml-python-graphene
```
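If the pull fails with an authentication error, you can first log in to the registry through the Azure CLI (assuming it is installed on the VM; the registry name is a placeholder):

```bash
# Authenticate the local Docker daemon against the Azure container registry
az acr login --name myContainerRegistry
```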
* Start a container from this image:

```bash
#!/bin/bash

export LOCAL_IP=YOUR_LOCAL_IP
export DOCKER_IMAGE=intelanalytics/bigdl-ppml-trusted-big-data-ml-python-graphene

sudo docker run -itd \
    --privileged \
```
@@ -110,8 +112,8 @@ az storage fs directory upload -f myFS --account-name myDataLakeAccount -s "path

### 2.4.2 Access data in Hadoop through the ABFS (Azure Blob Filesystem) driver
You can access Data Lake Storage in the Hadoop filesystem through a URI of the form: ```abfs[s]://file_system@account_name.dfs.core.windows.net/<path>/<path>/<file_name>```
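For example, listing a directory over ABFS with the Hadoop CLI (the filesystem, account, and path below are placeholders):

```bash
# List a Data Lake Storage Gen2 directory through the ABFS driver (placeholders)
hadoop fs -ls "abfss://myFS@myDataLakeAccount.dfs.core.windows.net/data/"
```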
#### Authentication

The ABFS driver supports two forms of authentication so that the Hadoop application may securely access resources contained within a Data Lake Storage Gen2 capable account.

- Shared Key: This permits users access to ALL resources in the account. The key is encrypted and stored in the Hadoop configuration.

- Azure Active Directory OAuth Bearer Token: Azure AD bearer tokens are acquired and refreshed by the driver using either the identity of the end user or a configured Service Principal. Using this authentication model, all access is authorized on a per-call basis using the identity associated with the supplied token and evaluated against the assigned POSIX Access Control List (ACL).
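As a sketch, Shared Key authentication boils down to a single Hadoop property, `fs.azure.account.key.<account_name>.dfs.core.windows.net`, which can also be supplied on the command line (the account name, key, and paths are placeholders):

```bash
# Shared Key authentication: supply the storage account key as a Hadoop property
hadoop fs \
  -D fs.azure.account.key.myDataLakeAccount.dfs.core.windows.net=<account-access-key> \
  -ls "abfss://myFS@myDataLakeAccount.dfs.core.windows.net/"
```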
@@ -175,54 +177,8 @@ Take note of principalId of the first line as System Managed Identity of your VM
##### b. Set access policy for AKS VM ScaleSet

Example command:

```bash
az keyvault set-policy --name myKeyVault --object-id <systemManagedIdentityOfVMSS> --secret-permissions get --key-permissions all
```
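If you still need to look up that object id, one way to query the scale set's system managed identity (the node resource group and scale set name are placeholders) is:

```bash
# Query the system managed identity (principalId) of the AKS VM scale set
az vmss show --resource-group MC_myResourceGroup_myAKSCluster_myRegion \
  --name myAKSVMSS --query identity.principalId -o tsv
```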
#### 2.5.3.2 Set access for AKS

##### a. Enable Azure Key Vault Provider for Secrets Store CSI Driver support

Example command:

```bash
az aks enable-addons --addons azure-keyvault-secrets-provider --name myAKSCluster --resource-group myResourceGroup
```
* Verify the Azure Key Vault Provider for Secrets Store CSI Driver installation

Example command:

```bash
kubectl get pods -n kube-system -l 'app in (secrets-store-csi-driver, secrets-store-provider-azure)'
```

Be sure that a Secrets Store CSI Driver pod and an Azure Key Vault Provider pod are running on each node in your cluster's node pools.
* Enable the Azure Key Vault Provider for Secrets Store CSI Driver to track secret updates in the key vault

```bash
az aks update -g myResourceGroup -n myAKSCluster --enable-secret-rotation
```
##### b. Provide an identity to access the Azure Key Vault

There are several ways to provide an identity for the Azure Key Vault Provider for Secrets Store CSI Driver to access Azure Key Vault: an `Azure Active Directory pod identity`, a `user-assigned managed identity`, or a `system-assigned managed identity`. In our solution, we use a user-assigned managed identity.
* Enable managed identity in AKS

```bash
az aks update -g myResourceGroup -n myAKSCluster --enable-managed-identity
```
* Get the user-assigned managed identity that was created when you enabled the managed identity on your AKS cluster

Run:

```bash
az aks show -g myResourceGroup -n myAKSCluster --query addonProfiles.azureKeyvaultSecretsProvider.identity.clientId -o tsv
```

The output would look like:

```bash
f95519c1-3fe8-441b-a7b9-368d5e13b534
```

Take note of this output as the user-assigned managed identity of your Azure KeyVault Secrets Provider.
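If you prefer to capture it directly in a shell variable for the commands that follow (the variable name is illustrative):

```bash
# Store the Secrets Provider client id for later use in set-policy commands
export SECRETS_PROVIDER_CLIENT_ID=$(az aks show -g myResourceGroup -n myAKSCluster \
  --query addonProfiles.azureKeyvaultSecretsProvider.identity.clientId -o tsv)
```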
* Grant your user-assigned managed identity permissions that enable it to read your key vault and view its contents

Example command:

```bash
az keyvault set-policy -n myKeyVault --secret-permissions get --spn f95519c1-3fe8-441b-a7b9-368d5e13b534
```
##### c. Create a SecretProviderClass to access your Key Vault

On your client docker container, edit the `/ppml/trusted-big-data-ml/azure/secretProviderClass.yaml` file: change `<client-id>` to the user-assigned managed identity of your Azure KeyVault Secrets Provider, and change `<key-vault-name>` and `<tenant-id>` to your real key vault name and tenant id.
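A minimal sketch of those edits with `sed`, assuming the file contains exactly the `<client-id>`, `<key-vault-name>`, and `<tenant-id>` markers (all three replacement values below are placeholders):

```bash
# Fill the SecretProviderClass template in place (replacement values are placeholders)
sed -i \
  -e 's/<client-id>/f95519c1-3fe8-441b-a7b9-368d5e13b534/' \
  -e 's/<key-vault-name>/myKeyVault/' \
  -e 's/<tenant-id>/myTenantId/' \
  /ppml/trusted-big-data-ml/azure/secretProviderClass.yaml
```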
Then run:

```bash
kubectl apply -f /ppml/trusted-big-data-ml/azure/secretProviderClass.yaml
```

to create the SecretProviderClass in your AKS cluster.
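You can then confirm that the resource exists (the namespace may differ in your setup):

```bash
# Verify the SecretProviderClass was created
kubectl get secretproviderclass
```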
## 3. Run Spark PPML jobs

Log in to your client VM and enter your BigDL PPML container:
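One typical way to enter a running container (the container name here is a placeholder, not necessarily the name used by the PPML scripts):

```bash
# Attach an interactive shell to the running PPML container (name is a placeholder)
sudo docker exec -it bigdl-ppml-client bash
```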