DLlib Quickstarts
This guide demonstrates how to use the DLlib Keras-style API and the DLlib NNClassifier for classification.
Step 0: Prepare Environment
We recommend using conda to prepare the environment. Please refer to the install guide for more details.
conda create -n my_env python=3.7 # "my_env" is conda environment name, you can use any name you like.
conda activate my_env
pip install bigdl-dllib
Step 1: Data loading and processing using Spark DataFrame
# `spark` is an active SparkSession; `path` points to the CSV file to load.
df = spark.read.csv(path, sep=',', inferSchema=True).toDF("num_times_pregrant", "plasma_glucose", "blood_pressure", "skin_fold_thickness", "2-hour_insulin", "body_mass_index", "diabetes_pedigree_function", "age", "class")
We process the data using the Spark API and split it into training and validation sets.
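from pyspark.ml.feature import VectorAssembler
from pyspark.sql.functions import lit
from pyspark.sql.types import DoubleType
# Assemble the eight feature columns into a single "features" vector column.
vecAssembler = VectorAssembler(outputCol="features")
vecAssembler.setInputCols(["num_times_pregrant", "plasma_glucose", "blood_pressure", "skin_fold_thickness", "2-hour_insulin", "body_mass_index", "diabetes_pedigree_function", "age"])
train_df = vecAssembler.transform(df)
# Cast "class" to double and add 1 to produce the "label" column.
changedTypedf = train_df.withColumn("label", train_df["class"].cast(DoubleType())+lit(1))\
    .select("features", "label")
(trainingDF, validationDF) = changedTypedf.randomSplit([0.9, 0.1])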
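The label shift and the 90/10 split can be illustrated without Spark. The following is a minimal pure-Python sketch, not the Spark implementation: `shift_labels` and `random_split` are hypothetical helper names, the raw class values are assumed to be 0 and 1, and the split is modeled as one independent random draw per row, so the resulting sizes are only approximately 90% and 10%.

```python
import random

def shift_labels(classes):
    """Mirror `cast(DoubleType()) + lit(1)`: class c becomes the float c + 1."""
    return [float(c) + 1.0 for c in classes]

def random_split(rows, weights, seed):
    """Assign each row to exactly one split, chosen with probability
    proportional to that split's weight (illustrative only)."""
    rng = random.Random(seed)
    total = sum(weights)
    splits = [[] for _ in weights]
    for row in rows:
        draw = rng.random() * total
        upper = 0.0
        for i, w in enumerate(weights):
            upper += w
            if draw < upper:
                splits[i].append(row)
                break
        else:
            # Guard against floating-point rounding at the upper boundary.
            splits[-1].append(row)
    return splits

labels = shift_labels([0, 1, 1, 0])
train, valid = random_split(list(range(1000)), [0.9, 0.1], seed=42)
print(labels)
print(len(train), len(valid))
```

Because each row is drawn independently, two runs with different seeds give slightly different split sizes; fixing the seed makes the split reproducible.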
Step 3: Define the classification model using the DLlib Keras-style API
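from bigdl.dllib.keras.layers import Input, Dense
from bigdl.dllib.keras.models import Model
# Eight input features, two hidden layers, and two output scores (one per class).
x1 = Input(shape=(8,))
dense1 = Dense(12, activation='relu')(x1)
dense2 = Dense(8, activation='relu')(dense1)
dense3 = Dense(2)(dense2)
model = Model(x1, dense3)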
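As a sanity check on the size of this network, the number of trainable parameters can be worked out by hand. The sketch below is plain Python and does not call DLlib; it assumes each Dense layer has a weight matrix plus a bias vector, and `dense_params` is a hypothetical helper name.

```python
def dense_params(n_in, n_out):
    """Weights (n_in * n_out) plus one bias per output unit."""
    return n_in * n_out + n_out

# Layer widths of the model above: 8 inputs -> 12 -> 8 -> 2 outputs.
layer_sizes = [8, 12, 8, 2]
per_layer = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]
total = sum(per_layer)
print(per_layer, total)  # [108, 104, 18] 230
```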
Step 4: Create and fit the NNClassifier
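from bigdl.dllib.nnframes import NNClassifier
from bigdl.dllib.nn.criterion import CrossEntropyCriterion
from bigdl.dllib.optim.optimizer import Adam
# [8] is the feature size expected by the model's input layer.
classifier = NNClassifier(model, CrossEntropyCriterion(), [8]) \
    .setOptimMethod(Adam()) \
    .setBatchSize(32) \
    .setMaxEpoch(150)
nnModel = classifier.fit(trainingDF)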
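The final Dense layer has no activation, and the labels were shifted to start at 1. Both choices are consistent with a Torch-style cross-entropy loss. The sketch below is a pure-Python illustration under that assumption (log-softmax followed by negative log-likelihood, with 1-based class indices); it is not DLlib's implementation, and `cross_entropy` is a hypothetical helper name.

```python
import math

def cross_entropy(logits, target):
    """Cross-entropy of raw scores against a 1-based class index."""
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target - 1]

# Equal scores for two classes give a loss of ln(2).
print(cross_entropy([0.0, 0.0], 1))
# A confident, correct prediction has a lower loss than a confident, wrong one.
print(cross_entropy([5.0, 0.0], 1), cross_entropy([5.0, 0.0], 2))
```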
Step 5: Evaluate the trained model
predictionDF = nnModel.transform(validationDF).cache()
predictionDF.sample(False, 0.1).show()
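from pyspark.ml.evaluation import MulticlassClassificationEvaluator
evaluator = MulticlassClassificationEvaluator(
    labelCol="label", predictionCol="prediction", metricName="accuracy")
accuracy = evaluator.evaluate(predictionDF)
print("Accuracy = %g" % accuracy)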
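The accuracy metric is the fraction of rows whose prediction equals the label. The following pure-Python sketch shows the computation on made-up (label, prediction) pairs; the data and the `accuracy_of` helper are illustrative assumptions, not output from the model above.

```python
def accuracy_of(rows):
    """Fraction of (label, prediction) pairs that match."""
    if not rows:
        raise ValueError("cannot compute accuracy of an empty dataset")
    correct = sum(1 for label, prediction in rows if label == prediction)
    return correct / len(rows)

# Hypothetical (label, prediction) pairs using the 1-based labels from Step 1.
rows = [(1.0, 1.0), (2.0, 2.0), (2.0, 1.0), (1.0, 1.0)]
print(accuracy_of(rows))  # 0.75
```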