# Regularizer
## L1 Regularizer ##

**Scala:**
```scala
val l1Regularizer = L1Regularizer(rate)
```
**Python:**
```python
regularizerl1 = L1Regularizer(rate)
```

The L1 regularizer penalizes the magnitude of the weights by adding a term to gradWeight, which helps avoid overfitting.

In our code implementation, gradWeight = gradWeight + alpha * sign(weight), where alpha is the regularization rate.

For more details, please refer to [wiki](https://en.wikipedia.org/wiki/Regularization_(mathematics)).
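To see what the L1 gradient update does numerically, here is a minimal NumPy sketch (illustrative only, not BigDL code): the gradient of the penalty alpha * |w| is alpha * sign(w), so that term is added to the existing gradient.

```python
import numpy as np

alpha = 0.2  # regularization rate

weight = np.array([0.5, -0.3, 0.0, 1.2])
gradWeight = np.array([0.1, 0.1, 0.1, 0.1])

# L1 adds alpha * sign(weight) to the existing weight gradient
gradWeight = gradWeight + alpha * np.sign(weight)
# gradWeight is now [0.3, -0.1, 0.1, 0.3] (up to float rounding)
```

Note that the zero weight receives no penalty, and negative weights are pushed toward zero rather than made more negative.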
**Scala example:**
```scala
import com.intel.analytics.bigdl.dllib.utils.RandomGenerator.RNG
import com.intel.analytics.bigdl.dllib.tensor._
import com.intel.analytics.bigdl.dllib.optim._
import com.intel.analytics.bigdl.numeric.NumericFloat
import com.intel.analytics.bigdl.dllib.nn._

RNG.setSeed(100)

val input = Tensor(3, 5).rand
val gradOutput = Tensor(3, 5).rand
val linear = Linear(5, 5, wRegularizer = L1Regularizer(0.2), bRegularizer = L1Regularizer(0.2))

val output = linear.forward(input)
val gradInput = linear.backward(input, gradOutput)

scala> input
input: com.intel.analytics.bigdl.tensor.Tensor[Float] =
0.54340494 0.67115563 0.2783694 0.4120464 0.4245176
0.52638245 0.84477615 0.14860484 0.004718862 0.15671109
0.12156912 0.18646719 0.67074907 0.21010774 0.82585275
[com.intel.analytics.bigdl.tensor.DenseTensor$mcF$sp of size 3x5]

scala> gradOutput
gradOutput: com.intel.analytics.bigdl.tensor.Tensor[Float] =
0.4527399 0.13670659 0.87014264 0.5750933 0.063681036
0.89132196 0.62431186 0.20920213 0.52334774 0.18532822
0.5622963 0.10837689 0.0058171963 0.21969749 0.3074232
[com.intel.analytics.bigdl.tensor.DenseTensor$mcF$sp of size 3x5]

scala> linear.gradWeight
res2: com.intel.analytics.bigdl.tensor.Tensor[Float] =
0.9835552 1.3616763 0.83564335 0.108898684 0.59625006
0.21608911 0.8393639 0.0035243928 -0.11795368 0.4453743
0.38366735 0.9618148 0.47721142 0.5607486 0.6069793
0.81469804 0.6690552 0.18522228 0.08559488 0.7075894
-0.030468717 0.056625083 0.051471338 0.2917061 0.109963015
[com.intel.analytics.bigdl.tensor.DenseTensor of size 5x5]
```
**Python example:**
```python
import numpy as np

from bigdl.dllib.nn.layer import *
from bigdl.dllib.nn.criterion import *
from bigdl.dllib.optim.optimizer import *
from bigdl.dllib.util.common import *

input = np.random.uniform(0, 1, (3, 5)).astype("float32")
gradOutput = np.random.uniform(0, 1, (3, 5)).astype("float32")
linear = Linear(5, 5, wRegularizer = L1Regularizer(0.2), bRegularizer = L1Regularizer(0.2))
output = linear.forward(input)
gradInput = linear.backward(input, gradOutput)

> linear.parameters()
{u'Linear@596d857b': {u'bias': array([ 0.3185505 , -0.02004393, 0.34620118, -0.09206461, 0.40776938], dtype=float32),
u'gradBias': array([ 2.14087653, 1.82181644, 1.90674937, 1.37307787, 0.81534696], dtype=float32),
u'gradWeight': array([[ 0.34909648, 0.85083449, 1.44904375, 0.90150446, 0.57136625],
[ 0.3745544 , 0.42218602, 1.53656614, 1.1836741 , 1.00702667],
[ 0.30529332, 0.26813674, 0.85559171, 0.61224306, 0.34721529],
[ 0.22859855, 0.8535381 , 1.19809723, 1.37248564, 0.50041491],
[ 0.36197871, 0.03069445, 0.64837945, 0.12765063, 0.12872688]], dtype=float32),
u'weight': array([[-0.12423037, 0.35694697, 0.39038274, -0.34970999, -0.08283543],
[-0.4186025 , -0.33235055, 0.34948507, 0.39953214, 0.16294235],
[-0.25171402, -0.28955361, -0.32243955, -0.19771226, -0.29320192],
[-0.39263198, 0.37766701, 0.14673658, 0.24882999, -0.0779015 ],
[ 0.0323218 , -0.31266898, 0.31543773, -0.0898933 , -0.33485892]], dtype=float32)}}
```
## L2 Regularizer ##

**Scala:**
```scala
val l2Regularizer = L2Regularizer(rate)
```
**Python:**
```python
regularizerl2 = L2Regularizer(rate)
```

The L2 regularizer penalizes the squared magnitude of the weights by adding a term to gradWeight, which helps avoid overfitting.

In our code implementation, gradWeight = gradWeight + alpha * weight, where alpha is the regularization rate.

For more details, please refer to [wiki](https://en.wikipedia.org/wiki/Regularization_(mathematics)).
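As with L1, the L2 update can be sketched in a few lines of NumPy (illustrative only, not BigDL code): the gradient of the penalty 0.5 * alpha * ||w||^2 is alpha * w, so each weight's gradient is shifted in proportion to the weight itself.

```python
import numpy as np

alpha = 0.2  # regularization rate

weight = np.array([0.5, -0.3, 0.0, 1.2])
gradWeight = np.array([0.1, 0.1, 0.1, 0.1])

# L2 adds alpha * weight (derivative of 0.5 * alpha * ||w||^2) to the gradient
gradWeight = gradWeight + alpha * weight
# gradWeight is now [0.2, 0.04, 0.1, 0.34] (up to float rounding)
```

Unlike L1, the penalty shrinks large weights more strongly than small ones, which is why L2 tends to produce small but dense weights rather than sparse ones.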
**Scala example:**
```scala
import com.intel.analytics.bigdl.dllib.utils.RandomGenerator.RNG
import com.intel.analytics.bigdl.dllib.tensor._
import com.intel.analytics.bigdl.dllib.optim._
import com.intel.analytics.bigdl.numeric.NumericFloat
import com.intel.analytics.bigdl.dllib.nn._

RNG.setSeed(100)

val input = Tensor(3, 5).rand
val gradOutput = Tensor(3, 5).rand
val linear = Linear(5, 5, wRegularizer = L2Regularizer(0.2), bRegularizer = L2Regularizer(0.2))

val output = linear.forward(input)
val gradInput = linear.backward(input, gradOutput)

scala> input
input: com.intel.analytics.bigdl.tensor.Tensor[Float] =
0.54340494 0.67115563 0.2783694 0.4120464 0.4245176
0.52638245 0.84477615 0.14860484 0.004718862 0.15671109
0.12156912 0.18646719 0.67074907 0.21010774 0.82585275
[com.intel.analytics.bigdl.tensor.DenseTensor$mcF$sp of size 3x5]

scala> gradOutput
gradOutput: com.intel.analytics.bigdl.tensor.Tensor[Float] =
0.4527399 0.13670659 0.87014264 0.5750933 0.063681036
0.89132196 0.62431186 0.20920213 0.52334774 0.18532822
0.5622963 0.10837689 0.0058171963 0.21969749 0.3074232
[com.intel.analytics.bigdl.tensor.DenseTensor$mcF$sp of size 3x5]

scala> linear.gradWeight
res0: com.intel.analytics.bigdl.tensor.Tensor[Float] =
1.0329735 0.047239657 0.8979603 0.53614384 1.2781229
0.5621818 0.29772854 0.69706535 0.30559152 0.8352279
1.3044653 0.43065858 0.9896795 0.7435816 1.6003494
0.94218314 0.6793372 0.97101355 0.62892824 1.3458569
0.73134506 0.5975239 0.9109101 0.59374434 1.1656629
[com.intel.analytics.bigdl.tensor.DenseTensor of size 5x5]
```
**Python example:**
```python
import numpy as np

from bigdl.dllib.nn.layer import *
from bigdl.dllib.nn.criterion import *
from bigdl.dllib.optim.optimizer import *
from bigdl.dllib.util.common import *

input = np.random.uniform(0, 1, (3, 5)).astype("float32")
gradOutput = np.random.uniform(0, 1, (3, 5)).astype("float32")
linear = Linear(5, 5, wRegularizer = L2Regularizer(0.2), bRegularizer = L2Regularizer(0.2))
output = linear.forward(input)
gradInput = linear.backward(input, gradOutput)

> linear.parameters()
{u'Linear@787aab5e': {u'bias': array([-0.43960261, -0.12444571, 0.22857292, -0.43216187, 0.27770036], dtype=float32),
u'gradBias': array([ 0.51726723, 1.32883406, 0.57567948, 1.7791357 , 1.2887038 ], dtype=float32),
u'gradWeight': array([[ 0.45477036, 0.22262168, 0.21923628, 0.26152173, 0.19836383],
[ 1.12261093, 0.72921795, 0.08405925, 0.78192139, 0.48798928],
[ 0.34581488, 0.21195598, 0.26357424, 0.18987852, 0.2465664 ],
[ 1.18659711, 1.11271608, 0.72589797, 1.19098675, 0.33769298],
[ 0.82314551, 0.71177536, 0.4428404 , 0.764337 , 0.3500182 ]], dtype=float32),
u'weight': array([[ 0.03727285, -0.39697152, 0.42733836, -0.34291714, -0.13833708],
[ 0.09232076, -0.09720675, -0.33625153, 0.06477787, -0.34739712],
[ 0.17145753, 0.10128133, 0.16679128, -0.33541158, 0.40437087],
[-0.03005157, -0.36412898, 0.0629965 , 0.13443278, -0.38414535],
[-0.16630849, 0.06934392, 0.40328237, 0.22299488, -0.1178569 ]], dtype=float32)}}
```
## L1L2 Regularizer ##

**Scala:**
```scala
val l1l2Regularizer = L1L2Regularizer(l1rate, l2rate)
```
**Python:**
```python
regularizerl1l2 = L1L2Regularizer(l1rate, l2rate)
```

The L1L2 regularizer combines the L1 and L2 penalties, adding both terms to gradWeight to help avoid overfitting.

In our code implementation, the L1 regularizer and the L2 regularizer are applied sequentially.

For more details, please refer to [wiki](https://en.wikipedia.org/wiki/Regularization_(mathematics)).
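Applying the two regularizers sequentially simply means both gradient terms accumulate on the same gradWeight. A minimal NumPy sketch of the combined update (illustrative only, not BigDL code):

```python
import numpy as np

l1rate, l2rate = 0.2, 0.2  # the two regularization rates

weight = np.array([0.5, -0.3, 0.0, 1.2])
gradWeight = np.array([0.1, 0.1, 0.1, 0.1])

# apply the L1 term first, then the L2 term, on the same gradient buffer
gradWeight = gradWeight + l1rate * np.sign(weight)
gradWeight = gradWeight + l2rate * weight
# gradWeight is now [0.4, -0.16, 0.1, 0.54] (up to float rounding)
```

Because addition is what accumulates, the order of the two steps does not affect the final gradient; this matches the elastic-net style of combining L1 and L2 penalties.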
**Scala example:**
```scala
import com.intel.analytics.bigdl.dllib.utils.RandomGenerator.RNG
import com.intel.analytics.bigdl.dllib.tensor._
import com.intel.analytics.bigdl.dllib.optim._
import com.intel.analytics.bigdl.numeric.NumericFloat
import com.intel.analytics.bigdl.dllib.nn._

RNG.setSeed(100)

val input = Tensor(3, 5).rand
val gradOutput = Tensor(3, 5).rand
val linear = Linear(5, 5, wRegularizer = L1L2Regularizer(0.2, 0.2), bRegularizer = L1L2Regularizer(0.2, 0.2))

val output = linear.forward(input)
val gradInput = linear.backward(input, gradOutput)

scala> input
input: com.intel.analytics.bigdl.tensor.Tensor[Float] =
0.54340494 0.67115563 0.2783694 0.4120464 0.4245176
0.52638245 0.84477615 0.14860484 0.004718862 0.15671109
0.12156912 0.18646719 0.67074907 0.21010774 0.82585275
[com.intel.analytics.bigdl.tensor.DenseTensor$mcF$sp of size 3x5]

scala> gradOutput
gradOutput: com.intel.analytics.bigdl.tensor.Tensor[Float] =
0.4527399 0.13670659 0.87014264 0.5750933 0.063681036
0.89132196 0.62431186 0.20920213 0.52334774 0.18532822
0.5622963 0.10837689 0.0058171963 0.21969749 0.3074232
[com.intel.analytics.bigdl.tensor.DenseTensor$mcF$sp of size 3x5]

scala> linear.gradWeight
res1: com.intel.analytics.bigdl.tensor.Tensor[Float] =
1.069174 1.4422078 0.8913989 0.042112567 0.53756505
0.14077617 0.8959319 -0.030221784 -0.1583686 0.4690558
0.37145022 0.99747723 0.5559263 0.58614403 0.66380215
0.88983417 0.639738 0.14924419 0.027530536 0.71988696
-0.053217214 -8.643427E-4 -0.036953792 0.29753304 0.06567569
[com.intel.analytics.bigdl.tensor.DenseTensor of size 5x5]
```
**Python example:**
```python
import numpy as np

from bigdl.dllib.nn.layer import *
from bigdl.dllib.nn.criterion import *
from bigdl.dllib.optim.optimizer import *
from bigdl.dllib.util.common import *

input = np.random.uniform(0, 1, (3, 5)).astype("float32")
gradOutput = np.random.uniform(0, 1, (3, 5)).astype("float32")
linear = Linear(5, 5, wRegularizer = L1L2Regularizer(0.2, 0.2), bRegularizer = L1L2Regularizer(0.2, 0.2))
output = linear.forward(input)
gradInput = linear.backward(input, gradOutput)

> linear.parameters()
{u'Linear@1356aa91': {u'bias': array([-0.05799473, -0.0548001 , 0.00408955, -0.22004321, -0.07143869], dtype=float32),
u'gradBias': array([ 0.89119786, 1.09953558, 1.03394508, 1.19511735, 2.02241182], dtype=float32),
u'gradWeight': array([[ 0.89061081, 0.58810186, -0.10087357, 0.19108151, 0.60029608],
[ 0.95275503, 0.2333075 , 0.46897018, 0.74429053, 1.16038764],
[ 0.22894514, 0.60031962, 0.3836292 , 0.15895618, 0.83136207],
[ 0.49079862, 0.80913013, 0.55491877, 0.69608945, 0.80458677],
[ 0.98890561, 0.49226439, 0.14861123, 1.37666655, 1.47615671]], dtype=float32),
u'weight': array([[ 0.44654208, 0.16320795, -0.36029238, -0.25365737, -0.41974261],
[ 0.18809238, -0.28065765, 0.27677274, -0.29904234, 0.41338971],
[-0.03731538, 0.22493915, 0.10021331, -0.19495697, 0.25470355],
[-0.30836752, 0.12083009, 0.3773002 , 0.24059358, -0.40325543],
[-0.13601269, -0.39310011, -0.05292636, 0.20001481, -0.08444868]], dtype=float32)}}
```