# Constrained de-noising AutoEncoder

This post discusses an implementation of a simple *constrained de-noising autoencoder* using TensorFlow. Here is the implementation.
An autoencoder is an *unsupervised* learning model. It takes some input, runs it through an “encoder”
to get *encodings* of the input, and then attempts to reconstruct the original input based only on the obtained encodings.
The autoencoder is called *constrained* if its decoder uses the transposed matrices from the encoder (instead of learning them from scratch).
It is called *de-noising* if, during training, it randomly sets parts of its input to 0 but still attempts to reconstruct the original, uncorrupted input.
This can help with over-fitting and, potentially, with capturing only the most important parts of the data.
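The corruption step can be illustrated with a framework-free sketch (plain NumPy, not code from the repository): each input component is zeroed independently with probability `noise_level`, while the reconstruction target stays clean.

```python
import numpy as np

rng = np.random.default_rng(0)

def corrupt(x, noise_level):
    # Zero each input component independently with probability noise_level.
    # The loss is still computed against the clean x, so the network must
    # learn to fill in the missing parts.
    mask = rng.random(x.shape) >= noise_level
    return x * mask
```

With `noise_level=0.0` the input passes through unchanged, which recovers a plain (non-de-noising) autoencoder.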

The idea is that *encodings* will encode the most important information in the data.

Example run:

```
python ~/repos/autoencoder/autoencoder_use_mnist.py --encoder_network=784,128,10 --noise_level=0.0 --batch_size=64 --num_epochs=60 --logdir=LD_784_128_10_N0
```

Then start Tensorboard:

```
tensorboard --logdir=LD_784_128_10_N0
```

# Comments on implementation

I intentionally wanted to keep it as simple as possible. The class AutoEncoder (in autoencoder.py) is not application-specific.
Currently, the only supported cost function is the *root mean squared error* of the reconstruction:

```
self._loss = tf.sqrt(tf.reduce_mean(tf.square(tf.sub(self._x, self._z))))
```

Notice that it does not contain any regularization terms. You can regularize the model by corrupting the input, i.e. by setting the noise level > 0. You might still find situations where adding explicit regularization terms to the cost function is useful.
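For instance, an L2 weight penalty could be added to the reconstruction loss. A minimal sketch in plain NumPy (the `rmse_with_l2` helper is hypothetical, not part of the repository; `l2=0` recovers the original, unregularized loss):

```python
import numpy as np

def rmse_with_l2(x, z, weights, l2=0.0):
    # Reconstruction loss matching the snippet above (RMSE), plus an
    # optional L2 penalty on the encoding matrices.
    rmse = np.sqrt(np.mean((x - z) ** 2))
    penalty = l2 * sum(np.sum(W ** 2) for W in weights)
    return rmse + penalty
```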

When building an autoencoder model, it is natural to first create the *encoding* part of the network and then add the *decoding* part on top of it.
Since we are building a *constrained* autoencoder, the matrices in the decoder part of the network are not learned, but are simply the
transposes of the corresponding encoding matrices. However, we still need to learn biases for the decoding layers.
Hence, all free parameters of the model are stored in these lists (order is important):

```
self._encoding_matrices = []
self._encoding_biases = []
self._decoding_biases = []
```
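One way these lists could be populated from a layer-size spec such as `784,128,10` is sketched below in plain NumPy (the `init_params` helper and the initialization scheme are assumptions for illustration, not the repository's code). The key point is the shapes: each decoding bias has the *input* size of its encoding layer, because the decoder maps back through the transposed matrix.

```python
import numpy as np

def init_params(layer_sizes, seed=0):
    # One weight matrix and encoding bias per layer; the decoder reuses
    # the transposed matrices, so it only needs its own biases, whose
    # sizes are the *input* sizes of the corresponding encoding layers.
    rng = np.random.default_rng(seed)
    encoding_matrices, encoding_biases, decoding_biases = [], [], []
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        encoding_matrices.append(rng.normal(0.0, 0.01, size=(n_in, n_out)))
        encoding_biases.append(np.zeros(n_out))
        decoding_biases.append(np.zeros(n_in))
    return encoding_matrices, encoding_biases, decoding_biases
```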

The network is fully defined by its activations, weight matrices, and biases in the encoding and decoding layers. All we need to do to build a proper TensorFlow graph is to connect them using operations. We then add the loss function and an optimizer, which we use during training.

```
def encode(self, x):
    inpt = x
    for i in range(0, len(self._encoding_matrices)):
        W = self._encoding_matrices[i]
        b = self._encoding_biases[i]
        logits = tf.nn.bias_add(tf.matmul(inpt, W), b)
        ...

def decode(self, encoding):
    inpt = encoding
    for i in range(len(self._encoding_matrices) - 1, -1, -1):
        Wd = tf.transpose(self._encoding_matrices[i])
        bd = self._decoding_biases[i]
        logits = tf.nn.bias_add(tf.matmul(inpt, Wd), bd)
        ...
```

Note that in the snippet above, the decoder traverses the *encoding* matrices, but in reverse order and uses their transposes.
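A self-contained NumPy sketch of the same round trip (hypothetical `tanh` activations and random initialization, chosen only to make the shape bookkeeping concrete) shows why this works: for the `784,128,10` spec from the example run, the encoding has width 10 and the transposed matrices map it back to width 784.

```python
import numpy as np

rng = np.random.default_rng(0)
sizes = [784, 128, 10]  # same layer spec as the example run above
Ws = [rng.normal(0.0, 0.01, size=(a, b)) for a, b in zip(sizes, sizes[1:])]
enc_bs = [np.zeros(b) for b in sizes[1:]]
dec_bs = [np.zeros(a) for a in sizes[:-1]]

def encode(x):
    h = x
    for W, b in zip(Ws, enc_bs):
        h = np.tanh(h @ W + b)
    return h

def decode(h):
    # Traverse the encoding matrices in reverse and use their transposes.
    z = h
    for W, b in zip(reversed(Ws), reversed(dec_bs)):
        z = np.tanh(z @ W.T + b)
    return z

x = rng.normal(size=(2, 784))
z = decode(encode(x))  # z.shape == (2, 784)
```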

# Example usage:

```
dA = AutoEncoder(FLAGS.encoder_network, FLAGS.noise_level,
True, FLAGS.acitivation_kind, FLAGS.optimizer_kind, FLAGS.learning_rate)
```

and then feed the input `x` for training and eval:

```
feed_dict = {dA.x: batch_data}
tloss, _ = sess.run([dA.loss, dA.train_op], feed_dict=feed_dict)
```
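The snippet above runs once per mini-batch; batching itself is independent of the TensorFlow session. A sketch of a per-epoch batching helper in plain NumPy (the `minibatches` generator is hypothetical, not the repository's code):

```python
import numpy as np

def minibatches(data, batch_size, rng=None):
    # Shuffle once per epoch, then yield contiguous slices; each slice
    # plays the role of batch_data in the feed_dict above.
    rng = rng or np.random.default_rng(0)
    idx = rng.permutation(len(data))
    for start in range(0, len(data), batch_size):
        yield data[idx[start:start + batch_size]]
```

Repeating this loop `num_epochs` times, with one `sess.run` call per yielded batch, reproduces the training schedule of the example run.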

See the post *Compressing MNIST using AutoEncoders* for examples.