10/11/17

Upon introducing multiple streams for our convnet regressor, the architectural possibilities really explode. Suppose we add a third stream, so that each stream is fed by one of three consecutive frames. We'd like to trade our fully connected layers for a couple of RNN layers, hopefully incorporating more temporal context into the predictions.

The necessary changes can be summarized as follows:

- Modify DataLoader to pack sequences of consecutive images
- Add placeholders, one for each time step
- Modify feed_dict to reflect new placeholders
- tf.stack the flattened convnet codes, giving a time-major tensor of shape [time, batch, features]
- Transpose the stacked codes with perm=[1, 0, 2] to the batch-major [batch, time, features] layout dynamic_rnn expects by default
- Define the RNN layers, wrap them in a MultiRNNCell, and get outputs/states from dynamic_rnn
- tf.concat the per-layer final states along axis 1
- Apply a dense layer to the concatenated states to produce the regression output
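The tensor bookkeeping in the middle steps can be sketched in numpy (shapes only; the real graph uses the TensorFlow ops named above, and the feature size of 128, a batch of 4, and the 2-layer / 8-unit RNN are assumptions for illustration):

```python
import numpy as np

batch, time_steps, feat = 4, 3, 128  # assumed sizes: 3 consecutive frames, 128-dim convnet codes
hidden = 8                           # assumed RNN units per layer
num_layers = 2

# One flattened convnet code per time step, each of shape (batch, feat)
codes = [np.random.randn(batch, feat) for _ in range(time_steps)]

# tf.stack over the list gives a time-major tensor: (time, batch, feat)
stacked = np.stack(codes, axis=0)
assert stacked.shape == (time_steps, batch, feat)

# transpose with perm=[1, 0, 2] makes it batch-major, which dynamic_rnn
# expects when time_major=False (the default)
batch_major = np.transpose(stacked, (1, 0, 2))
assert batch_major.shape == (batch, time_steps, feat)

# dynamic_rnn over a MultiRNNCell returns one final state per layer,
# each of shape (batch, hidden); faked here just to show the shapes
states = [np.random.randn(batch, hidden) for _ in range(num_layers)]

# tf.concat(states, axis=1) merges the per-layer states into one feature vector
concat_states = np.concatenate(states, axis=1)
assert concat_states.shape == (batch, num_layers * hidden)

# the dense output layer maps the concatenated states to a scalar prediction
W = np.random.randn(num_layers * hidden, 1)
b = np.zeros(1)
pred = concat_states @ W + b
assert pred.shape == (batch, 1)
```

Getting the transpose right matters: feeding the time-major tensor straight into a batch-major dynamic_rnn silently mixes batch and time axes.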

Trying 2 hidden RNN layers with 8 units each, I quickly reach a very low MSE.
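For scale, this recurrent head is tiny. Assuming vanilla RNN cells (the notes don't name the cell type) and 128-dim convnet codes (also an assumption), the parameter count works out to:

```python
# A basic RNN cell computes h_t = tanh(W [x_t; h_{t-1}] + b), so each layer
# has units * (input_dim + units) weights plus units biases.
feat, units = 128, 8                      # 128-dim convnet codes are an assumption

layer1 = units * (feat + units) + units   # 8 * 136 + 8 = 1096
layer2 = units * (units + units) + units  # 8 * 16 + 8  = 136
dense = (2 * units) * 1 + 1               # concatenated states -> scalar output
total = layer1 + layer2 + dense
print(total)  # 1249
```

An LSTM or GRU cell would multiply the per-layer counts by 4 or 3 respectively, but the head would still be small next to the convolutional streams.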

Here is a plot of predicted vs. actual values on the validation set.

And here are the residuals for comparison.