Making a smart home less dumb
How smart is a smart home, really?
So my home has a sensor to tell me when mail has been posted, and a camera
pointed at the doorstep to capture whoever is posting it. When mail is
posted, the camera grabs a picture and fires it off to me in a Telegram
message captioned “You’ve got mail!”:
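The notification itself is just a call to the Telegram Bot API’s sendPhoto method. A minimal sketch of building that request (the token and chat ID here are placeholders, and the photo itself would be uploaded as multipart form data alongside these parameters):

```python
import urllib.parse

TELEGRAM_API = "https://api.telegram.org/bot{token}/sendPhoto"

def build_send_photo_request(token, chat_id, caption):
    # sendPhoto takes chat_id and caption parameters; here they are
    # URL-encoded into the query string of the request
    url = TELEGRAM_API.format(token=token)
    params = urllib.parse.urlencode({"chat_id": chat_id, "caption": caption})
    return url + "?" + params

req = build_send_photo_request("123:ABC", "42", "You've got mail!")
```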
Which means I can check whether it’s: a) the postman - and worth going to get
the mail - or b) a junk mailer - and it can be left on the door mat.
It’s neat, but not exactly ‘smart’.
What if the house identified the person posting and then decided to
tell me whether it’s worth going to get the mail?
Sounds like a problem for a neural network!
I’d collected a snapshot every time mail had been posted or the doorbell was
rung over the last 6 months or so - this amounted to 500 images. I manually
went through these and dropped each into one of two folders: ‘positive’
(postie) and ‘negative’ (everyone else).
Sounds tedious, but you can zip through them in a file browser with large
thumbnails in no time at all. The classes split 171 positive / 329 negative.
I then randomly split these into an 80% training set and a 20% validation set.
It’s not a huge number of images, but as the classification is into just two
classes, it will hopefully suffice. As an observation, the positive examples
tend to be distinctive in that our postie always carries a standard issue
Royal Mail red bag, and wears either a light blue shirt or a reflective
overall - so this is what we’re hoping the neural network learns to spot.
Incidentally, our postie is usually the same lovely chap, but when he’s off
we’ll get another covering, so hopefully there’s enough variety in the
training data that the network doesn’t overfit to match only our usual postie.
Using a convolutional neural network with the great Python library Keras
(TensorFlow backend), I went about training a model.
The input to the neural network was the training data as a tensor, downscaled
to 168x95 (a fixed fraction of the full size HD images from the camera, to
minimise distortion when resizing).
The neural network was 3 layers of convolutions/max pooling, a 32-unit dense
layer, and finally a single-unit dense layer with sigmoid activation for the
output. Since it’s a two-class problem, it was fitted using the binary
crossentropy loss function.
There was also a dropout before the final layer to help prevent overfitting.
Defined as follows:
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout

img_width, img_height = 168, 95
input_shape = (img_height, img_width, 3)

model = Sequential()
# three convolution/max-pooling blocks
model.add(Conv2D(32, (3, 3), activation='relu', input_shape=input_shape))
model.add(MaxPooling2D((2, 2)))
model.add(Conv2D(32, (3, 3), activation='relu'))
model.add(MaxPooling2D((2, 2)))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D((2, 2)))
model.add(Flatten())
model.add(Dense(32, activation='relu'))
model.add(Dropout(0.5))  # dropout before the final layer
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='rmsprop', metrics=['accuracy'])
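As a quick aside on the loss: binary crossentropy measures how far the sigmoid output strays from the true 0/1 label, punishing confident mistakes hardest. A worked example in plain Python:

```python
import math

def binary_crossentropy(y_true, y_pred):
    # -[y*log(p) + (1-y)*log(1-p)]: near zero for confident correct
    # predictions, large for confident wrong ones
    return -(y_true * math.log(y_pred) + (1 - y_true) * math.log(1 - y_pred))

# postie (label 1) predicted at 0.9: small loss (~0.105)
print(binary_crossentropy(1, 0.9))
# postie predicted at only 0.1: much larger loss (~2.303)
print(binary_crossentropy(1, 0.1))
```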
The input images were augmented by rescaling by 1/255 (i.e. pixel values into
the 0-1 range) and with a small random zoom.
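Keras’s ImageDataGenerator handles both of these transforms; in spirit they amount to something like the NumPy sketch below. The zoom factor and the nearest-neighbour resize are assumptions for illustration:

```python
import numpy as np

def rescale(img):
    # map 0-255 uint8 pixel values into the 0-1 range
    return img.astype(np.float32) / 255.0

def random_zoom(img, max_zoom=0.1, rng=None):
    # pick a random zoom factor, crop a centred window of that size,
    # then resize back to the original shape with nearest-neighbour sampling
    rng = rng or np.random.default_rng()
    h, w = img.shape[:2]
    z = 1.0 - rng.uniform(0.0, max_zoom)
    ch, cw = max(1, int(h * z)), max(1, int(w * z))
    top, left = (h - ch) // 2, (w - cw) // 2
    crop = img[top:top + ch, left:left + cw]
    rows = np.arange(h) * ch // h
    cols = np.arange(w) * cw // w
    return crop[rows][:, cols]
```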
I used a batch size of 16 and trained for 25 epochs. The final model reached
an accuracy of 95% on the validation set - I’m pretty happy with that result!
The next challenge was deploying the model - most of my home automation runs
on a relatively lowly SBC (single board computer - an Odroid C2). Like the
Raspberry Pi this runs an ARM processor, and TensorFlow support for ARM is
lacking - it’s hard to find up-to-date pre-built binaries, and I didn’t really
fancy compiling TensorFlow from source on this system.
In the end I deployed the classifier, wrapped in a simple REST API, onto the
AWS Lambda serverless platform. This has sufficient grunt, and as Lambda runs
on Intel processors, TensorFlow is much better supported there. There’s still
a bit of pain in getting the whole package under the size limits imposed by
Lambda, but that’s for another blog post.
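The wrapper is a thin handler: decode the image from the request, run it through the model, and return the score as JSON. A sketch of the shape (the model call is stubbed out here, and the event fields are assumptions about how the image is posted):

```python
import base64
import json

def classify(image_bytes):
    # stand-in for model.predict(); returns a 'postie' probability
    return 0.97

def lambda_handler(event, context):
    # the image arrives base64-encoded in the request body
    image_bytes = base64.b64decode(event["body"])
    score = classify(image_bytes)
    return {
        "statusCode": 200,
        "body": json.dumps({"postie": score > 0.5, "score": score}),
    }
```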