NEW - Now using the Alexa Cameras Recap API allowing for stored recordings to be viewed.
Use Alexa's Smart Home Skill API with standalone IP cameras to stream live video and recorded video to an Alexa device without needing any camera cloud service.
Many people, like me, have IP cameras without a cloud service that they'd like to control using Amazon's Alexa Smart Home Skill API. This API is great but assumes you are using a cloud service for your cameras, and it has very specific security and streaming requirements that make it challenging to connect standalone cameras. In my case I have several Axis IP cameras around the house connected via a LAN to a local Linux machine. The cameras are configured to do motion detection and store recordings on the Linux box. I also use ZoneMinder as a Network Video Recorder, along with its mobile companion app zmNinja. No camera cloud service is needed in my system, which avoids the associated recurring costs, BUT the system is not Alexa compatible "out of the box".
So I created the alexa-ip-camera project to solve this problem and enable me to view my home's live and recorded camera streams on Amazon devices such as the Echo Show, the Echo Spot and the FireTV. To do that I had to develop an Alexa Smart Home skill plus some supporting components running on the local Linux machine.
Note: Please see my related smart-zoneminder project that enables fast upload of ZoneMinder alarm frame images to an S3 archive where they are analyzed by Amazon Rekognition or locally by TensorFlow and made accessible by voice via Alexa.
I hope others find this useful. I've described the project in some detail and outlined the steps below that I used to create this skill.
The system consists of the following main components.
You'll need the following setup before starting this project.
Copy config-template.json to a file called config.json. There are several values in that file that need to be changed to suit your setup. Some of them are described below.
The Steps to Build a Smart Home Skill and Build Smart Home Camera Skills on the Amazon Alexa Developers site give detailed instructions on how to create the skill and how the API works. Replace the Lambda code in the template example with the code in index.js in the lambda directory of this repo. The code emulates the camera configuration data that would normally come from a third-party camera cloud service. You'll have to edit config.json to make it reflect your camera names and specs.
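To illustrate what "emulating the camera configuration data" can look like, here is a minimal sketch of building one endpoint entry for an Alexa.Discovery response. It is not the code in index.js. The camera object and its keys (endpointId, friendlyName, width, height) are hypothetical, and the BASIC, H264 and AAC values are assumptions you should replace with whatever your cameras actually provide.

```javascript
// Hypothetical camera entry; the real config.json keys may differ.
const exampleCamera = {
  endpointId: 'front-porch-camera',
  friendlyName: 'Front Porch',
  manufacturerName: 'Axis',
  width: 1280,
  height: 720
};

// Build one endpoint object for the payload.endpoints array of a Discover.Response.
function buildDiscoveryEndpoint(camera) {
  return {
    endpointId: camera.endpointId,
    manufacturerName: camera.manufacturerName,
    friendlyName: camera.friendlyName,
    description: `${camera.friendlyName} IP camera`,
    displayCategories: ['CAMERA'],
    cookie: {},
    capabilities: [
      {
        type: 'AlexaInterface',
        interface: 'Alexa.CameraStreamController',
        version: '3',
        cameraStreamConfigurations: [
          {
            protocols: ['RTSP'],
            resolutions: [{ width: camera.width, height: camera.height }],
            authorizationTypes: ['BASIC'], // assumption, match your camera
            videoCodecs: ['H264'],         // assumption, match your camera
            audioCodecs: ['AAC']           // assumption, match your camera
          }
        ]
      },
      { type: 'AlexaInterface', interface: 'Alexa', version: '3' }
    ]
  };
}

console.log(JSON.stringify(buildDiscoveryEndpoint(exampleCamera), null, 2));
```

The friendlyName is what you will later say to Alexa, so it is worth choosing names that are easy to pronounce.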
emtunc's blog provides excellent instructions on how to set up the proxy from Live555. I needed to set OutPacketBuffer::maxSize to 400000 bytes in live555ProxyServer.cpp to stop the feed from getting truncated. I didn't make the other changes that emtunc made (port and stream naming).
The RTSP proxy needs to be on a different port than the individual streams. In my case the proxy port is 8554 since the cameras have their RTSP port set to 554. The proxy is therefore started with -p 8554 on the command line. You have to make sure nothing else is using that port on the server running the proxy.
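One way to confirm that nothing else is using the proxy port is to try binding it briefly before starting the proxy. The following is a small Node.js sketch using the built-in net module. The helper name isPortFree is my own, and it treats any bind error (address in use, permission denied, and so on) as "not usable".

```javascript
const net = require('net');

// Resolve true if the TCP port can be bound (nothing else is listening), false otherwise.
function isPortFree(port, host = '0.0.0.0') {
  return new Promise((resolve) => {
    const probe = net.createServer();
    // Any bind error is reported as "not free".
    probe.once('error', () => resolve(false));
    // Release the port again as soon as the bind succeeds.
    probe.once('listening', () => probe.close(() => resolve(true)));
    probe.listen(port, host);
  });
}

isPortFree(8554).then((free) => {
  console.log(free ? 'Port 8554 is free' : 'Port 8554 is not available');
});
```

Run this before the proxy is started. Once the proxy is up, the same check should report that the port is not available.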
The proxy-start script is run as a cronjob as root at boot to start the RTSP proxy. The cronjob is delayed by 60 secs to allow networking to come up first.
I followed the corresponding steps in CameraPi almost exactly except I'm using GoDaddy to manage domains and DNS instead of AWS Route 53. Note: the Let's Encrypt CA issues short-lived certificates (90 days). Make sure you renew the certificates at least once every 3 months.
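Because the certificates only last 90 days, it can help to check how many days remain before they expire. Here is a sketch using Node's built-in tls module. The function names are mine and the host name in the usage comment is a placeholder. Note that with default TLS settings an already expired certificate makes the connection fail, so the promise rejects instead of returning a negative number.

```javascript
const tls = require('tls');

// Days remaining until a certificate's notAfter date (negative if already past).
function daysUntilExpiry(validTo, now = new Date()) {
  const msPerDay = 24 * 60 * 60 * 1000;
  return Math.floor((new Date(validTo).getTime() - now.getTime()) / msPerDay);
}

// Connect to a TLS endpoint and resolve with the days left on its certificate.
function checkCertificate(host, port) {
  return new Promise((resolve, reject) => {
    const socket = tls.connect({ host, port, servername: host }, () => {
      const cert = socket.getPeerCertificate();
      socket.end();
      resolve(daysUntilExpiry(cert.valid_to));
    });
    socket.once('error', reject);
  });
}

// Example usage (placeholder host name):
// checkCertificate('cameras.example.com', 443).then((days) => console.log(`${days} days left`));
```

Running a check like this from a scheduled job and warning when fewer than, say, 30 days remain leaves plenty of time to renew.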
Per the Alexa Smart Home camera documentation you can provide the API a local or remote camera URI. I'm currently providing a local URI, but I did try remote as well since I was a little concerned about putting a private IP address in a DNS record. Local results in lower latency than remote, though not by much (only about 500 ms), and I didn't have to open a port to the Internet in my firewall. The biggest drawback of a local URI is that I can't view my cameras on an Echo device outside my home, for example at work.
stunnel is available as an Ubuntu package, so it is easy to install using apt-get as root. The configuration I used is in the file stunnel.conf, which is placed in /etc/stunnel/stunnel.conf on my machine. stunnel is started at boot by a cronjob running as root. The cronjob is delayed by 60 secs to allow networking to come up first.
I created a user for Alexa access and a streaming profile for each camera. The settings for the profile are shown in the table below. Note the specific settings. These are the only values that have been tested so you should use the same or be prepared to experiment.
|Setting|Value|Notes|
|---|---|---|
|Encoder Max Frame Rate|Unlimited|NA|
|Encoder Bit Rate Control|Variable|NA|
Most modern IP cameras allow you to store a recording to a local drive triggered from motion detection or another event. This needs to be enabled to use the Alexa Cameras Recap API which allows you to view those recordings.
You'll need to change config.json, which configures the process-events.js app that processes the recordings, and most likely the app itself, to suit the particular way your camera stores recordings. The code here has only been tested against Axis cameras. The process-events.js app is run as a Linux service using systemd.
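As a rough illustration of the kind of logic you may need to adapt, the sketch below lists recording files in a directory that were modified after a given time. It is not the code in process-events.js. The flat directory layout, the .mkv and .mp4 extensions, and the function name findNewRecordings are all assumptions, and your camera may organize its recordings quite differently.

```javascript
const fs = require('fs');
const path = require('path');

// List recording files in `dir` modified after `sinceMs` (epoch milliseconds), oldest first.
// Assumes a flat directory and .mkv/.mp4 files; adjust for your camera's layout.
function findNewRecordings(dir, sinceMs, extensions = ['.mkv', '.mp4']) {
  return fs.readdirSync(dir, { withFileTypes: true })
    .filter((entry) => entry.isFile()
      && extensions.includes(path.extname(entry.name).toLowerCase()))
    .map((entry) => {
      const file = path.join(dir, entry.name);
      return { file, modifiedMs: fs.statSync(file).mtimeMs };
    })
    .filter((rec) => rec.modifiedMs > sinceMs)
    .sort((a, b) => a.modifiedMs - b.modifiedMs);
}
```

A long-running service could call this periodically, remembering the newest modifiedMs it has already handled, and then report each new recording.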
As mentioned above the recording metadata is sent to the Alexa Gateway. This is done asynchronously and so you must provide the proper authentication information with the request. Follow the steps outlined in Authenticate a Customer to Alexa with Permissions to make this happen. You'll also need to add the relevant information to config.json.
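For orientation, here is a sketch of what the asynchronous event body might look like. The field names follow my understanding of the Alexa.MediaMetadata interface (MediaCreatedOrUpdated event) and should be verified against the current Alexa documentation. The shape of the recording argument and the function name are hypothetical, and the codec values are assumptions.

```javascript
const { randomUUID } = require('crypto');

// Build an Alexa.MediaMetadata MediaCreatedOrUpdated event for one recording.
// `recording` is a hypothetical object; `accessToken` is the customer's bearer token.
function buildMediaCreatedEvent(recording, accessToken) {
  return {
    event: {
      header: {
        namespace: 'Alexa.MediaMetadata',
        name: 'MediaCreatedOrUpdated',
        messageId: randomUUID(),
        payloadVersion: '3'
      },
      endpoint: {
        scope: { type: 'BearerToken', token: accessToken },
        endpointId: recording.endpointId
      },
      payload: {
        media: {
          id: recording.id,
          cause: 'MOTION_DETECTED',
          recording: {
            name: recording.name,
            startTime: recording.startTime, // ISO 8601 timestamps
            endTime: recording.endTime,
            videoCodec: 'H264',             // assumption, match your camera
            audioCodec: 'NONE',             // assumption, match your camera
            uri: { value: recording.uri, expireTime: recording.uriExpireTime },
            thumbnailUri: {
              value: recording.thumbnailUri,
              expireTime: recording.uriExpireTime
            }
          }
        }
      }
    }
  };
}
```

The resulting object would be sent as JSON to the Alexa event gateway for your region, with the same access token in the Authorization header.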
A webserver is required to serve up the camera recordings to the skill's Lambda function running in the AWS cloud. I'm using Apache for this purpose. I just created a virtual host that points to the directory where the cameras store their recordings. Note that the Alexa Smart Home API requires this connection to be over HTTPS, and self-signed certs may not be used, as outlined above in the SSL cert section.
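Since the virtual host maps a directory to a URL, each recording's local path has to be translated into the HTTPS URL that Alexa will fetch. Here is a minimal sketch of that mapping. The function name, the base URL and the directory in the usage example are placeholders, not values from this project.

```javascript
const path = require('path');

// Map a local recording path to the HTTPS URL served by the virtual host.
function recordingUrl(baseUrl, recordingsRoot, filePath) {
  if (new URL(baseUrl).protocol !== 'https:') {
    throw new Error('Recordings must be served over HTTPS');
  }
  const relative = path.relative(recordingsRoot, filePath);
  // Refuse paths that escape the directory the web server exposes.
  if (relative === '..' || relative.startsWith(`..${path.sep}`) || path.isAbsolute(relative)) {
    throw new Error(`${filePath} is outside ${recordingsRoot}`);
  }
  // Percent-encode each path segment so spaces and similar characters are URL-safe.
  const encoded = relative.split(path.sep).map(encodeURIComponent).join('/');
  return new URL(encoded, baseUrl.endsWith('/') ? baseUrl : `${baseUrl}/`).toString();
}

console.log(recordingUrl(
  'https://cameras.example.com/recordings',
  '/var/recordings',
  '/var/recordings/front porch/clip 1.mkv'
));
```

Rejecting non-HTTPS base URLs early mirrors the API requirement and avoids sending Alexa a link it will refuse to load.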
Once everything is set up you need to enable your skill in the Alexa companion mobile app or web app. Then ask Alexa to "discover devices" and your camera(s) should be found. Alexa will tell you so, and the cameras will be visible in the app. After that just ask Alexa to "show front porch camera" (or whatever you named them) and the camera video will be streamed to your Echo device with a screen. Or you can say "Alexa, show the event that just happened at the front porch camera" to see the last recorded event.
Overall the skill works well, but the latency between asking Alexa to show a camera and the video appearing on the Echo or Fire TV screen is a little too long for a great experience, about 3 secs on average. I haven't yet tracked down the cause.
Also, I've seen the video re-buffer occasionally, which can be irritating, and once in a great while the video freezes during rebuffering. I've found that the camera settings above minimize the buffering across all the Amazon Alexa devices I've tested. I think the source of the buffering is the video decode time in the device, which varies across device types due to hardware capabilities. Since the video is delivered to the device from the camera via TCP (stunnel uses SSL over TCP), the network will not let the device discard packets when it gets behind in its decoding. I don't know of a way to use stunnel with UDP, which would obviate this issue. The table below shows the device types I've tested and their video quality performance.
|Device|Rebuffering frequency|Notes|
|---|---|---|
|Fire TV Cube|Never|Expected since the device is optimized for video.|
|Fire TV Stick 4K|Never|Expected since the device is optimized for video.|
|Echo Show Gen 1|Rarely| |
|Echo Show Gen 2|Rarely| |
|Fire HD 10 Tablet|Occasionally|Expected given its hardware capabilities.|
I looked for other projects on GitHub for code to leverage but didn't find anything that solved my particular problem exactly. However, I did find an excellent repo called CameraPi from Sam Machin that describes how to use Alexa to control a camera connected to a Raspberry Pi, and I used it as a basis for my effort. Thank you Sam!
I used emtunc's very cool blog to learn how to set up the RTSP proxy. Thank you emtunc!