Playing with Pi

A few months ago I decided to join the party and pick up a Raspberry Pi. It's a $25 full-fledged ARM-based computer the size of a credit card. There's also a $35 version, and I've bought a handful of those so far. At that price you can dedicate a computer to applications where it otherwise wouldn't be justified or practical. Since then I've been poring over the different things people have done with their Pi. Here are some that interest me:

  • Setting up security cameras or other dedicated cameras like a traffic cam or bird feeder camera
  • RaspyFi - streaming music player
  • Offsite backup
  • Spotify server
  • Carputer - black box for your car
  • Dashcam for my car
  • Home alarm system
  • Digital signage for the family business
  • Console emulator for old-school consoles
  • Grocery inventory tracker

Since the Pi runs an ARM-based version of Linux, I'm already familiar with practically everything on that list. The OS I've loaded is Raspbian, a Debian variant, which makes it a lot easier to get up and running.

After recently divesting myself of some large business responsibilities, I've had more personal time to dedicate to things like this. Add in the vacation I took over Christmas and New Year's and I had the perfect recipe to dive head-first into a Pi project. What I chose was something that I've always wanted.

The database and Big Data lover in me wants data, lots of it. So I've gone with building a black box for my car that runs all the time the car is on, and logs as much data as I can capture. This includes:

  • OBD2
  • GPS
  • Dashcam
  • and more

Once you've got a daemon running and its inputs are being saved, everything else is just another input. It doesn't matter what it is. It's all just input data.

My initial goal is to build a black box that constantly logs OBD2 data and stores it in a database. Looking around at what's out there for OBD2 software, I don't see anything that's built for long-term logging. All the software out there is meant for two use cases: 1) live monitoring and 2) tuning the ECU to get more power out of the car. What I want is a third use case: long-term logging of all available OBD2 data to a database for analysis.

In order to store all this data I decided to build an OBD2 storage architecture made up of:

  • MySQL database
  • JSON + REST web services API
  • SDK that existing OBD2 software would use to store the data it's capturing
  • A wrapper around existing open source OBD2 capture software so it runs as a daemon on the Pi
  • A local storage buffer for logged data, which gets synced to the cloud storage service when there's an internet connection

Right now I'm just doing this for myself. But I'm also reaching out to developers of OBD2 software to gauge interest in adding this storage service to their work. In addition to the storage, an API can be added for reading back the data, such as pulling DTCs (diagnostic trouble codes, the error codes), getting trends and summary data, and more.

The first SDK I wrote was in Python. It's available on GitHub. It includes API calls to register an email address and get an API key. After that, there are some simple logging functions to save a single PID (an OBD2 data point such as RPM or engine temp). Since this has to run without an internet connection, I've implemented a buffer. The SDK writes to a buffer in local storage, and whenever there's an internet connection a background sync daemon pulls data off the buffer, sends it to the API and removes the item from the buffer. Since this is all JSON and very simple key-value data, I've gone with a NoSQL approach and used MongoDB for the buffer.
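To make the buffer-and-sync idea concrete, here's a minimal sketch of the pattern. It is not the actual SDK: SQLite from the Python standard library stands in for MongoDB so the example runs on its own, and the names `LocalBuffer`, `log_pid`, `sync` and the `send` callback are all made up for illustration.

```python
import json
import sqlite3
import time


class LocalBuffer:
    """Store-and-forward queue for PID readings (SQLite stands in for MongoDB)."""

    def __init__(self, path=":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS buffer ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, payload TEXT NOT NULL)"
        )

    def log_pid(self, pid, value, timestamp=None):
        """Queue one reading locally; this never touches the network."""
        ts = time.time() if timestamp is None else timestamp
        record = {"pid": pid, "value": value, "ts": ts}
        with self.db:  # commits on success
            self.db.execute(
                "INSERT INTO buffer (payload) VALUES (?)", (json.dumps(record),)
            )

    def pending(self):
        """Number of readings still waiting to be uploaded."""
        return self.db.execute("SELECT COUNT(*) FROM buffer").fetchone()[0]

    def sync(self, send, batch_size=100):
        """Upload queued readings oldest-first.

        `send` is a hypothetical callable that posts one record to the API
        and returns True on success. A record is deleted only after it has
        been sent. Returns the number of records uploaded.
        """
        rows = self.db.execute(
            "SELECT id, payload FROM buffer ORDER BY id LIMIT ?", (batch_size,)
        ).fetchall()
        sent = 0
        for row_id, payload in rows:
            if not send(json.loads(payload)):
                break  # connection dropped: keep the rest for the next pass
            with self.db:
                self.db.execute("DELETE FROM buffer WHERE id = ?", (row_id,))
            sent += 1
        return sent
```

The point of deleting only after a successful send is that a dropped connection never loses data. The trade-off is that a record could be uploaded twice if the connection dies between the send and the delete, so the API side would need to tolerate duplicates.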

The API is built in PHP and runs on a standard Linux VPS under Apache. At this point the entire stack has been built. The code's nowhere near production-ready and is missing some features, but it works well enough to demo. I've built a test utility that simulates a client car logging 10 times per second. Each time it logs 10 different PIDs. This all builds up in the local buffer, and the sync script then clears it out and uploads it to the API. At that rate, the client generates 100 data points per second. For a car being driven an average of 101 minutes per day, that's 606,000 data points per day.
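The arithmetic behind that figure is easy to play with. This is just a back-of-the-envelope calculation using the numbers above; the function name and the yearly extrapolation (which assumes the car is driven every day) are my own additions.

```python
LOGS_PER_SECOND = 10          # test utility logs 10 times per second
PIDS_PER_LOG = 10             # each log captures 10 different PIDs
MINUTES_DRIVEN_PER_DAY = 101  # average daily driving time used above


def points_per_day(logs_per_second, pids_per_log, minutes_per_day):
    """Data points generated by one car in one day of driving."""
    return logs_per_second * pids_per_log * minutes_per_day * 60


daily = points_per_day(LOGS_PER_SECOND, PIDS_PER_LOG, MINUTES_DRIVEN_PER_DAY)
print(daily)        # 606000
print(daily * 365)  # 221190000, assuming the car is driven every day
```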

The volume of data will add up fast. For starters, the main database table I'm using stores all the PIDs as strings and stores each one as a separate record. In the future, I'll evaluate pivoting this data so that each PID has its own field (and appropriate data type) in a table. We'll see which method proves more efficient and easier to query. The OBD2 spec lists all the possible PIDs. Car manufacturers aren't required to use them all, and they can add in their own proprietary ones too. Hence my ambivalence for now about creating a logging table that contains a field for each PID. If most of the fields are empty, that's a lot of wasted storage.
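Here's a small sketch of the two layouts, to show what the pivot would look like. It uses an in-memory SQLite database as a stand-in for MySQL, and the table name `pid_log`, its columns, and the sample readings are invented for the example rather than taken from my real schema.

```python
import sqlite3

db = sqlite3.connect(":memory:")

# Current layout: one record per PID reading, every value stored as a string.
db.execute("CREATE TABLE pid_log (ts REAL, pid TEXT, value TEXT)")
db.executemany(
    "INSERT INTO pid_log VALUES (?, ?, ?)",
    [
        (1.0, "RPM", "2500"),
        (1.0, "SPEED", "62"),
        (1.1, "RPM", "2600"),
        (1.1, "SPEED", "63"),
        (1.2, "RPM", "2700"),  # no SPEED reading at this timestamp
    ],
)

# Pivoted layout: one row per timestamp, one typed column per PID.
# A PID that wasn't captured shows up as NULL (None in Python).
wide_rows = db.execute(
    "SELECT ts, "
    "MAX(CASE WHEN pid = 'RPM' THEN CAST(value AS INTEGER) END) AS rpm, "
    "MAX(CASE WHEN pid = 'SPEED' THEN CAST(value AS INTEGER) END) AS speed "
    "FROM pid_log GROUP BY ts ORDER BY ts"
).fetchall()

print(wide_rows)  # [(1.0, 2500, 62), (1.1, 2600, 63), (1.2, 2700, None)]
```

The `None` in the last row is the wasted-storage concern in miniature: with a column for every possible PID, a car that only reports a few of them would leave most of each row empty.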

Systems integration is much more of a factor in this project than coding each underlying piece. Each underlying piece, from what I've found, has already been coded somewhere by some enthusiast. Open source Python code already exists for reading OBD2 data. That solves a major coding headache and makes it easier to plug my SDK into it.

There are some useful smartphone apps that can connect to a Bluetooth OBD2 reader to pull the data. Even if they were to use my SDK, a phone still isn't an ideal solution for logging. In order to log this data, you need a dedicated device that's always on when the car's on and always logging. Using a smartphone can get you most of the way there, but there'll be gaps. That's why I'm focusing on using my Pi as a black box for this purpose.