The Boat Computer: a tragedy in six parts

26 May 2016

The boat needs a computer. Easy. I do computers, and have done since I were a nipper. I've built over a dozen and have been using Linux for 20 years. Building boats, that's new to me, but computers? Yawn, whatever. Anyway, my needs are actually pretty simple:

  • Will run from 12V or less
  • Has no moving parts - this means no fan for cooling, and solid-state storage
  • Low power, but gutsy enough to run my Java application on Linux (more on that later)
  • Can support a lot of USB devices - although many of these are my custom circuit boards, the "U" in USB stands for Universal. It works everywhere. Right?

Rant follows. If you can't be bothered, I'll sum up. I spent close to £1000, about £900 of which was destroyed, thrown away or otherwise useless. Oh, and it took close to two years. Hooray for computers.

Round 1: Beaglebone Black

My first system of choice was the Beaglebone Black. This offered (at the time) a much faster CPU than the Raspberry Pi, and had lots of nice GPIO pins I could tap into. I even designed a nice custom cape for it which had a GPS, six UARTs, magnetometer, barometer and a bunch of other bits. This all worked well.

What didn't work was USB, at least not reliably. At the time I had a choice of two kernels to boot from: an outdated one with working USB but no GPIO, and a newer one with GPIO but unreliable USB. And that was how it stayed for some months. It turned out there was only one guy maintaining the kernel for the BeagleBone Black, and he was a volunteer, not an employee.

That was it for me. My lovingly crafted cape went in the rubbish.

Round 2: Hummingboard

Another Raspberry-Pi alike but with more power, the Hummingboard Pro i2eX had promise but failed to deliver in two crucial areas.

First, it ran hot, or rather the mSATA storage did. To quantify this, when I removed it from its plastic case I found the case scorched and almost melted through. The computer was rated to only 40°C, so as the temperature of the mSATA storage climbed through 100°C things went pear shaped pretty quickly.

Also, the CPU package had a tendency to fall off when the board was bumped. Into the rubbish went the Hummingboard.

Round 3: Fit PC3

Next was a Fit-PC3. My original choice, I'd decided it was too power-hungry and had relegated it to run as a development testbed in my basement for a couple of years. After my previous two disappointments with more hobbyist-type devices, I pressed it into service, but sadly I never fully populated it with USB devices. I didn't have the chance, as while I was wiring it into the UPS I accidentally reversed the polarity of the power connector. It no longer worked. Although it did get very, very hot.

Into the rubbish it went.

Round 4: Fit PC4

This is starting to get silly.

I splashed out on a Fit-PC4, the replacement for the Fit-PC3 I'd just cooked. Another nice design, it was wired into the boat: four USB hubs, 20+ USB devices, many of which were my own custom boards.

And it worked too, for a time.

The first signs it wouldn't last were log messages showing one of the four CPU cores had hung - soon after the machine would lock and I would have to power-cycle it. The log messages stopped, but the lockups continued, and my uptime slowly dropped from weeks to days. There was no pattern to it and trying to debug it was impossible - there were no logs, and with no screen plugged in I had no idea what was going on.

Eventually I managed to track this down with some certainty to the USB system, and not just my custom devices. It was under warranty so, after complaining long and loud, I sent it back to the seller. Of course, they were unable to reproduce the problem and returned it after several weeks.

The computer has to be reliable and must be quickly replaceable, so with only weeks to launch date this wasn't going to work out. I had to find another solution.

The not-so-universal serial bus

You may wonder why I didn't just go for a Raspberry Pi.

When I started this process the Pi was still generation one, which was very, very slow. But the Pi 3 was released around the same time the Fit-PC4 went back to the shop, so I started testing. And there was a problem.

Over the last three years I'd designed and built a very large number of custom circuit boards for the boat, handling everything from engine ignition to the fridge. All of these are based around the same microcontroller, the Atmel ATmega32U4, which I chose because it had native support for USB serial via the "CDC" class of USB device. This is the chip used in the Arduino Leonardo, Arduino Micro, Teensy 2.0 and a number of other similar boards. CDC is defined in USB 1.1 and can transmit at up to 12Mbps.

Now, if you plug a USB 1.1 device into a USB 2.0 hub, the hub has to translate the traffic between the two speeds. Most hubs have a single translator per hub (known as "Single TT") rather than a translator per port ("Multi TT"). Apparently this isn't an issue for most computers, but - uniquely - the Raspberry Pi internal hub is also Single TT: a quirk that comes from the mobile heritage of the BCM283x chips that power them.

All this means if you plug multiple USB 1.1 devices into the Raspberry Pi without connecting them through a Multi-TT USB hub, bad things happen. Like what, you may ask? Well, I checked: half the devices fail to enumerate and the ones that do lose much of their data. It's quite unworkable, and it means if I did switch to a Raspberry Pi I would have to replace all the USB hubs on the boat with Multi-TT hubs.

But the only hubs I could find had four ports, and with so many devices, I would exceed the maximum length of devices in the USB "chain" - you can only have so many hubs between the computer and a device. So the Raspberry Pi was out.
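To put rough numbers on that limit: USB 2.0 allows at most five hubs in series between the host's root hub and a device, and the Pi's onboard hub uses up one of them. The sketch below is only an illustration under an assumption that is not from the boat's actual wiring, namely that the external hubs are daisy-chained in a single line, with each hub giving up one port to feed the next. The function name and parameters are made up for the example.

```python
# Assumptions (not from the original post): hubs are daisy-chained in a
# single line, and each hub gives up one downstream port to feed the next.
USB_MAX_HUBS_IN_SERIES = 5  # USB 2.0 limit on hubs between root hub and device


def max_devices_in_chain(ports_per_hub: int, onboard_hubs: int = 1) -> int:
    """Devices that fit on one daisy chain of identical external hubs."""
    external_hubs = USB_MAX_HUBS_IN_SERIES - onboard_hubs
    if external_hubs <= 0 or ports_per_hub < 1:
        return 0
    # Every hub except the last loses one port to the next hub in the chain.
    return external_hubs * (ports_per_hub - 1) + 1


if __name__ == "__main__":
    for ports in (4, 7):
        print(f"{ports}-port hubs: up to {max_devices_in_chain(ports)} devices")
```

Under those assumptions a chain of 4-port hubs tops out at 13 devices, while 7-port hubs reach 25. A star layout changes the arithmetic, but the trend is the same: fewer ports per hub means more tiers for the same number of devices.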

Round 5: Odroid C2

Equipped with this knowledge, I started looking for a small single-board computer that didn't have this limitation. The Odroid C2 had just been released and looked like it might do the job. I bought one, fought it for a day to get it set up and took it down to the boat.

Not even close. When plugged into the four USB hubs and 20+ devices I had, I was unable to communicate at all with the computer, due to what appeared to be the complete failure of the USB system. No keyboard, no USB wifi card, no way in.

On the plus side, I hadn't set this one on fire, so we're making progress. Put aside to sell on eBay.

The search for USB hubs

Enough is enough. I started looking for Multi-TT USB hubs with more than four ports, and eventually I found one. It looked surprisingly like an existing hub I had which I knew was Single-TT, but case designs are pretty generic. These hubs were only available in the US and the company wouldn't ship to the UK, so I used a shipping forwarder. £90 of shipping and duty later, I had four hubs. I plugged them in and they were Single-TT, not Multi-TT as advertised.
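For anyone wanting to check a hub without trusting the box, here is a minimal sketch of one way to do it on Linux. It is not necessarily how the hubs above were tested. It relies on the USB 2.0 hub descriptor, where a hub (device class 09) reports bDeviceProtocol 01 for Single-TT and 02 for Multi-TT, and it assumes the kernel exposes those fields as two-digit hex files under /sys/bus/usb/devices. The function name and the sysfs_root parameter are invented for the example.

```python
from pathlib import Path

# bDeviceProtocol values for USB hubs (bDeviceClass 09), per the USB 2.0 spec.
TT_KINDS = {
    "00": "no TT (full-speed hub or root hub)",
    "01": "Single-TT",
    "02": "Multi-TT",
}


def classify_hubs(sysfs_root: str = "/sys/bus/usb/devices") -> dict:
    """Map each hub's sysfs name to its transaction-translator type."""
    hubs = {}
    for dev in sorted(Path(sysfs_root).iterdir()):
        cls_file = dev / "bDeviceClass"
        proto_file = dev / "bDeviceProtocol"
        if not (cls_file.is_file() and proto_file.is_file()):
            continue  # interface entries such as "1-1:1.0" lack these files
        if cls_file.read_text().strip() != "09":
            continue  # not a hub
        proto = proto_file.read_text().strip()
        hubs[dev.name] = TT_KINDS.get(proto, f"unknown ({proto})")
    return hubs


if __name__ == "__main__":
    root = Path("/sys/bus/usb/devices")
    if root.is_dir():
        for name, kind in classify_hubs(str(root)).items():
            print(f"{name}: {kind}")
```

The hub needs to be plugged into a high-speed port for the result to mean anything, since the translator only exists when the hub is running at 480M.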

At this point I have left annoyed well behind. I have reached thoroughly f***ked off. Hubs returned to the US, at my expense, for a refund.

I trawl every forum I can find, and finally assemble a shortlist of exactly three Multi-TT hubs with more than four ports:

  • Actionstar LinXcel, manufactured in Taiwan but apparently only sold in the US. No thanks, not going there again.
  • Elektron Overhub, a 7 port hub manufactured by a company in Sweden to fill a niche I hadn't considered, namely low-speed USB devices used by musicians who can't abide the latency introduced by Single-TT. Nice hub and the manufacturer confirmed it was definitely Multi-TT, but it's pricey.
  • UUGear 7 port hub. Finally! A relatively inexpensive 7-port Multi-TT hub, designed and built in the Czech Republic explicitly for the Raspberry Pi. Inexpensive because it's just a bare circuit board, but I have a few of those already so can live with it. I order five, and a Raspberry Pi 3.

Round 6: Raspberry Pi 3, and a new set of hubs

Raspberry Pi 3 and UUGear hub, squashed into a Pibow case. Bottom layer replaced with FR4 for mounting.

Which brings us up to date, and finally things are looking good. I have a Pi 3 connected to a UUGear hub, which I've managed to wedge into a PiBow case, with the addition of some M2.5 standoffs and a few tactical breakages of the acrylic case. I threw away their stupid plastic screws and used proper metal ones. Why do Pimoroni ship stupid plastic screws? Buy some M2.5 x 30mm socket caps people, they're not expensive.

The new hubs take a 5V supply from a Micro-USB socket, not 12V like my existing hubs. I found some neatly potted buck converters with Micro-USB plugs on eBay - they arrived very quickly, thank you Hong Kong post.

I threw away my crappy old generic Micro-SD cards and bought a few Samsung Evo Plus cards. This was a blindingly good decision, and has probably doubled the apparent speed of the Pi for only a few quid more.

Installing the new system

Yesterday I ripped out all the old hubs, replacing them all with 3 x UUGear 7 port hubs, and appropriate power connectors. And you know what? It works. To give you an idea of the pain I am causing the USB subsystem, here is the output of lsusb -t:

/:  Bus 01.Port 1: Dev 1, Class=root_hub, Driver=dwc_otg/1p, 480M
    |__ Port 1: Dev 2, If 0, Class=Hub, Driver=hub/5p, 480M
        |__ Port 1: Dev 3, If 0, Class=Vendor Specific Class, Driver=smsc95xx, 480M
        |__ Port 2: Dev 4, If 0, Class=Hub, Driver=hub/7p, 480M
            |__ Port 2: Dev 7, If 0, Class=Communications, Driver=cdc_acm, 12M
            |__ Port 2: Dev 7, If 1, Class=CDC Data, Driver=cdc_acm, 12M
            |__ Port 2: Dev 7, If 2, Class=Human Interface Device, Driver=usbhid, 12M
            |__ Port 3: Dev 11, If 0, Class=Communications, Driver=cdc_acm, 12M
            |__ Port 3: Dev 11, If 1, Class=CDC Data, Driver=cdc_acm, 12M
            |__ Port 4: Dev 15, If 0, Class=Hub, Driver=hub/4p, 480M
                |__ Port 4: Dev 19, If 0, Class=Audio, Driver=snd-usb-audio, 12M
                |__ Port 4: Dev 19, If 1, Class=Audio, Driver=snd-usb-audio, 12M
                |__ Port 4: Dev 19, If 2, Class=Audio, Driver=snd-usb-audio, 12M
                |__ Port 4: Dev 19, If 3, Class=Human Interface Device, Driver=usbhid, 12M
        |__ Port 3: Dev 5, If 0, Class=Hub, Driver=hub/7p, 480M
            |__ Port 1: Dev 8, If 0, Class=Communications, Driver=cdc_acm, 12M
            |__ Port 1: Dev 8, If 1, Class=CDC Data, Driver=cdc_acm, 12M
            |__ Port 2: Dev 12, If 0, Class=Communications, Driver=cdc_acm, 12M
            |__ Port 2: Dev 12, If 1, Class=CDC Data, Driver=cdc_acm, 12M
            |__ Port 3: Dev 16, If 0, Class=Communications, Driver=cdc_acm, 12M
            |__ Port 3: Dev 16, If 1, Class=CDC Data, Driver=cdc_acm, 12M
            |__ Port 4: Dev 20, If 0, Class=Communications, Driver=cdc_acm, 12M
            |__ Port 4: Dev 20, If 1, Class=CDC Data, Driver=cdc_acm, 12M
            |__ Port 6: Dev 22, If 0, Class=Communications, Driver=cdc_acm, 12M
            |__ Port 6: Dev 22, If 1, Class=CDC Data, Driver=cdc_acm, 12M
        |__ Port 4: Dev 6, If 0, Class=Hub, Driver=hub/7p, 480M
            |__ Port 1: Dev 10, If 0, Class=Vendor Specific Class, Driver=ftdi_sio, 12M
            |__ Port 5: Dev 14, If 0, Class=Communications, Driver=cdc_acm, 12M
            |__ Port 5: Dev 14, If 1, Class=CDC Data, Driver=cdc_acm, 12M
            |__ Port 6: Dev 18, If 0, Class=Vendor Specific Class, Driver=ftdi_sio, 480M
        |__ Port 5: Dev 9, If 0, Class=Hub, Driver=hub/7p, 480M
            |__ Port 3: Dev 13, If 0, Class=Communications, Driver=cdc_acm, 12M
            |__ Port 3: Dev 13, If 1, Class=CDC Data, Driver=cdc_acm, 12M
            |__ Port 4: Dev 17, If 0, Class=Vendor Specific Class, Driver=ath9k_htc, 480M
            |__ Port 5: Dev 21, If 0, Class=Communications, Driver=cdc_acm, 12M
            |__ Port 5: Dev 21, If 1, Class=CDC Data, Driver=cdc_acm, 12M
            |__ Port 6: Dev 23, If 0, Class=Vendor Specific Class, Driver=ftdi_sio, 480M
            |__ Port 7: Dev 24, If 0, Class=Human Interface Device, Driver=usbhid, 12M
    

I currently have 10 CDC serial devices, with at least two more to follow. There are also three other serial devices, a wireless card and an audio device, plus the four hubs (and one 4-port hub hanging off one of those), and a few more things will go in before splash - 17 devices and 5 hubs. Given how I got here I won't properly relax until I've had a couple of weeks uptime, but it's looking good. Total power draw of all of these devices is about 650-700mA, which is half what it was with the Fit-PC4.
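Counting these by eye gets tedious, so here is a small sketch that tallies devices per driver and reports how deep the tree goes. It assumes the lsusb -t layout shown above (four spaces of indent per level, one bus), and the SAMPLE text is a trimmed excerpt of that listing rather than the full thing. The function and variable names are made up for the example.

```python
import re
from collections import Counter

# Trimmed excerpt of the listing above, not the full tree.
SAMPLE = """\
/:  Bus 01.Port 1: Dev 1, Class=root_hub, Driver=dwc_otg/1p, 480M
    |__ Port 1: Dev 2, If 0, Class=Hub, Driver=hub/5p, 480M
        |__ Port 1: Dev 3, If 0, Class=Vendor Specific Class, Driver=smsc95xx, 480M
        |__ Port 2: Dev 4, If 0, Class=Hub, Driver=hub/7p, 480M
            |__ Port 2: Dev 7, If 0, Class=Communications, Driver=cdc_acm, 12M
            |__ Port 2: Dev 7, If 1, Class=CDC Data, Driver=cdc_acm, 12M
            |__ Port 2: Dev 7, If 2, Class=Human Interface Device, Driver=usbhid, 12M
            |__ Port 4: Dev 15, If 0, Class=Hub, Driver=hub/4p, 480M
                |__ Port 4: Dev 19, If 0, Class=Audio, Driver=snd-usb-audio, 12M
                |__ Port 4: Dev 19, If 3, Class=Human Interface Device, Driver=usbhid, 12M
        |__ Port 4: Dev 6, If 0, Class=Hub, Driver=hub/7p, 480M
            |__ Port 1: Dev 10, If 0, Class=Vendor Specific Class, Driver=ftdi_sio, 12M
            |__ Port 5: Dev 14, If 0, Class=Communications, Driver=cdc_acm, 12M
            |__ Port 5: Dev 14, If 1, Class=CDC Data, Driver=cdc_acm, 12M
"""

LINE_RE = re.compile(
    r"^(?P<indent>\s*)(?:\|__|/:)\s+.*?Dev (?P<dev>\d+),.*?Driver=(?P<driver>[^,/]+)"
)


def summarise(lsusb_tree: str):
    """Return (devices per driver, deepest tree level) for one bus."""
    drivers_by_dev = {}
    deepest = 0
    for line in lsusb_tree.splitlines():
        m = LINE_RE.match(line)
        if not m:
            continue
        # A device with several interfaces appears on several lines;
        # collecting drivers in a set counts it once per driver.
        drivers_by_dev.setdefault(int(m["dev"]), set()).add(m["driver"])
        deepest = max(deepest, len(m["indent"]) // 4)
    per_driver = Counter(d for drivers in drivers_by_dev.values() for d in drivers)
    return per_driver, deepest


if __name__ == "__main__":
    counts, depth = summarise(SAMPLE)
    for driver, n in sorted(counts.items()):
        print(f"{driver}: {n}")
    print(f"deepest level below the root hub: {depth}")
```

To run it against a live system, pipe the real output of lsusb -t into summarise() in place of SAMPLE. With more than one bus the device numbers repeat, so each bus would need handling separately.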

If you're the kind of individual that likes looking at pictures of USB hubs and lots of still-to-be-tidied wiring, today is going to be a good day for you. Here are some pictures.

Hub 1, behind the switch panel. Hub 2, under a seat in the space forward of the centreboard trunk. Hub 3 and 4 (with Pi), in the port quarter under the cockpit.

PostScript: 7th June

So close, so close. Most of the time things are ticking over quite nicely, but I am seeing occasional corruption in the streams of data coming from the devices. Specifically, I am seeing chunks of bytes go missing in the middle of the data. It took me a while to figure out what caused this one: the software stack on the Pi was not reading the data fast enough from the serial ports.

Let's say every second my device sends out an update of 800 bytes over USB. The firmware I'm using on my boards (derived from the excellent Teensy 2.0 firmware written by Paul Stoffregen) only has a transmit buffer of 64 bytes, so any further writes when this is full will block, and eventually time out after 15ms. That's a long time at this level of communication, but with a slow device like the Pi trying to read with a large number of devices potentially all transmitting at once, occasionally the buffers wouldn't clear in time and data would be discarded. The Pi would then catch up and normal transmission would resume, minus the skipped bytes in the middle.
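A toy model makes the failure mode easier to see. This is not the firmware itself, just a sketch under simplifying assumptions: the 800-byte update goes out in 64-byte buffer-loads, each buffer-load waits some number of milliseconds for the host to drain it, and any buffer-load that waits longer than the timeout is thrown away. The delay figures are invented for illustration, as are the function and variable names.

```python
# Hypothetical per-buffer waits in ms: mostly fast, with two slow spells
# where the host is busy servicing other devices.
DRAIN_DELAYS_MS = [1, 1, 1, 20, 1, 1, 1, 1, 40, 1, 1, 1, 1]


def bytes_delivered(payload_len, buffer_size, drain_delays_ms, timeout_ms):
    """Return (delivered, dropped) byte counts for one update."""
    delivered = dropped = 0
    offset = 0
    chunk_index = 0
    while offset < payload_len:
        chunk = min(buffer_size, payload_len - offset)
        delay = drain_delays_ms[chunk_index % len(drain_delays_ms)]
        if delay <= timeout_ms:
            delivered += chunk  # host drained the buffer in time
        else:
            dropped += chunk  # write timed out, this chunk is discarded
        offset += chunk
        chunk_index += 1
    return delivered, dropped


if __name__ == "__main__":
    for timeout in (15, 50):
        ok, lost = bytes_delivered(800, 64, DRAIN_DELAYS_MS, timeout)
        print(f"timeout {timeout}ms: {ok} bytes delivered, {lost} dropped")
```

With these made-up delays, a 15ms timeout loses two 64-byte chunks from the middle of the update (672 bytes delivered, 128 dropped), while a 50ms timeout delivers all 800. The other way to get the same result is to make the delays shorter, which is what a faster reader on the host does.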

I'll reflash all the boards to have a timeout of 50ms and have shaved down my Serial read thread for speed, so we'll see what happens.

PostScript: 15th June

Rewriting the Serial read thread has done the trick - no exceptions for a week. I've tested the firmware change and it causes no issues so will run with both, but at least I've confirmed that was definitely the issue. The computer is now officially stable!

PostScript: 9th September

Stable, but also heavily loaded. Too heavily loaded to reliably play audio via a USB audio card - I get frequent skips and pops. While this is a bit of a known issue with the Raspberry Pi, the normally proposed solution of setting the boot option dwc_otg.speed=1 caused massive packet loss on the USB bus. So no USB audio for me.

I didn't need to reflash the boards to increase the serial timeout as described above, although the boards I did do this to are working. As it turned out, the single most significant improvement, other than the fancy hubs, came from using the Oracle JVM instead of the OpenJDK JVM. On the Pi, the latter is so slow it's next to useless: after switching, the average CPU load on the Pi dropped from over 1 down to about 0.15. No more latency issues.