The Boat Computer: a tragedy in six parts
The boat needs a computer. Easy. I do computers, and have done since I were a nipper. I've built over a dozen and have been using Linux for 20 years. Building boats, that's new to me, but computers? Yawn, whatever. Anyway, my needs are actually pretty simple:
Rant follows. If you can't be bothered, I'll sum up. I spent close to £1000, about £900 of which was destroyed, thrown away or otherwise useless. Oh, and it took close to two years. Hooray for computers.
Round 1: Beaglebone Black
My first system of choice was the Beaglebone Black. This offered (at the time) a much faster CPU than the Raspberry Pi, and had lots of nice GPIO pins I could tap into. I even designed a nice custom cape for it which had a GPS, six UARTs, magnetometer, barometer and a bunch of other bits. This all worked well.
What didn't work was USB, at least not reliably. At the time I had a choice of two kernels to boot from: an outdated one with working USB but no GPIO, and a newer one with GPIO but unreliable USB. And that was how it stayed for some months. It turned out there was only one guy maintaining the kernel for the BeagleBone Black, and he was a volunteer, not an employee.
That was it for me. My lovingly crafted cape went in the rubbish.
Round 2: Hummingboard
Another Raspberry-Pi alike but with more power, the Hummingboard Pro i2eX had promise but failed to deliver in two crucial areas.
First, it ran hot, or rather the mSATA storage did. To quantify this, when I removed it from its plastic case I found the case scorched and almost melted through. The computer was rated to only 40°C, so as the temperature of the mSATA storage climbed through 100°C things went pear shaped pretty quickly.
Also, the CPU package had a tendency to fall off when the board was bumped. Into the rubbish went the Hummingboard.
Round 3: Fit PC3
Next was a Fit-PC3. My original choice, I'd decided it was too power-hungry and had relegated it to run as a development testbed in my basement for a couple of years. After my previous two disappointments with more hobbyist-type devices, I pressed it into service but sadly I never fully populated it with USB devices. I didn't have the chance, as while I was wiring it into the UPS I accidentally reversed the polarity of the power connector. It no longer worked.
Although it did get very, very hot.
Into the rubbish it went.
Round 4: Fit PC4
This is starting to get silly.
I splashed out on a Fit-PC4, the replacement for the Fit-PC3 I'd just cooked. Another nice design, it was wired into the boat: four USB hubs, 20+ USB devices, many of which were my own custom boards.
And it worked too, for a time.
The first signs it wouldn't last were log messages showing one of the four CPU cores had hung - soon after the machine would lock and I would have to power-cycle it. The log messages stopped, but the lockups continued, and my uptime slowly dropped from weeks to days. There was no pattern to it and trying to debug it was impossible - there were no logs, and with no screen plugged in I had no idea what was going on.
Eventually I managed to track this down with some certainty to the USB system, and not just my custom devices. It was under warranty so after complaining long and loud, I sent it back to the seller. Of course, they were unable to reproduce the problem and returned it after several weeks.
The computer has to be reliable and must be quickly replaceable, so with only weeks to launch date this wasn't going to work out. I had to find another solution.
The not-so-universal serial bus
You may wonder why I didn't just go for a Raspberry Pi.
When I started this process the Pi was still generation one, which was very, very slow. But the Pi 3 was released around the same time the Fit-PC4 went back to the shop, so I started testing. And there was a problem.
Over the last three years I'd designed and built a very large number of custom circuit boards for the boat, handling everything from engine ignition to the fridge. All of these are based around the same microcontroller, the Atmel Atmega32U4, which I chose because it had native support for USB serial via the "CDC" class of USB device. This is the chip used in the Arduino Leonardo, Arduino Micro, Teensy 2.0 and a number of other similar boards. CDC is defined in USB 1.1 and can transmit at up to 12Mbps.
Now, if you plug a USB 1.1 device into a USB 2.0 hub, the hub has to translate the traffic between the two speeds. Most hubs have a single translator per hub (known as "Single TT") rather than a translator per port ("Multi TT"). Apparently this isn't an issue for most computers, but - uniquely - the Raspberry Pi internal hub is also Single TT: a quirk that comes from the mobile heritage of the BCM283x chips that power them.
All this means if you plug multiple USB 1.1 devices into the Raspberry Pi without connecting them through a Multi-TT USB hub, bad things happen. Like what, you may ask? Well, I checked: half the devices fail to enumerate and the ones that do lose much of their data. It's quite unworkable, and it means if I did switch to a Raspberry Pi I would have to replace all the USB hubs on the boat with Multi-TT hubs.
But the only hubs I could find had four ports, and with so many devices, I would exceed the maximum length of devices in the USB "chain" - you can only have so many hubs between the computer and a device. So the Raspberry Pi was out.
Round 5: Odroid C2
Equipped with this knowledge, I started looking for a small single-board computer that didn't have this limitation. The Odroid C2 had just been released and looked like it might do the job.
I bought one, fought it for a day to get it set up and took it down to the boat.
Not even close. When plugged into the four USB hubs and 20+ devices I had, I was unable to communicate at all with the computer, due to what appeared to be the complete failure of the USB system. No keyboard, no USB wifi card, no way in.
On the plus side, I hadn't set this one on fire, so we're making progress. Put aside to sell on eBay.
The search for USB hubs
Enough is enough. I started looking for Multi-TT USB hubs with more than four ports, and eventually I found one. It looked surprisingly like an existing hub I had which I knew was Single-TT, but case designs are pretty generic. These hubs were only available in the US and the company wouldn't ship to the UK, so I used a shipping forwarder. £90 of shipping and duty later, I had four hubs. I plugged them in and they were Single-TT, not Multi-TT as advertised.
At this point I have left annoyed well behind. I have reached thoroughly f***ked off. Hubs returned to the US, at my expense, for a refund.
I trawl every forum I can find, and finally assemble a shortlist of exactly three Multi-TT hubs with more than four ports:
Round 6: Raspberry Pi 3, and a new set of hubs
Which brings us up to date, and finally things are looking good. I have a Pi 3 connected to a UUGear hub, which I've managed to wedge into a PiBow case, with the addition of some M2.5 standoffs and a few tactical breakages of the acrylic case. I threw away their stupid plastic screws and used proper metal ones. Why do Pimoroni ship stupid plastic screws? Buy some M2.5 x 30mm socket caps people, they're not expensive.
The new hubs take a 5V supply from a Micro-USB socket, not 12V like my existing hubs. I found some neatly potted buck converters with Micro-USB plugs on eBay - they arrived very quickly, thank you Hong Kong post.
I threw away my crappy old generic Micro-SD cards and bought a few Samsung Evo Plus cards. This was a blindingly good decision, and has probably doubled the apparent speed of the Pi for only a few quid more.
Installing the new system
Yesterday I ripped out all the old hubs, replacing them all with 3 x UUGear 7 port hubs, and appropriate power connectors. And you know what? It works. To give you an idea of the pain I am causing the USB subsystem, here is the output of lsusb -t:
/:  Bus 01.Port 1: Dev 1, Class=root_hub, Driver=dwc_otg/1p, 480M
    |__ Port 1: Dev 2, If 0, Class=Hub, Driver=hub/5p, 480M
        |__ Port 1: Dev 3, If 0, Class=Vendor Specific Class, Driver=smsc95xx, 480M
        |__ Port 2: Dev 4, If 0, Class=Hub, Driver=hub/7p, 480M
            |__ Port 2: Dev 7, If 0, Class=Communications, Driver=cdc_acm, 12M
            |__ Port 2: Dev 7, If 1, Class=CDC Data, Driver=cdc_acm, 12M
            |__ Port 2: Dev 7, If 2, Class=Human Interface Device, Driver=usbhid, 12M
            |__ Port 3: Dev 11, If 0, Class=Communications, Driver=cdc_acm, 12M
            |__ Port 3: Dev 11, If 1, Class=CDC Data, Driver=cdc_acm, 12M
            |__ Port 4: Dev 15, If 0, Class=Hub, Driver=hub/4p, 480M
                |__ Port 4: Dev 19, If 0, Class=Audio, Driver=snd-usb-audio, 12M
                |__ Port 4: Dev 19, If 1, Class=Audio, Driver=snd-usb-audio, 12M
                |__ Port 4: Dev 19, If 2, Class=Audio, Driver=snd-usb-audio, 12M
                |__ Port 4: Dev 19, If 3, Class=Human Interface Device, Driver=usbhid, 12M
        |__ Port 3: Dev 5, If 0, Class=Hub, Driver=hub/7p, 480M
            |__ Port 1: Dev 8, If 0, Class=Communications, Driver=cdc_acm, 12M
            |__ Port 1: Dev 8, If 1, Class=CDC Data, Driver=cdc_acm, 12M
            |__ Port 2: Dev 12, If 0, Class=Communications, Driver=cdc_acm, 12M
            |__ Port 2: Dev 12, If 1, Class=CDC Data, Driver=cdc_acm, 12M
            |__ Port 3: Dev 16, If 0, Class=Communications, Driver=cdc_acm, 12M
            |__ Port 3: Dev 16, If 1, Class=CDC Data, Driver=cdc_acm, 12M
            |__ Port 4: Dev 20, If 0, Class=Communications, Driver=cdc_acm, 12M
            |__ Port 4: Dev 20, If 1, Class=CDC Data, Driver=cdc_acm, 12M
            |__ Port 6: Dev 22, If 0, Class=Communications, Driver=cdc_acm, 12M
            |__ Port 6: Dev 22, If 1, Class=CDC Data, Driver=cdc_acm, 12M
        |__ Port 4: Dev 6, If 0, Class=Hub, Driver=hub/7p, 480M
            |__ Port 1: Dev 10, If 0, Class=Vendor Specific Class, Driver=ftdi_sio, 12M
            |__ Port 5: Dev 14, If 0, Class=Communications, Driver=cdc_acm, 12M
            |__ Port 5: Dev 14, If 1, Class=CDC Data, Driver=cdc_acm, 12M
            |__ Port 6: Dev 18, If 0, Class=Vendor Specific Class, Driver=ftdi_sio, 480M
        |__ Port 5: Dev 9, If 0, Class=Hub, Driver=hub/7p, 480M
            |__ Port 3: Dev 13, If 0, Class=Communications, Driver=cdc_acm, 12M
            |__ Port 3: Dev 13, If 1, Class=CDC Data, Driver=cdc_acm, 12M
            |__ Port 4: Dev 17, If 0, Class=Vendor Specific Class, Driver=ath9k_htc, 480M
            |__ Port 5: Dev 21, If 0, Class=Communications, Driver=cdc_acm, 12M
            |__ Port 5: Dev 21, If 1, Class=CDC Data, Driver=cdc_acm, 12M
            |__ Port 6: Dev 23, If 0, Class=Vendor Specific Class, Driver=ftdi_sio, 480M
            |__ Port 7: Dev 24, If 0, Class=Human Interface Device, Driver=usbhid, 12M
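For anyone wanting to sanity-check a tree like this without counting by eye, here is a rough Python sketch that tallies devices per driver, counts the full-speed (12M) devices that depend on a hub's transaction translator, and reports the deepest hub chain. The function name summarise and the short SAMPLE excerpt are made up for illustration, and the parser assumes the usual four-space indent per tier; the layout of lsusb -t varies between versions, so treat the regular expression as a starting point rather than anything definitive.

```python
import re
from collections import Counter

# Short excerpt of an `lsusb -t` tree in its usual indented layout (illustrative).
SAMPLE = """\
/:  Bus 01.Port 1: Dev 1, Class=root_hub, Driver=dwc_otg/1p, 480M
    |__ Port 1: Dev 2, If 0, Class=Hub, Driver=hub/5p, 480M
        |__ Port 1: Dev 3, If 0, Class=Vendor Specific Class, Driver=smsc95xx, 480M
        |__ Port 2: Dev 4, If 0, Class=Hub, Driver=hub/7p, 480M
            |__ Port 2: Dev 7, If 0, Class=Communications, Driver=cdc_acm, 12M
            |__ Port 2: Dev 7, If 1, Class=CDC Data, Driver=cdc_acm, 12M
            |__ Port 4: Dev 15, If 0, Class=Hub, Driver=hub/4p, 480M
                |__ Port 4: Dev 19, If 0, Class=Audio, Driver=snd-usb-audio, 12M
"""

# One interface line: indent, device number, driver and speed are all we need.
LINE = re.compile(
    r"^(?P<indent>\s*)\|__ Port \d+: Dev (?P<dev>\d+), If \d+, "
    r"Class=[^,]+, Driver=(?P<driver>[^,]*), (?P<speed>\S+)"
)

def summarise(text):
    """Summarise an `lsusb -t` tree: drivers, full-speed devices, hub depth."""
    devices = {}  # device number -> (driver, speed, tier)
    for line in text.splitlines():
        m = LINE.match(line)
        if not m:
            continue  # root hub line or anything unrecognised
        tier = len(m["indent"]) // 4          # assumes 4 spaces per tier
        driver = m["driver"].split("/")[0]    # "hub/7p" -> "hub"
        # A device appears once per interface; keep only the first line seen.
        devices.setdefault(int(m["dev"]), (driver, m["speed"], tier))
    return {
        "by_driver": Counter(d for d, _, _ in devices.values()),
        "full_speed": sum(1 for _, s, _ in devices.values() if s == "12M"),
        # A device at tier n sits behind n - 1 hubs (not counting the root hub).
        "max_hubs_upstream": max((t - 1 for _, _, t in devices.values()), default=0),
    }

if __name__ == "__main__":
    print(summarise(SAMPLE))
```

The USB 2.0 specification allows at most five hubs between the root hub and a device, so max_hubs_upstream is the number to watch when chaining hubs. On a Pi the on-board hub already uses up one of those five.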
I currently have 10 CDC serial devices, with at least two more to follow. There are also three other serial devices, a wireless card and an audio device, plus the four hubs (and one 4-port hub hanging off one of those), and a few more things will go in before splash - 17 devices and 5 hubs. Given how I got here I won't properly relax until I've had a couple of weeks uptime, but it's looking good. Total power draw of all of these devices is about 650-700mA, which is half what it was with the Fit-PC4.
If you're the kind of individual that likes looking at pictures of USB hubs and lots of still-to-be-tidied wiring, today is going to be a good day for you. Here are some pictures.



PostScript: 7th June
So close, so close. Most of the time things are ticking over quite nicely, but I am seeing occasional corruption in the streams of data coming from the devices. Specifically, I am seeing chunks of bytes go missing in the middle of the data. It took me a while to figure out what caused this one: the software stack on the Pi was not reading the data fast enough from the serial ports.
Let's say every second my device sends out an update of 800 bytes over USB. The firmware I'm using on my boards (derived from the excellent Teensy 2.0 firmware written by Paul Stoffregen) only has a transmit buffer of 64 bytes, so any further writes when this is full will block, and eventually time out after 15ms. That's a long time at this level of communication, but with a slow device like the Pi trying to read with a large number of devices potentially all transmitting at once, occasionally the buffers wouldn't clear in time and data would be discarded. The Pi would then catch up and normal transmission would resume, minus the skipped bytes in the middle.
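To put rough numbers on that, here is a toy Python model of a single 800-byte update. It is deliberately simplified and is not the firmware's actual logic: it assumes the host ignores the device for stall_ms and then drains instantly, and that each write which finds the 64-byte buffer full waits up to timeout_ms before giving up on that packet. The function name simulate_update and the 40ms stall are made up for illustration.

```python
def simulate_update(update_bytes=800, buffer_bytes=64, timeout_ms=15, stall_ms=0):
    """Return (delivered, lost) byte counts for one update under a toy model.

    Assumptions (illustrative only): the host does not read for the first
    stall_ms, then reads instantly; the device has one transmit buffer; a
    write that finds the buffer full waits up to timeout_ms, then discards
    that packet and moves on to the next one.
    """
    now = 0            # elapsed time in ms, as seen by the device
    delivered = lost = 0
    buffer_free = True  # the buffer starts empty
    for offset in range(0, update_bytes, buffer_bytes):
        packet = min(buffer_bytes, update_bytes - offset)
        if now >= stall_ms:
            buffer_free = True          # host is reading again, buffer drains
        if buffer_free:
            delivered += packet         # packet queued, host will pick it up
            buffer_free = False
        elif stall_ms - now <= timeout_ms:
            now = stall_ms              # host resumes before the write times out
            delivered += packet
        else:
            now += timeout_ms           # write timed out, packet discarded
            lost += packet
    return delivered, lost

if __name__ == "__main__":
    print(simulate_update(stall_ms=0))                  # host keeps up
    print(simulate_update(stall_ms=40))                 # 15ms timeout
    print(simulate_update(stall_ms=40, timeout_ms=50))  # longer timeout
```

Under these assumptions a 40ms stall with the 15ms timeout loses two 64-byte packets from the middle of the update, which is the kind of hole described above. A timeout longer than the stall loses nothing, at the cost of the device blocking for longer.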
I'll reflash all the boards to have a timeout of 50ms and have shaved down my Serial read thread for speed, so we'll see what happens.
PostScript: 15th June
Rewriting the Serial read thread has done the trick - no exceptions for a week. I've tested the firmware change and it causes no issues so will run with both, but at least I've confirmed that was definitely the issue. The computer is now officially stable!
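The rewritten thread isn't shown here, and the real one runs on the JVM, but the general shape of a fast reader is easy to sketch. The Python below is an illustration only, with made-up names (start_reader, chunks): one thread per port does nothing except pull raw bytes off the file descriptor in large reads and hand them to a queue, leaving parsing to something else so the device's small transmit buffer is emptied as quickly as possible.

```python
import os
import queue
import threading

def start_reader(fd, chunks, chunk_size=4096):
    """Drain fd in a background thread, putting raw byte chunks on a queue.

    The loop does no parsing at all. A None on the queue marks end-of-stream.
    """
    def run():
        while True:
            data = os.read(fd, chunk_size)  # blocks until some bytes arrive
            if not data:                    # empty read means EOF
                chunks.put(None)
                return
            chunks.put(data)

    thread = threading.Thread(target=run, daemon=True)
    thread.start()
    return thread

if __name__ == "__main__":
    # Stand-in for a serial port: a pipe carrying one 800-byte update.
    read_fd, write_fd = os.pipe()
    chunks = queue.Queue()
    reader = start_reader(read_fd, chunks)
    os.write(write_fd, b"x" * 800)
    os.close(write_fd)
    reader.join(timeout=5)
    received = b"".join(iter(chunks.get, None))
    print(len(received))
```

For a real device the file descriptor would come from opening something like /dev/ttyACM0 (the usual name for a cdc_acm port on Linux) rather than a pipe.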
PostScript: 9th September
Stable, but also heavily loaded. Too heavily loaded to reliably play audio via a USB audio card - I get frequent skips and pops. While this is a bit of a known issue with the Raspberry Pi, the normally proposed solution of setting the boot option dwc_otg.speed=1 caused massive packet loss on the USB bus. So no USB audio for me.
I didn't need to reflash the boards to increase the serial timeout as described above, although the boards I did do this to are working. As it turned out the single most significant improvement, other than the fancy hubs, came from using the Oracle JVM instead of the OpenJDK JVM. On the Pi, the latter is so slow it's next to useless: after switching, the average CPU load on the Pi dropped from over 1 down to about 0.15. No more latency issues.