Known-Good Hardware
Or, making the case for dual testbeds.
Breaking new ground
Developing a new product often involves new hardware, electronics, and software. Every engineer loves to feast on new hardware; I think it triggers something deep within. An urge to discover, perhaps.
But the combination of so many new moving parts at the same time makes it hard to pinpoint the problem when things don’t work as expected. And not working as expected should be expected, when everything is new. After all, that’s the job - transforming the unexpected and unknown into the expected and understandable.
Typically, new custom hardware is also available only in limited quantities. Ports and pins might be hard to reach and measure - perhaps there are no test points, or an epoxy conformal coating keeps the small probe from making good contact.
Compounding uncertainty
The problem is exacerbated by different domains. A software engineer probably has limited understanding of the electrical domain, and even less knowledge of the design and intricate details of the shiny new board that lies in front of them.
When the new firmware doesn’t work, or unpredictable results show up, the engineer probably faults the things they can affect. Or the opposite: they only fault what they can’t affect. Neither is good, since the truth is likely a combination and lies somewhere in the middle.
All problem-solving comes with some uncertainty. But more sources of uncertainty at the same time make it not just a little harder, but dramatically harder. Too many at once turn manageable problems into impossible missions. It turns out that uncertainties compound exponentially.
Temporarily eliminating a large source of uncertainty helps greatly in gaining development speed.
A bridge from old to new
So what can we do?
As the title suggests, what I consider good practice is to do early bring-up and development on Known-Good Hardware. This removes many uncertainties. It’s not rocket surgery - basically all good science does this already, by manipulating only one variable at a time and keeping all else fixed.
The main way we do this in the embedded world is to use development boards, evaluation boards, and breakout boards.
Implementing a sensor driver? Use a microcontroller development board plus a sensor breakout board.
Implementing a wireless protocol? Dev board.
Then, when it works on that, we may turn to custom hardware.
A whole new world
Not only do we remove many uncertainties - development boards are also designed for accessible pins and ports. They often have pin headers for easy connection and measurement.
They come with well-tuned RF paths. Power supplies are well specified and tested. They are convenient, with JTAG connectors and serial outputs.
Once the driver has been shown to work on Known-Good Hardware, and the behavior is well understood and measured, the engineer can move on to the new, custom hardware. If the behavior doesn’t match, it’s so much easier to nail down the root cause.
Serial communication with a sensor often fails? Time to bring out the oscilloscope and inspect the signal integrity. Perhaps capacitive loading makes shark fins out of the assumed-to-be square waves. Or a serial bus collision, e.g. the same I2C slave address on two different sensors on the same bus. I’ve seen both these problems on custom hardware.
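That second failure mode can even be caught before the board exists, by checking the planned bus layout for duplicate addresses. A minimal sketch - the sensor names and 7-bit addresses below are hypothetical examples, not from any real design:

```python
# Minimal sketch: detect I2C slave address collisions in a planned bus layout.
# Sensor names and 7-bit addresses are hypothetical examples.
from collections import defaultdict

def find_address_collisions(sensors):
    """Return {address: [sensor names]} for addresses used more than once."""
    by_address = defaultdict(list)
    for name, address in sensors:
        by_address[address].append(name)
    return {addr: names for addr, names in by_address.items() if len(names) > 1}

planned_bus = [
    ("temperature", 0x48),
    ("humidity",    0x40),
    ("pressure",    0x48),  # same 7-bit address as the temperature sensor!
]

collisions = find_address_collisions(planned_bus)
print(collisions)  # flags address 0x48, shared by temperature and pressure
```

Of course, many ICs offer address-select pins or alternative addresses, so a flagged collision is often fixable in layout rather than by swapping parts.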
Severely limited lifetime and batteries draining? Hm. Is the radio utilization too high? The Known-Good Hardware testbed reports this to be in line with expectations, so probably something else. It turned out the custom hardware had an insufficient ground plane under the radio module, throwing the radio tuning off. This in turn increased power consumption by 10 mA, and drained batteries way too fast.
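A back-of-the-envelope calculation shows why an extra 10 mA is devastating for a battery-powered device. The battery capacity and baseline current below are assumed example values, not figures from the actual product; only the 10 mA increase comes from the story above:

```python
# Back-of-the-envelope battery life estimate (ideal, ignoring self-discharge
# and capacity derating). Capacity and baseline current are assumed examples.

def battery_life_hours(capacity_mah: float, avg_current_ma: float) -> float:
    """Ideal battery life: capacity divided by average current draw."""
    return capacity_mah / avg_current_ma

CAPACITY_MAH = 2400.0  # e.g. a pair of AA cells (assumed)
BASELINE_MA = 2.0      # intended average draw (assumed)
EXTRA_MA = 10.0        # added draw from the detuned radio (from the article)

nominal = battery_life_hours(CAPACITY_MAH, BASELINE_MA)                # 1200 h
with_fault = battery_life_hours(CAPACITY_MAH, BASELINE_MA + EXTRA_MA)  #  200 h
print(f"{nominal:.0f} h nominal vs {with_fault:.0f} h with the fault")
```

With these assumed numbers, a sixfold increase in average current cuts the lifetime to a sixth - months become weeks.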
New custom hardware is often only available in limited quantity, and it is expensive. Development boards, on the other hand, are plentiful and cheaper. That means more engineers can work in parallel, reducing time to market. And if a board burns? So be it, bring on the next.
A development board also allows the engineering team to start working early, before the custom hardware is even designed or manufactured.
…tainted
But all is not well. Sometimes no available development board matches what you are designing - your own custom hardware.
In those cases, a close-enough board might be enough. For example, if you are going to use a Texas Instruments CC1352R wireless SoC, you can mostly use the CC1312R Launchpad development board.
This whole article works under the assumption that the development board is Known-Good Hardware. And that is something you basically take for granted - why would a vendor sell something that paints their own product in a poor light?
Well, it sometimes happens that the hardware assumed to be Known-Good is just not so good.
Once I developed a driver for an RFID system. An industrial filter would have a read/write-capable RFID tag attached, and the filter unit would update it with usage hours. Since the custom hardware was not yet available, I used an evaluation module from the RFID IC manufacturer. And it just didn’t work. Everything looked right; every serial exchange with the IC was as expected. But it couldn’t read tags.
Did I have the wrong tags? No, ISO 14443A, per spec. Were the tags broken? New ones from another manufacturer - no dice. Outputting the transmit signal to a debug pin looked good.
After much, much troubleshooting, I found the problem. The power supply on the provided evaluation board had the wrong component mounted. That meant the current output was insufficient, and thus the transmit power was too low, so it couldn’t energize and talk to the RFID tag.
The lesson we learn from this is to keep a healthy dose of scepticism when things don’t work. Sometimes the Known-Good just isn’t Known-Good.
Dual Testbeds
Bonus material.
Development boards serve a further purpose even once the initial development effort is in the past. Any complex product can show surprising behavior at scale, in situ, after running for a long time, or when some edge condition occurs.
That’s one reason for having a testbed even when development is “over”, since development is never really over. Bugs and undefined or unexpected behavior will always crop up. A testbed is useful to push the limits of the product, find flaws, try new things.
I’m a proponent of having several testbeds. There should be an integrated testbed, where the full integrated behavior can be observed, measured, and evaluated. Sufficient and useful metrics should be gathered, reported, and compared against expected limits and ranges. A metric out of range should be automatically flagged and get attention, e.g. if the network as a whole exceeds an upper utilization mark.
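Such automatic flagging can be as simple as comparing each reported metric against an allowed range. A minimal sketch - the metric names, limits, and values are hypothetical examples, not from any real testbed:

```python
# Minimal sketch of automatic metric flagging for a testbed.
# Metric names, ranges, and reported values are hypothetical examples.

def flag_out_of_range(metrics, limits):
    """Return {name: (value, (low, high))} for metrics outside their range."""
    flagged = {}
    for name, value in metrics.items():
        low, high = limits[name]
        if not (low <= value <= high):
            flagged[name] = (value, (low, high))
    return flagged

limits = {
    "network_utilization": (0.0, 0.30),  # upper mark: 30 %
    "reboots_per_day":     (0, 1),
    "rtt_ms":              (0, 250),
}

report = {"network_utilization": 0.42, "reboots_per_day": 0, "rtt_ms": 180}
print(flag_out_of_range(report, limits))  # only network_utilization is flagged
```

In practice you would feed this from whatever telemetry pipeline the testbed already has, and route the flagged metrics to an alerting channel instead of printing them.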
But with any system of complexity, the root cause of a misbehaving system can be hard to tell. That’s where a testbed of development boards can help.
Say a wireless networked product testbed often experiences reboots, excessive network utilization, or some other metric going out of bounds. Since you’d need to take the whole system into consideration, the root cause can be hard to find.
A testbed made of development boards, sending dummy data instead of sensor data, will closely resemble the full system, but is more useful when it comes to understanding the complex system from other perspectives.
For example, a testbed of production units experiences drop-outs fairly regularly - devices stop sending data. Why? The development board testbed does not, even though it sends the same amount of data, at the same frequency, with everything else alike. This gives us clues to better understand our complex product.
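Comparing the two testbeds can be automated in the same spirit: report the metrics where production units and development boards diverge, since those are the ones pointing at the custom hardware. The metric names, values, and the 20 % tolerance below are illustrative assumptions:

```python
# Sketch: find metrics that diverge between two testbeds.
# Metric names, values, and the 20 % tolerance are illustrative assumptions.

def divergent_metrics(production, devboards, tolerance=0.20):
    """Return metric names whose relative difference exceeds the tolerance."""
    divergent = []
    for name in production:
        p, d = production[name], devboards[name]
        scale = max(abs(p), abs(d), 1e-9)  # avoid division by zero
        if abs(p - d) / scale > tolerance:
            divergent.append(name)
    return divergent

production = {"dropouts_per_day": 4.0, "network_utilization": 0.31}
devboards  = {"dropouts_per_day": 0.1, "network_utilization": 0.30}
print(divergent_metrics(production, devboards))  # ['dropouts_per_day']
```

Here the two testbeds agree on network utilization, so the drop-outs are unlikely to be a protocol or firmware issue - suspicion shifts to the custom hardware, exactly the narrowing-down this article argues for.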