Diagnostics

I’m in Tampa for a few days of scuba diving and touring the area. I was able to dive the Crystal River, north of Tampa, for two days. It’s a really cool place to learn to dive. There’s no current, it’s not very deep and the water’s clear. On top of all that, you will sometimes run into a manatee or a dolphin. And if you show up in November, the river is loaded with hundreds of manatees. But that attracts a lot of tourists, so there’s really no peace and quiet in the water.

In the midst of that I worked in a few customer teleconferences. Diagnostics was a very hot topic on yesterday’s call. Specifically, it was about our CAN protocols: DeviceNet, J1939 and CANopen. The architecture of all these protocols is the same. They are simply formatted application layer protocols that ride inside CAN messages.

Think about TCP. EtherNet/IP Explicit messages ride inside a TCP packet. Profinet IO acyclic messages ride inside UDP packets in much the same way. Those protocols are just specific ways to format the bytes inside a TCP or UDP message.

It’s almost exactly the same with CAN. DeviceNet, J1939 and CANopen are ways of formatting the data that rides inside a CAN message. The CAN controller chips, which are now usually part of the silicon of the microprocessor, move CAN messages from place to place just like TCP messages are moved around an Ethernet network. The controllers (Ethernet or CAN) know nothing at all about what is in those messages. They just know to move them from point A to point B.
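To make the layering concrete, here is a minimal Python sketch of how the same raw CAN identifier field gets interpreted differently by two of these protocols. The function names are made up for illustration. The bit layouts are the standard J1939 29-bit identifier and the CANopen predefined connection set, as I understand them.

```python
def decode_j1939_id(can_id):
    """Split a 29-bit CAN identifier the way J1939 formats it."""
    priority = (can_id >> 26) & 0x7
    edp = (can_id >> 25) & 0x1           # extended data page
    dp = (can_id >> 24) & 0x1            # data page
    pdu_format = (can_id >> 16) & 0xFF
    pdu_specific = (can_id >> 8) & 0xFF
    source = can_id & 0xFF
    if pdu_format < 240:
        # PDU1: the PDU-specific byte is a destination address, not part of the PGN
        pgn = (edp << 17) | (dp << 16) | (pdu_format << 8)
        destination = pdu_specific
    else:
        # PDU2: broadcast, the PDU-specific byte extends the PGN
        pgn = (edp << 17) | (dp << 16) | (pdu_format << 8) | pdu_specific
        destination = None
    return {"priority": priority, "pgn": pgn,
            "source": source, "destination": destination}


def decode_canopen_id(can_id):
    """Split an 11-bit CAN identifier the way CANopen's predefined connection set does."""
    return {"function_code": (can_id >> 7) & 0xF, "node_id": can_id & 0x7F}


if __name__ == "__main__":
    print(decode_j1939_id(0x0CF00400))
    print(decode_canopen_id(0x181))
```

The CAN controller never runs anything like this. It only moves the frame. Code like this lives in the protocol stack above the controller.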

And just like Ethernet, CAN messages have a CRC to verify the validity of a message. A receiver knows that the message is valid because the controller throws away any message that fails the CRC check, so the message doesn’t get received by the target node unless it is valid.
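For the curious, here is a rough Python sketch of the kind of check the CAN controller does in hardware. It assumes the classic CAN 15-bit CRC with generator polynomial 0x4599, and it works on a plain list of bits. A real controller runs this over the unstuffed frame bits from start of frame through the data field, which this sketch does not model. The function names are mine.

```python
CAN_CRC15_POLY = 0x4599  # x^15 + x^14 + x^10 + x^8 + x^7 + x^4 + x^3 + 1


def can_crc15(bits):
    """Bit-serial CRC-15 over a sequence of 0/1 values, register starts at zero."""
    crc = 0
    for bit in bits:
        crc_next = bit ^ ((crc >> 14) & 1)
        crc = (crc << 1) & 0x7FFF
        if crc_next:
            crc ^= CAN_CRC15_POLY
    return crc


def crc_to_bits(crc):
    """Return the 15 CRC bits, most significant bit first."""
    return [(crc >> i) & 1 for i in range(14, -1, -1)]


def frame_is_valid(bits_with_crc):
    """A message followed by its own CRC always leaves a zero remainder."""
    return can_crc15(bits_with_crc) == 0


if __name__ == "__main__":
    message = [0, 1, 1, 0, 1, 0, 0, 1, 1, 1, 0, 0]
    frame = message + crc_to_bits(can_crc15(message))
    print(frame_is_valid(frame))      # intact frame
    frame[3] ^= 1                     # flip one bit "on the wire"
    print(frame_is_valid(frame))      # corrupted frame
```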

In CAN it’s even a little stronger than that. CAN messages use the ACK bit field to identify valid messages. The transmitter sends that bit as a one, the recessive level, and any node that has validated the message overwrites it with a zero, the dominant level. Any node that doesn’t think the message is valid instead sends an error flag, which is a run of dominant bits. No other node that thinks the message is valid can override those dominant bits from the node that disqualified it, so the message is thrown away and sent again. In a CAN network every message has to be accepted by every node on the network to be valid.
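Here is a toy Python model of that behavior, assuming standard CAN signaling where a zero is dominant, a one is recessive, and a dominant bit always wins on the wire. It ignores timing and error-passive nodes. All of the names are invented for this sketch.

```python
DOMINANT, RECESSIVE = 0, 1


def bus_level(driven_bits):
    """Wired-AND bus: if any node drives dominant, the bus reads dominant."""
    return DOMINANT if DOMINANT in driven_bits else RECESSIVE


def ack_slot(receiver_crc_ok):
    """The transmitter drives recessive; each receiver whose CRC check passed drives dominant."""
    driven = [RECESSIVE] + [DOMINANT if ok else RECESSIVE for ok in receiver_crc_ok]
    return bus_level(driven)


def frame_survives(receiver_crc_ok):
    """The frame stands only if it was acknowledged and no receiver raised an error flag."""
    acknowledged = ack_slot(receiver_crc_ok) == DOMINANT
    error_flag_sent = not all(receiver_crc_ok)
    return acknowledged and not error_flag_sent


if __name__ == "__main__":
    print(frame_survives([True, True, True]))   # everyone accepted it
    print(frame_survives([True, False, True]))  # one node rejected it
    print(frame_survives([]))                   # nobody there to acknowledge
```

Notice that one unhappy receiver is enough to kill the frame for everybody, even though the ACK slot itself still read dominant.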

I described this in our discussion yesterday and my customer still didn’t think that was enough validation. I described how our entire line of gateway products (Modbus, Ethernet, Profinet IO gateways) all have diagnostics screens, and how that diagnostic data can be accessed from any network. It can tell a target node which nodes are online and which have failed, which nodes are online but experiencing problems and so on. The RTA gateway diagnostics are pretty powerful.

They decided in the end to do a full loop check in addition to all the diagnostic data available to both sides in an RTA gateway. With this full loop check, a device that writes data through our gateway is going to be able to read that value back to check that it was written. This is largely not part of our gateway but has to be programmed at the application level in our customers’ devices.

To implement full loop diagnostics like this, one side will write a value. Our gateway will send that value on the other network. The receiving device will accept that value, validate it as a valid application layer value and then send it back to the original sender through the RTA gateway.
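The steps above can be sketched in a few lines of Python. This is only a simulation to show the idea. `TargetDevice`, `Gateway` and `full_loop_write` are hypothetical stand-ins and not part of any RTA product or API, and a real application would do this with whatever read and write services its own protocol provides.

```python
class TargetDevice:
    """Hypothetical device on the far network that validates what it is given."""

    def __init__(self, low, high):
        self.low, self.high = low, high
        self.value = None

    def write(self, value):
        # Application layer validation: out-of-range values are not stored
        if self.low <= value <= self.high:
            self.value = value

    def read(self):
        return self.value


class Gateway:
    """Stand-in for a protocol gateway that forwards requests to the target."""

    def __init__(self, target, drop_writes=False):
        self.target = target
        self.drop_writes = drop_writes  # simulate a write lost on the far network

    def write(self, value):
        if not self.drop_writes:
            self.target.write(value)

    def read(self):
        return self.target.read()


def full_loop_write(gateway, value, retries=2):
    """Write a value, read it back, and report whether the loop closed."""
    for _ in range(retries + 1):
        gateway.write(value)
        if gateway.read() == value:
            return True
    return False


if __name__ == "__main__":
    healthy = Gateway(TargetDevice(low=0, high=100))
    print(full_loop_write(healthy, 42))
    broken = Gateway(TargetDevice(low=0, high=100), drop_writes=True)
    print(full_loop_write(broken, 42))
```

The sender only trusts the write once the same value comes back around. If it doesn’t, the application decides what to do next: retry, alarm or shut down.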

In mission-critical systems like this one, a full loop check truly is the best way to make sure that a message is received by the target. It’s just like diving. It’s mission critical that you don’t run out of air and that you make a decompression stop if you dive below a certain limit. That’s how you stay alive. And that’s how you make sure that your mission-critical applications stay online.