High packet loss in simple system

Hello. I am using the Ubiswitch with the standard baseboard, and only ports 1-8 for my setup. However, I am experiencing high packet loss: PC to camera gets 70-80% loss, and PC to my RPi around 5-10%.
I know cables are the prime suspect, but I find it a bit strange that the loss varies so much between units. Could anything be related to the Ubiswitch itself, its settings, or the settings of any of the attached systems?

Is the camera streaming via UDP? If so, it can happen under some circumstances that camera packets arrive at the switch at exactly the same time as RPi packets, and both compete for the PC port. If the switching buffer cannot accommodate all this traffic until it is sent out of the PC port, packets get lost. TCP can handle that with retransmissions. UDP cannot.

Just a note - what is important here is not the average bandwidth consumed by the camera. What is important is how long the time interval is during which it sends one frame. It usually sends the data at the maximum speed, 1 Gbps, and then pauses until the next frame arrives. So the average bitrate is a mix of short bursts at full speed and periods of no traffic at all. And these full-speed intervals are where the RPi data can clash with the camera data, as both have to go out of a single 1 Gbps port.
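To put rough numbers on the burst argument, here is a minimal sketch. It assumes each frame is sent back to back at line rate, and the 20 Mbps / 25 fps camera is a made-up example, not the camera from this thread. The function names are hypothetical too.

```python
def burst_profile(avg_bitrate_bps, frames_per_second, line_rate_bps=1e9):
    """Return (bits per frame, burst duration in s, fraction of time the link is busy).

    Assumes the sender pushes each frame out back to back at line rate
    and then stays silent until the next frame.
    """
    bits_per_frame = avg_bitrate_bps / frames_per_second
    burst_seconds = bits_per_frame / line_rate_bps
    duty_cycle = burst_seconds * frames_per_second
    return bits_per_frame, burst_seconds, duty_cycle


def worst_case_backlog_bytes(ingress_rates_bps, egress_rate_bps, overlap_seconds):
    """Bytes that pile up in the switch while several inputs feed one output.

    Worst case: all inputs send at the given rates for the whole overlap.
    If the inputs together are slower than the output, nothing queues up.
    """
    excess_bps = max(0.0, sum(ingress_rates_bps) - egress_rate_bps)
    return excess_bps * overlap_seconds / 8


# Hypothetical camera: 20 Mbps average at 25 frames per second.
bits, burst, duty = burst_profile(20e6, 25)
print(f"frame: {bits / 8 / 1000:.0f} kB, burst: {burst * 1e3:.2f} ms, link busy: {duty:.1%}")

# Two 1 Gbps inputs bursting into one 1 Gbps output for the whole burst.
print(f"backlog: {worst_case_backlog_bytes([1e9, 1e9], 1e9, burst) / 1000:.0f} kB")
```

With these assumed numbers a frame is 100 kB, sent in about 0.8 ms, so the link is busy only about 2% of the time. If another port sends at full speed to the same destination during that burst, up to 100 kB would have to be buffered, which is what decides whether packets are dropped, not the 20 Mbps average.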

We have a sensor that sends a constant 125 Mbps stream, but chunked in such an unfortunate manner that, on most switches, sending just 10 Mbps from another port to the same destination is enough for the sensor packets to start being dropped pretty drastically.


Thanks. Further investigation seems to indicate that the problem affects each connected system independently, since removing the camera does not improve the performance of the Pi, and vice versa. We will test whether the switch itself can be the reason; we have more of them, so we can swap one out. Speed is not high on any of the systems: the gimbal is compressing and sending at 2 Mbps at the moment, but it uses only 4-wire Ethernet (10/100 max), so I am not sure how that may affect the other connected systems.

In terms of variation between different UbiSwitch modules, the only way that would manifest is through variation in manufacture.

Say, inconsistent copper quality or inconsistent manufacturing of the transformers. This is unlikely; such issues tend to fail in a pretty noticeable way, and tend to be repeatable.

All ports on each module (and baseboard) are also tested for 1000 Mbps (nominally this is around 947 Mbps) using an iPerf test prior to shipment from the factory. That’s not to say a faulty unit can’t slip through, but it is unlikely to affect more than 1% of units (at least, that’s roughly the failure rate we tend to see after testing).

I would recommend trying a different switch to see if the issue is related to a buffer overflow (though of course, a different switch may have a larger buffer, which may solve the issue). Based on what you’ve described, @peci1’s explanation sounds feasible.

Thanks. Interestingly, after removing the camera and leaving only PC - switch - Pi, I get lower loss rates when pinging at higher rates: “ping -i 1 ip” gives 15% loss, while “ping -i 0.01 ip” gives less than 2%. So to me it looks like the loss is more a function of “loss per time” than “loss per data quantity”, for the Pi at least. The round-trip time is excellent (comparable to removing the switch from the setup, but without the switch I also get 0% loss, so it doesn’t look like it is the PC or the Pi themselves). I will test alternative switches tomorrow and see what I can learn.
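For anyone who wants to repeat the interval comparison in a scripted way, here is a rough sketch. It assumes a Linux machine with an iputils-style ping, whose summary line looks like "N packets transmitted, M received, X% packet loss". The function names are made up for this example, and very short intervals may need elevated privileges depending on the ping version.

```python
import re
import subprocess

# Matches the summary line printed by iputils-style ping, including the
# variant with an extra "+N errors," field before the loss percentage.
SUMMARY_RE = re.compile(
    r"(\d+) packets transmitted, (\d+) (?:packets )?received.*?([\d.]+)% packet loss"
)


def parse_ping_summary(output):
    """Return (sent, received, loss_percent) from ping's textual output."""
    match = SUMMARY_RE.search(output)
    if match is None:
        raise ValueError("no ping summary line found")
    return int(match.group(1)), int(match.group(2)), float(match.group(3))


def measure_loss(host, interval_s, count):
    """Run ping with the given interval and packet count, return the parsed summary."""
    result = subprocess.run(
        ["ping", "-c", str(count), "-i", str(interval_s), host],
        capture_output=True,
        text=True,
    )
    return parse_ping_summary(result.stdout)
```

Calling measure_loss(host, 1.0, 100) and measure_loss(host, 0.01, 100) one after the other with the Pi address as host gives two loss percentages to compare. To test the “loss per time” idea, it may be better to keep the total duration equal rather than the packet count, for example 100 packets at 1 s against 10000 packets at 0.01 s.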

I can jump in here and say that we “only” use the cables that came with the switch. We have also tried replacing them. No change.

Do you see similar issues?

Sorry, I should have introduced myself: I’m working on the same project, using the same equipment, but I sit at another location (where the HW is).
I can also mention that I just measured the supply voltage with an oscilloscope, and it’s very nice and stable at 11.9 V.

When the camera is connected directly to the PC (with another cable), everything works perfectly.

When it then goes through the switch (RPi disconnected), it stops working. I get an image for 1-2 s and then it’s out for 2-4 s. I can see this in Wireshark and its graph: I get a spike for each short burst of video stream (2 Mbit), and then 0 Mbit.
It’s very visible! But I don’t see any of this on the power to the switch, so it’s not a power problem.

We made 3 more units of the same kind last week. They showed the same problem, but we didn’t have time to test them any further. I think they have the exact same problem/error. That’s why I doubt that this is a HW problem. Unfortunately I don’t have these units here, so we can’t measure or compare them.

Here are two pictures of the network, with good and bad behavior (good is without switch and bad is when switch is put in between).


The bad/poor behavior…

So the camera streams over TCP?

UDP (I’m a HW guy, so a little bit lost what goes over what)…
Wireshark says UDP. :slight_smile:

But in the first image you selected, it shows a TCP packet.

There are both TCP and UDP in there. UDP is for the video stream and TCP is for other settings and control (if I have understood it correctly).
There are some MPEG TS as well (if that says anything).

Can you try periodically running netstat -su while the camera is streaming? Does any error counter increase, and does it stop increasing when you disconnect the camera?
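Comparing the counters by eye gets tedious, so here is a small sketch that parses two netstat -su snapshots and reports only the counters that grew. It assumes the usual net-tools layout (a section header such as "Udp:" followed by "value description" or "Name: value" lines), and the function names are invented for this example. Note that these are the counters of the host running netstat, so packets dropped inside a switch would not necessarily show up as errors here.

```python
import re
import subprocess


def parse_netstat_su(text):
    """Parse `netstat -su` output into {(section, counter_name): value}."""
    counters = {}
    section = ""
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        if re.fullmatch(r"\w+:", line):  # section header, e.g. "Udp:"
            section = line[:-1]
            continue
        match = re.fullmatch(r"(\d+) (.+)", line)  # e.g. "0 receive buffer errors"
        if match:
            counters[(section, match.group(2))] = int(match.group(1))
            continue
        match = re.fullmatch(r"(\w+): (\d+)", line)  # e.g. "InOctets: 486115516"
        if match:
            counters[(section, match.group(1))] = int(match.group(2))
    return counters


def growing_counters(before, after):
    """Return {key: increase} for every counter that grew between two snapshots."""
    return {
        key: value - before.get(key, 0)
        for key, value in after.items()
        if value > before.get(key, 0)
    }


def snapshot():
    """Take one snapshot by running netstat -su on the local machine."""
    result = subprocess.run(["netstat", "-su"], capture_output=True, text=True)
    return parse_netstat_su(result.stdout)
```

Take one snapshot(), wait a few seconds while the camera streams, take another, and pass both to growing_counters. The entries to watch are the ones with "errors" in the name, such as "packet receive errors" and "receive buffer errors" in the Udp section.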

You can also have a look at my guide on replicating the switch buffer overflow issue for a different sensor (and different switches): Ouster network problem - Google Docs .


This is what netstat -su says:

IcmpMsg:
    InType3: 214558
    InType11: 18
    OutType3: 214888
Udp:
    1302155 packets received
    229811 packets to unknown port received
    0 packet receive errors
    1323282 packets sent
    0 receive buffer errors
    0 send buffer errors
    IgnoredMulti: 794
UdpLite:
IpExt:
    InMcastPkts: 363
    OutMcastPkts: 168
    InBcastPkts: 794
    OutBcastPkts: 4
    InOctets: 486115516
    OutOctets: 177298238
    InMcastOctets: 54034
    OutMcastOctets: 19111
    InBcastOctets: 313332
    OutBcastOctets: 128
    InNoECTPkts: 1762231
MPTcpExt:

Hmm, no errors here…


Maybe this says something. I have now zoomed in on the graph and also looked at the messages at the beginning of the burst. When no data is being sent, I see all these UDP packets.

I guess the next step would be trying the same with some off-the-shelf switch…

Or - as an alternative - using a 4-wire cable to forcefully downgrade the camera link to 100 Mbps (if that’s enough for it). This also helps the buffers.

Hi. I have now set up a test where I have switched back and forth between the Ubiswitch and a 5-port Gigabit switch (Gigablox, I think it is) that I conveniently had here. Note that I am testing with a different Ubiswitch (but from the same order/delivery/batch) from the one referred to earlier.

PC - Ubiswitch - RPi (all services on RPi using network turned off) gives 5-15% packet loss when pinging
PC - Gigablox - RPi gives 0% packet loss
I am using ping and mtr for testing (mtr is convenient when I leave it running for a while)

I also connected an ArkCam to test, upped the bandwidth to a constant 20 Mbps with the highest quality and resolution (unfortunately only 1280 for the one I have), and it has no effect on the PC-switch-RPi results whatsoever.

The only thing I have found so far that brings the loss numbers down when using mtr is running ping -i 0.01 in another terminal window. I do not know what that means.

Does anyone know of any reason why the Gigablox gives better performance in this case? I cannot see how it can be a buffer issue when the only traffic is pings (and the worst performance is when I ping at 1 Hz).