I’ve got a system that I want to connect to the outside world through an Ubiswitch. The system is an Nvidia Jetson Orin based system running Ubuntu L4T version 35.3.1. We were seeing slower network speeds than expected, judging by the performance of other systems on the same bench. After a lot of troubleshooting, we figured out that the common factor was the Ubiswitch that some of our systems connect through.
Here’s the overall picture. We have our Orin board and the Ubiswitch in a customer box with a CAT6 Ethernet cable coming out of the box. The customer would use this cable to connect to their network. That cable is connected to a BotBlox BB-UBS-B-1 Ubiswitch with the BB-UD1-C-1 baseboard. Right now the only other device connected to the Ubiswitch within the customer box is the Orin, also with a CAT6 Ethernet cable. Both cables use SFP RJ45 modules in 2 of the 3 10 gig ports of the Ubiswitch.
Our slower than expected internet speeds go away if we unplug the external-facing Ethernet cable from the Ubiswitch and plug it directly into the Orin. Similarly, if we take both existing CAT6 cables and join them with a CAT6 coupler instead of the Ubiswitch, we see faster speeds.
We don’t always see slower speeds though. We have a 10G Ethernet adapter for laptops, and if we connect it to the external Ethernet cable we get close to 10G speeds both with and without the Ubiswitch, pretty much equal performance. So no issues there. But if we connect the external cable to our office network (this particular bench switch advertises a 100 Mbps link), we see slower speeds with the Ubiswitch in the loop than without. We test on the office network by running a public iperf test (iperf3 -c iperf.he.net -p 5201). Without the Ubiswitch in the loop, we hit the iperf server at what I suspect is the maximum we can reach (about 500 Mbps), which is much faster than the link speed our office switch advertises. With the Ubiswitch in the loop, we get maybe 80-100 Mbps.
A similar thing happens when we run iperf tests using certain 1 gig adapters on laptops, with one device acting as the server (it doesn’t seem to matter whether the Orin or the laptop is the iperf server). For some combos, the link speed is advertised as 1 gig but we see about 600-700 Mbps through the Ubiswitch. Without the Ubiswitch, we see 900 Mbps.
Our Ubiswitch is running the latest firmware, and we’ve verified that by connecting through serial. The SFP 10 gig ports appear not to support turning off EEE mode or turning off auto negotiation. From the Orin’s end, we’ve tried using ethtool to turn off auto negotiation and request specific link speeds, but nothing seems to fix the slower speeds.
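To make the with/without-Ubiswitch comparisons above repeatable, the iperf3 results can be captured as JSON (the -J flag) and reduced to a single number. The following is a minimal sketch, not a definitive tool: the helper name iperf3_mbps is made up for illustration, and the sample report is invented and trimmed to the one field the helper reads from a TCP test.

```python
import json


def iperf3_mbps(json_text: str) -> float:
    """Return receiver-side throughput in Mbit/s from `iperf3 -J` TCP output."""
    report = json.loads(json_text)
    # For TCP tests, iperf3 reports the receiver totals under end.sum_received.
    return report["end"]["sum_received"]["bits_per_second"] / 1e6


# Invented sample, trimmed to the field used above (not a real measurement).
sample = '{"end": {"sum_received": {"bits_per_second": 912000000.0}}}'
print(f"{iperf3_mbps(sample):.0f} Mbit/s")
```

A run could be saved with something like iperf3 -c iperf.he.net -p 5201 -J > with_switch.json and fed to the helper, once with the Ubiswitch in the path and once without.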
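When forcing speeds with ethtool, it helps to confirm what the Orin's link actually ended up at after each change, since a forced port facing an auto-negotiating port commonly leaves the auto-negotiating side at half duplex. Below is a small sketch that pulls the speed, duplex and auto-negotiation fields out of ethtool's text output. The function name parse_ethtool, the interface name eth0 and the sample text are assumptions for illustration, not output captured from this setup.

```python
import re


def parse_ethtool(text: str) -> dict:
    """Extract Speed, Duplex and Auto-negotiation from `ethtool <iface>` output."""
    fields = {}
    for key in ("Speed", "Duplex", "Auto-negotiation"):
        match = re.search(rf"^\s*{re.escape(key)}:\s*(.+?)\s*$", text, re.MULTILINE)
        fields[key] = match.group(1) if match else None
    return fields


# Invented sample in the usual `ethtool eth0` layout (eth0 is an assumed name).
sample = """Settings for eth0:
\tSpeed: 100Mb/s
\tDuplex: Half
\tAuto-negotiation: off
"""
print(parse_ethtool(sample))
```

On the Orin itself, the text would come from running ethtool against the real interface, for example via subprocess.run(["ethtool", "eth0"], capture_output=True, text=True).stdout.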
Anyone have some troubleshooting advice or ideas? Has anyone seen this before? This is my first time posting here as well, so apologies if this is not the right topic or format for troubleshooting.
This is almost certainly an autonegotiation issue between the PHY side of the 10G SFP, and the MAC side of the 10G SFP.
Let me explain. When you plug an SFP module into an SFP port, there are essentially two connections being made.
The first is the MAC side. That’s the side on UbiSwitch (inside the SFP cage on the board).
The second is the PHY side. That’s the side of the SFP that faces the outside world. In your case that’s an RJ-45 that supports 10G/1G/100/10BASE-T.
Both the MAC side and the PHY side run with a certain predefined protocol. Sometimes they can run at different underlying speeds.
For example, to support 10Gbps, the MAC side has to run at 10GBASE-KR if a 10GBASE-T SFP is plugged in. When the PHY side of the SFP is set to 10GBASE-T, then the underlying speeds match and there is no issue.
However, when the PHY side of the 10G SFP is running at 1000BASE-T or slower, there’s now a mismatch between the 10GBASE-KR MAC backplane and the 1000BASE-T/100BASE-TX PHY front end.
Now, if that were the whole story, there wouldn’t be any link at all. However, there is a rate-matching mechanism that allows data to be interspersed to bridge the two speeds. The problem is that it doesn’t always work very well. We don’t know exactly why; this is all defined within the chips.
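As a rough intuition only, here is a toy model of why rate matching can be fragile. It assumes (this is not confirmed anywhere in this thread) that the SFP buffers frames arriving at the MAC-side rate and drains them at the slower copper rate, so a sustained burst fills the buffer unless something slows the sender down. The 64 KiB buffer size and the function name are hypothetical; the real internals of the module are not known here.

```python
def seconds_until_overflow(buffer_bytes: int, in_bps: float, out_bps: float) -> float:
    """Time for a sustained burst to fill a FIFO that drains slower than it fills."""
    if in_bps <= out_bps:
        return float("inf")  # buffer never fills when the drain keeps up
    return buffer_bytes * 8 / (in_bps - out_bps)


# Hypothetical 64 KiB buffer, 10 Gbit/s MAC side, 1 Gbit/s copper side.
t = seconds_until_overflow(64 * 1024, 10e9, 1e9)
print(f"{t * 1e6:.1f} microseconds until the buffer is full")
```

Under these assumptions the buffer fills in tens of microseconds, so how well the sender is paced or paused matters a great deal, which would be consistent with results that vary between devices.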
Anyway, what that means is there can be issues when you use a 10GBASE-T SFP but run it at slower speeds. This issue isn’t unique to UbiSwitch, we’ve seen it with other off-the-shelf switches too.
A couple of things to try.
Try setting the SFP ports to USXGMII mode:
port <int> mac mode usxgmii
where <int> is either 0, 9 or 10.
If that doesn’t fix it, you may need to try some different SFPs.
Thanks for this info and suggestion Josh, very informative and would explain what we’re seeing. We just gave your suggestion a try using 10G copper SFPs from BotBlox (PN SFP-10G-T) and it seems like the link goes down when we change the mac mode to usxgmii. The link goes down as soon as we set it on either of the SFP ports we’re using and stays down when it’s set on both. We tried running “port save” and then power cycled the Ubiswitch to see if our Orin would figure things out with the Ubiswitch saved in that mode on bootup. But still couldn’t get a connection through the Ubiswitch in that configuration. We also tried a couple non-BotBlox SFPs we have and observed the same behavior.
Let me know if you have any other suggestions or if we could provide any information/outputs that would help!
Also of note, we attempted to use one of the 1Gb ports on the Ubi to hook into our network via a 1Gb network switch. With autoneg on or autoneg off, 100, full, the setup would iperf at around 60Mbps. Attempting to set this 1Gb port to autoneg off, 1000, full would kill the link. The 1Gb port would iperf at ~920Mbps when connected to a laptop with 10Gb port for both autoneg on and autoneg off, 1000, full.
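For context on those numbers, a TCP iperf test can never quite reach the raw link rate because of per-frame overhead. The sketch below estimates the ceiling, assuming a standard 1500-byte MTU, IPv4, and 12 bytes of TCP options (timestamps); the function name and those defaults are assumptions, not measurements from this setup.

```python
def tcp_goodput_mbps(link_mbps: float, mtu: int = 1500, tcp_options: int = 12) -> float:
    """Approximate best-case TCP payload rate on an Ethernet link."""
    payload = mtu - 20 - 20 - tcp_options  # minus IPv4 header, TCP header, TCP options
    on_wire = mtu + 14 + 4 + 8 + 12        # plus Ethernet header, FCS, preamble/SFD, inter-frame gap
    return link_mbps * payload / on_wire


print(f"{tcp_goodput_mbps(100):.1f}")   # 94.1
print(f"{tcp_goodput_mbps(1000):.1f}")  # 941.5
```

By that estimate, ~920Mbps is close to the ceiling for a 1Gb link, while ~60Mbps is well short of what a healthy 100Mb link should carry.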
Viewing the configuration, the SFP ports all show MAC EEE: off and PHY EEE: off, but changing EEE off/on is not supported for these ports, from what I can tell.
USXGMII on the SFP ports meant the links went down.
You then tried one of the 1G copper ports on UbiSwitch, connected to the bench switch in your external network (which is only 100Mbps), and you say you saw only 60Mbps via iperf.
However, when you connect that same bench switch directly to your Orin (no UbiSwitch in the middle), you see closer to 100Mbps on an iperf test?
You’ve also confirmed that EEE is turned off on the copper PHY ports using the port <int> eee off command?
Can you tell me what revision of UbiSwitch this is (what’s printed on the board)?
We may have to dig into Wireshark captures to solve this one, but let’s get that info first and we can take it from there.