Poor Ethernet performance (AR8035)
Hi:
As you may already know, I developed a simple board based mostly on your RED platform. I just kept the bare minimum to make a bootable board, but I left an expansion header with Ethernet signals and others for further experimentation.
Some time ago I also built an Ethernet board to be connected to my base board, also based on what’s present on the RED platform (AR8035 chipset).
I did all necessary device tree modifications and both boards “turn on” and “work”. The bootloader (u-boot) sees something on the Ethernet port, but seems to fail to recognize the Ethernet chipset on address 0 (which looks “normal” as it should be address 4… I may need some tweaking there):
    BeagleBone Black:
    Model: Octavo Systems OSD3358-SM-RED:
    ...
    Net: eth0: MII MODE
    Could not get PHY for cpsw: addr 0
    cpsw, usb_ether

When the kernel starts, it recognizes the chipset and loads the driver for it.
    [ 1.044704] libphy: Fixed MDIO Bus: probed
    [ 1.099537] davinci_mdio 4a101000.mdio: davinci mdio revision 1.6, bus freq 1000000
    [ 1.099553] davinci_mdio 4a101000.mdio: detected phy mask ffffffef
    [ 1.113908] libphy: 4a101000.mdio: probed
    [ 1.113936] davinci_mdio 4a101000.mdio: phy[4]: device 4a101000.mdio:04, driver Atheros 8035 ethernet
    [ 1.114946] cpsw 4a100000.ethernet: Detected MACID = 98:5d:ad:d4:2a:da
    [ 1.115071] cpsw 4a100000.ethernet: initialized cpsw ale version 1.4
    [ 1.115081] cpsw 4a100000.ethernet: ALE Table size 1024
    [ 1.115120] cpsw 4a100000.ethernet: cpts: overflow check period 1250 (jiffies)
    ...
    [ 14.706238] net eth0: initializing cpsw version 1.12 (0)
    [ 14.786768] Atheros 8035 ethernet 4a101000.mdio:04: attached PHY driver [Atheros 8035 ethernet] (mii_bus:phy_addr=4a101000.mdio:04, irq=POLL)
    [ 14.818177] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
    [ 17.857558] cpsw 4a100000.ethernet eth0: Link is Up - 1Gbps/Full - flow control rx/tx
    [ 17.857645] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
    [ 18.062133] 8021q: 802.1Q VLAN Support v1.8
    [ 18.062219] 8021q: adding VLAN 0 to HW filter on device eth0
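The `detected phy mask ffffffef` line in the kernel log can be decoded by hand. In the davinci_mdio driver, the printed mask has a bit set for every MDIO address where no PHY answered, so the cleared bits are the addresses that responded. A minimal Python sketch of that decoding follows; the function name `present_phy_addresses` is made up for illustration.

```python
from typing import List


def present_phy_addresses(phy_mask: int) -> List[int]:
    """Return the MDIO addresses (0-31) whose bit is cleared in phy_mask."""
    alive = ~phy_mask & 0xFFFFFFFF  # invert: set bits now mark responding PHYs
    return [addr for addr in range(32) if alive & (1 << addr)]


print(present_phy_addresses(0xFFFFFFEF))  # [4]
```

This agrees with the `phy[4]` line that follows in the log, and with U-Boot finding nothing at address 0.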
I can even ping Google, download files, etc. The problem is that I’ve been experiencing very poor Ethernet performance. I even get some kernel panics from time to time:
    [ 1337.180817] skbuff: skb_over_panic: text:c0a05ed4 len:1959 put:1959 head:dc3ef900 data:dc3ef942 tail:0xdc3f00e9 end:0xdc3eff40 dev:eth0
    [ 1337.193197] ------------[ cut here ]------------
    [ 1337.197839] kernel BUG at net/core/skbuff.c:104!
    [ 1337.202475] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP ARM
    [ 1337.208335] Modules linked in: 8021q garp mrp stp llc evdev uio_pdrv_genirq uio iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_filter spidev pru_rproc pruss pruss_intc ip_tables x_tables
    [ 1337.229729] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.14.71-ti-r80 #1
    [ 1337.236371] Hardware name: Generic AM33XX (Flattened Device Tree)
    [ 1337.242490] task: c15089c0 task.stack: c1500000
    [ 1337.247052] PC is at skb_panic+0x6c/0x70
    [ 1337.251006] LR is at wake_up_klogd+0x7c/0xa8
    [ 1337.255293] pc : [] lr : [] psr: 600c0113
    [ 1337.261586] sp : c1501c28 ip : c1501af8 fp : c1501c54
    [ 1337.266831] r10: c0119c3c r9 : 000007a7 r8 : 00090000
    [ 1337.272077] r7 : dc3ef900 r6 : dc3ef942 r5 : dc3f00e9 r4 : dc3eff40
    [ 1337.278632] r3 : b674d272 r2 : b674d272 r1 : c1119a00 r0 : 0000007b
    [ 1337.285189] Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none
    [ 1337.292356] Control: 10c5387d Table: 9a2f4019 DAC: 00000051
    [ 1337.298126] Process swapper/0 (pid: 0, stack limit = 0xc1500218)
    [ 1337.304158] Stack: (0xc1501c28 to 0xc1502000)
    ...
    [ 1337.750562] Kernel panic - not syncing: Fatal exception in interrupt
    [ 1337.756957] ---[ end Kernel panic - not syncing: Fatal exception in interrupt

Apart from those kernel panics, the Ethernet connectivity is almost unusable: it is very, very slow. I checked for lost or rejected packets but found nothing. Everything looks normal to me.
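The first line of the trace carries enough numbers to see how far the write went past the receive buffer. A rough Python sketch, using only the values printed in the `skb_over_panic` line (the helper name `skb_overrun` is hypothetical): `end - data` is the room that was available, and `tail - end` is the overrun.

```python
import re

PANIC_LINE = (
    "skbuff: skb_over_panic: text:c0a05ed4 len:1959 put:1959 "
    "head:dc3ef900 data:dc3ef942 tail:0xdc3f00e9 end:0xdc3eff40 dev:eth0"
)


def skb_overrun(line: str) -> dict:
    """Extract the buffer pointers from an skb_over_panic line and compare them."""
    # Collect every "key:value" pair; keys followed by a space are skipped.
    fields = dict(re.findall(r"(\w+):(\S+)", line))
    data = int(fields["data"], 16)
    tail = int(fields["tail"], 16)
    end = int(fields["end"], 16)
    return {
        "requested": int(fields["len"]),  # bytes the driver tried to put
        "room": end - data,               # bytes available from data to end
        "overrun": tail - end,            # bytes written past the end
    }


print(skb_overrun(PANIC_LINE))
# {'requested': 1959, 'room': 1534, 'overrun': 425}
```

In other words, a 1959-byte frame was put into a buffer with 1534 bytes of room. That would be consistent with a corrupted frame or frame length reaching the driver, although the trace alone does not prove it.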
I’ve been tracking down the issue and it seems to happen only on some Ethernet networks. If I share my PC’s Wi-Fi Internet Connection (Windows 10) through the Ethernet cable, it seems to work just fine. But if I connect the Ethernet board directly to my router, it has very poor performance (only a few Kbps) and kernel panics start to appear from time to time.
Do you guys have any idea what could be happening? I’m attaching the board’s schematic. If you need the board design (Gerbers or other file) let me know. The only difference between my design and the RED platform is the Ethernet connector (I used Molex 93626-3508).
Best regards,
Andy
Andy,
U-Boot in the BeagleBoard.org images expects the Ethernet PHY to be at address 0. As long as you don’t need to use Ethernet during U-Boot (i.e. no TFTP or network boot) then you can just ignore that error message. If you do need Ethernet during U-Boot, then you will have to modify U-Boot to use the different PHY address. I believe that you would need to modify the “phy_interface” value in the “cpsw_pdata” structure (approx line 915) in board.c (http://git.denx.de/?p=u-boot.git;a=blob;f=board/ti/am335x/board.c).
Looking at the performance problem, it looks like this might be related to the Ethernet driver and malformed packets. Looking around, I found a couple of links that could help debug the problem:
https://e2e.ti.com/support/processors/f/791/t/530328
My suggestion would be to first use WireShark (https://www.wireshark.org/) to look at the network traffic on the switch to understand what packets are being sent to the OSD335x. Then, you can take a look in the Ethernet driver and see if it might have a similar issue to the Xilinx driver. Given that everything works fine when you are routing things through your computer, it would indicate this is more of a software issue than a hardware issue.
Additionally, what kernel version / image are you using?
Thanks,
Erik
Thank you, Erik, for the valuable information and suggestions. Indeed, it appears to be a software issue. I’ve been performing some tests on other networks and it works very well, with no performance issues or kernel panics. I even ran some speed tests and got the maximum speed for my current ISP service.
I will perform more tests with Wireshark, capturing packets on the network with issues, and try to narrow down which packet (or type of packet) is causing the problem. The network that causes issues is my company network. My guess is that some kind of jumbo packets, VLANs, or other “company-specific” packets may be causing the problem. My home network works like a charm.
I’m using the latest IoT image available from Beagleboard.org:
Linux beaglebone 4.14.71-ti-r80 #1 SMP PREEMPT Fri Oct 5 23:50:11 UTC 2018 armv7l GNU/Linux
Will keep you all posted on my findings.
Regards,
Andy
Hi Erik:
For my tests, I’m using a USB-to-Ethernet adapter that came with my laptop PC. In fact, I have two of these adapters: one that came with my personal laptop, and one that came with my office laptop (both ASUS brand).
And so far, I’ve performed the following tests:
1. Used my personal laptop’s USB-to-Ethernet adapter to share the laptop’s Wi-Fi connection (to my home network): it works very well, with no noticeable issues. I can download things at max speed, upgrade things with apt, perform network speed tests, communicate with external servers, ping, etc.
2. Used my office laptop’s USB-to-Ethernet adapter to share laptop’s Wi-Fi connection (to my office network): it has very poor performance, quite unusable, only a few Kb/s with many errors and timeouts.
3. Board directly connected to a GbE port on my office network switch: same as 2 but with some Kernel panics from time to time.
I noticed that my personal laptop’s USB-to-Ethernet adapter is based on AX88772B chipset (10/100 Mbps), and that my office laptop one is based on a Realtek GbE chipset.
So the next obvious thing to do was to test my AX88772B USB-to-Ethernet adapter on my office network, sharing the Wi-Fi through my office laptop, and guess what: it worked just fine, just like in case 1.
In both cases where a GbE port is used (2 & 3), the network performance is very poor and unusable (with kernel panics in the case of 3). So it seems to have something to do with the Ethernet speed (or adapter chipset) rather than the network itself (I’m guessing here).
Also, because my (small) office router is very basic and doesn’t have traffic capture capabilities, I’ve been unable to find the packet that causes the kernel panic seen in test 3.
To that end, I enabled tcpdump on the target board to see if I could find some more information of the offending packet, but the trace looks pretty normal to me. Almost all packets are 1400 bytes in length (max) which relates to my MTU I guess.
I’m attaching the file in case you want to take a look. Board’s IP address was 192.168.0.175.
After the last packet (1373) in **capture.cap** file I got a kernel panic, but unfortunately, that offending packet did not get captured, at least not entirely as Wireshark complains about an incomplete packet on the capture file.
I used a tool to repair the capture file and a new last packet appeared (1374, incomplete), but its headers look normal to me too. It doesn’t seem like an overly large packet.
I’m very puzzled, to be honest. Do you think GbE has something to do with this? I don’t really think the USB-to-Ethernet adapter’s chipset has anything to do with it, as there are many chipsets in routers/switches out there. It’s hard for me to believe that the board will fail depending on the router/switch chipset.
I have a RED board lying around and haven’t done the test with it yet. I will try to do that ASAP. Maybe there are HW problems with my custom-made Ethernet board and/or its routing. I’m attaching the Eagle BRD file in case you want to take a look.
Thanks for any help.
Andy
Edit: I was unable to upload the files. Please download them from here https://www.dropbox.com/s/rorc9lsafqae4mv/eth.zip?dl=1
Andy,
We took a quick look at the layout and pcap files and didn’t see anything that jumped out at us. From your description, it feels more like a network issue at your office. One thing you could check is to take your office adapter and try it on your home network. There could be some small noise issues on your office network that cause the GbE adapters to not perform as well as the 10/100 adapters. I don’t think it is anything inherent to GbE, but it could be a bad cable, since GbE uses all four twisted pairs in the cable while 10/100 Ethernet uses only two.
Let us know how the testing goes.
Thanks,
Erik
Hi Erik:
I did the test with my GbE adapter on my home network and got the same results as at the office: poor performance, very slow, almost unusable.
Further testing with “ethtool”, I forced the connection to 100Mbps as follows (still with the GbE adapter):
    [ 1281.435683] cpsw 4a100000.ethernet eth0: Link is Up - 100Mbps/Full - flow control rx/tx
    [ 1281.435788] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready

I did a few tests, and it worked great, just as if I had connected the board to a 100Mbps adapter.
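The exact ethtool invocation is not shown here. One way to keep a link at 100Mbps while leaving autonegotiation on is to restrict the advertised modes with `ethtool -s <iface> advertise <mask>`, using the mask values from the ethtool man page. The Python sketch below only builds that argument list so it can be inspected before being run as root; the helper name and the choice of the `advertise` form are assumptions, not necessarily what was used in the test above.

```python
from typing import Iterable, List, Tuple

# Advertise mask bits as documented in the ethtool man page.
ADVERTISE_BITS = {
    (10, "half"): 0x001,
    (10, "full"): 0x002,
    (100, "half"): 0x004,
    (100, "full"): 0x008,
    (1000, "full"): 0x020,
}


def ethtool_limit_advertise(iface: str, modes: Iterable[Tuple[int, str]]) -> List[str]:
    """Build an `ethtool -s` command that advertises only the given modes."""
    mask = 0
    for mode in modes:
        mask |= ADVERTISE_BITS[mode]  # KeyError on an unknown speed/duplex pair
    return ["ethtool", "-s", iface, "advertise", f"0x{mask:03x}"]


print(" ".join(ethtool_limit_advertise("eth0", [(100, "full")])))
# ethtool -s eth0 advertise 0x008
```

The returned list can be passed to `subprocess.run(..., check=True)` on the target board.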
I really think this has something to do with GbE but not sure how. Maybe, as you suggested, my cable has an important role. I’m testing with a cable almost identical to this one: https://www.ebay.com/itm/AWM-E212689-STYLE-2854-32-AWG-30V-2-FEET-CAT-5E-B0816F281J-CABLE-BRAND-NEW-/202639361149 (except mine is black, but has same markings on the cable: AWM E212689 STYLE 2854 80°C 30V CABLE).
Anyway, cabling at office allows 1000Mbps speeds with my USB GbE adapter, at least with my laptop PC, with no problems. Connecting that same cable to my test board produces poor results with some kernel panics from time to time, as stated before.
Maybe USB GbE is more forgiving of link issues than the AR8035 chipset/driver. I don’t really know. Do you think my board’s Ethernet connector choice (Molex 93626-3508) may have something to do with it? As you said, there are no apparent routing problems or board design issues.
Next test will be to connect the GbE USB-to-Ethernet directly to the USB host on my test board and see if the network connection works at full 1000Mbps speed directly connected to my office network.
Will keep you posted. Thank you!
Andy
One more thing to consider: when in GbE mode, I’m getting a lot of Rx CRC Errors:
    debian@beaglebone:~$ sudo ethtool -S eth0
    NIC statistics:
         Good Rx Frames: 59616
         Broadcast Rx Frames: 490
         Multicast Rx Frames: 792
         Pause Rx Frames: 0
         Rx CRC Errors: 1484
         Rx Align/Code Errors: 0
         Oversize Rx Frames: 0
         Rx Jabbers: 0
         Undersize (Short) Rx Frames: 0
         Rx Fragments: 407
         Rx Octets: 74277375
         Good Tx Frames: 34889
         Broadcast Tx Frames: 30
         Multicast Tx Frames: 182
         Pause Tx Frames: 0
         Deferred Tx Frames: 0
         Collisions: 0
         Single Collision Tx Frames: 0
         Multiple Collision Tx Frames: 0
         Excessive Collisions: 0
         Late Collisions: 0
         Tx Underrun: 0
         Carrier Sense Errors: 0
         Tx Octets: 22824024
         Rx + Tx 64 Octet Frames: 82
         Rx + Tx 65-127 Octet Frames: 31297
         Rx + Tx 128-255 Octet Frames: 1112
         Rx + Tx 256-511 Octet Frames: 643
         Rx + Tx 512-1023 Octet Frames: 506
         Rx + Tx 1024-Up Octet Frames: 62349
         Net Octets: 97780181
         Rx Start of Frame Overruns: 0
         Rx Middle of Frame Overruns: 0
         Rx DMA Overruns: 0
         Rx DMA chan 0: head_enqueue: 4
         Rx DMA chan 0: tail_enqueue: 59488
         Rx DMA chan 0: pad_enqueue: 0
         Rx DMA chan 0: misqueued: 0
         Rx DMA chan 0: desc_alloc_fail: 0
         Rx DMA chan 0: pad_alloc_fail: 0
         Rx DMA chan 0: runt_receive_buf: 0
         Rx DMA chan 0: runt_transmit_bu: 0
         Rx DMA chan 0: empty_dequeue: 0
         Rx DMA chan 0: busy_dequeue: 51693
         Rx DMA chan 0: good_dequeue: 58831
         Rx DMA chan 0: requeue: 3
         Rx DMA chan 0: teardown_dequeue: 533
         Tx DMA chan 0: head_enqueue: 26814
         Tx DMA chan 0: tail_enqueue: 8075
         Tx DMA chan 0: pad_enqueue: 0
         Tx DMA chan 0: misqueued: 8075
         Tx DMA chan 0: desc_alloc_fail: 0
         Tx DMA chan 0: pad_alloc_fail: 0
         Tx DMA chan 0: runt_receive_buf: 0
         Tx DMA chan 0: runt_transmit_bu: 103
         Tx DMA chan 0: empty_dequeue: 26817
         Tx DMA chan 0: busy_dequeue: 29
         Tx DMA chan 0: good_dequeue: 34889
         Tx DMA chan 0: requeue: 0
         Tx DMA chan 0: teardown_dequeue: 0
When in 100Mbps mode, that number does not increase.
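To compare runs at different link speeds, the counters can be turned into an error ratio. The Python sketch below parses `ethtool -S` text and divides `Rx CRC Errors` by the sum of good and CRC-failed frames. Treating that sum as the total is an assumption made for illustration, and the function names are hypothetical.

```python
def parse_ethtool_stats(text: str) -> dict:
    """Parse `ethtool -S` output into {counter name: integer value}."""
    stats = {}
    for line in text.splitlines():
        # Split on the last colon so names like "Rx DMA chan 0: requeue" survive.
        name, sep, value = line.rpartition(":")
        value = value.strip()
        if sep and value.isdigit():
            stats[name.strip()] = int(value)
    return stats


def rx_crc_error_ratio(stats: dict) -> float:
    """Fraction of received frames that failed the CRC check (assumed total)."""
    bad = stats["Rx CRC Errors"]
    total = stats["Good Rx Frames"] + bad
    return bad / total if total else 0.0


SAMPLE = """NIC statistics:
     Good Rx Frames: 59616
     Rx CRC Errors: 1484
     Rx Fragments: 407
"""

print(f"{rx_crc_error_ratio(parse_ethtool_stats(SAMPLE)):.2%}")  # 2.43%
```

Sampling the counters before and after a transfer, at 1Gbps and again at 100Mbps, would show whether the ratio really stays at zero in the slower mode.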
I read in some forums that the traces from my PHY to the microprocessor could be too long for GbE speeds. Because I’m connecting the PHY through a header connector to my base board, and then to the microprocessor, they could indeed be too long.
Do you think that may be the cause?
Andy
Erik:
I did the test of connecting my GbE USB-to-Ethernet adapter directly to my base test board (through the USB host connector). It was recognized as a 1000Mbps (GbE) full-duplex link and loaded as “eth1”. I connected it to my office network switch/router and it worked like a charm (unlike my Ethernet daughter board based on the AR8035 chipset).
Just for completeness, I also did the test of connecting my Ethernet daughter board to my office network but downgrading the link to 100Mbps full-duplex, and it also worked just fine.
So it is confirmed that, for some reason, my Ethernet daughter board does not work properly at GbE speeds. It could be the trace lengths, routing problems, noise issues due to the board-to-board connector, you name it 🙂
But it seems like using the USB adapter at GbE speeds is more forgiving and works just fine. I assume that’s because the way it is constructed, how the USB driver works and other variables.
Regards,
Andy
Andy,
There could be some trace length issues. You might want to look at the RX bus on the oscilloscope. Focus on the RX_CLK and one of the data lines to see if there are issues with the clock and data not lining up. 100Mbps requires a 25MHz clock while 1Gbps requires a 125MHz clock, so the timing requirements are much stricter. Additionally, at 1Gbps RGMII requires data to be clocked on both the rising and falling edges.
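To put rough numbers on that difference, the Python sketch below computes the clock period and the time available per data sample at each speed. It assumes data is sampled once per clock period at 100Mbps and on both edges at 1Gbps. The function names are illustrative, and the figures are ideal ones that ignore setup/hold times and jitter.

```python
def clock_period_ns(freq_hz: int) -> float:
    """Clock period in nanoseconds."""
    return 1_000_000_000 / freq_hz


def data_window_ns(freq_hz: int, both_edges: bool) -> float:
    """Ideal time each data value stays on the bus."""
    period = clock_period_ns(freq_hz)
    return period / 2 if both_edges else period


for label, freq_hz, both_edges in (
    ("100Mbps", 25_000_000, False),
    ("1Gbps", 125_000_000, True),
):
    print(f"{label}: period {clock_period_ns(freq_hz)} ns, "
          f"data window {data_window_ns(freq_hz, both_edges)} ns")
# 100Mbps: period 40.0 ns, data window 40.0 ns
# 1Gbps: period 8.0 ns, data window 4.0 ns
```

Under these assumptions the window shrinks by a factor of ten, so clock-to-data skew from extra trace length or a board-to-board connector that is harmless at 100Mbps can matter at 1Gbps.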
Thanks,
Erik
Will try to perform the test with the oscilloscope as soon as possible. Thank you for all the help so far.
Andy
Octavo Systems LLC all rights reserved
OCTAVO is registered in the U.S. Patent and Trademark Office. OSD, C-SiP, and the Octavo Logo are trademarks of Octavo Systems LLC.