Will try to perform the test with the oscilloscope as soon as possible. Thank you for all the help so far.
Andy
Erik:
I did the test of connecting my GbE USB-to-Ethernet adapter directly to my base test board (through the USB host connector). It was recognized as a 1000Mbps (GbE) full-duplex link and loaded as “eth1”. I connected it to my office network switch/router and it worked like a charm (unlike my Ethernet daughter board based on the AR8035 chipset).
Just for completeness, I also did the test of connecting my Ethernet daughter board to my office network but downgrading the link to 100Mbps full-duplex, and it also worked just fine.
So it is confirmed that, for some reason, my Ethernet daughter board does not work properly at GbE speeds. Could be the trace lengths, routing problems, noise issues due to the board-to-board connector, you name it 🙂
But it seems like using the USB adapter at GbE speeds is more forgiving and works just fine. I assume that’s because of the way it is constructed, how the USB driver works and other variables.
Regards,
Andy
One more thing to consider: when in GbE mode, I’m getting a lot of Rx CRC Errors:
    debian@beaglebone:~$ sudo ethtool -S eth0
    NIC statistics:
         Good Rx Frames: 59616
         Broadcast Rx Frames: 490
         Multicast Rx Frames: 792
         Pause Rx Frames: 0
         Rx CRC Errors: 1484
         Rx Align/Code Errors: 0
         Oversize Rx Frames: 0
         Rx Jabbers: 0
         Undersize (Short) Rx Frames: 0
         Rx Fragments: 407
         Rx Octets: 74277375
         Good Tx Frames: 34889
         Broadcast Tx Frames: 30
         Multicast Tx Frames: 182
         Pause Tx Frames: 0
         Deferred Tx Frames: 0
         Collisions: 0
         Single Collision Tx Frames: 0
         Multiple Collision Tx Frames: 0
         Excessive Collisions: 0
         Late Collisions: 0
         Tx Underrun: 0
         Carrier Sense Errors: 0
         Tx Octets: 22824024
         Rx + Tx 64 Octet Frames: 82
         Rx + Tx 65-127 Octet Frames: 31297
         Rx + Tx 128-255 Octet Frames: 1112
         Rx + Tx 256-511 Octet Frames: 643
         Rx + Tx 512-1023 Octet Frames: 506
         Rx + Tx 1024-Up Octet Frames: 62349
         Net Octets: 97780181
         Rx Start of Frame Overruns: 0
         Rx Middle of Frame Overruns: 0
         Rx DMA Overruns: 0
         Rx DMA chan 0: head_enqueue: 4
         Rx DMA chan 0: tail_enqueue: 59488
         Rx DMA chan 0: pad_enqueue: 0
         Rx DMA chan 0: misqueued: 0
         Rx DMA chan 0: desc_alloc_fail: 0
         Rx DMA chan 0: pad_alloc_fail: 0
         Rx DMA chan 0: runt_receive_buf: 0
         Rx DMA chan 0: runt_transmit_bu: 0
         Rx DMA chan 0: empty_dequeue: 0
         Rx DMA chan 0: busy_dequeue: 51693
         Rx DMA chan 0: good_dequeue: 58831
         Rx DMA chan 0: requeue: 3
         Rx DMA chan 0: teardown_dequeue: 533
         Tx DMA chan 0: head_enqueue: 26814
         Tx DMA chan 0: tail_enqueue: 8075
         Tx DMA chan 0: pad_enqueue: 0
         Tx DMA chan 0: misqueued: 8075
         Tx DMA chan 0: desc_alloc_fail: 0
         Tx DMA chan 0: pad_alloc_fail: 0
         Tx DMA chan 0: runt_receive_buf: 0
         Tx DMA chan 0: runt_transmit_bu: 103
         Tx DMA chan 0: empty_dequeue: 26817
         Tx DMA chan 0: busy_dequeue: 29
         Tx DMA chan 0: good_dequeue: 34889
         Tx DMA chan 0: requeue: 0
         Tx DMA chan 0: teardown_dequeue: 0
When in 100Mbps mode, that number does not increase.
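To compare the two link speeds with a number rather than by eye, the counters can be turned into an error ratio. This is a minimal sketch, assuming the `ethtool -S` output has been saved as text; the function names are made up for illustration, and treating "Good Rx Frames" plus "Rx CRC Errors" as the total received is a simplifying assumption.

```python
def parse_ethtool_stats(text):
    """Parse `ethtool -S` output into a {counter name: int} dict."""
    stats = {}
    for line in text.splitlines():
        # Counter names may contain colons ("Rx DMA chan 0: misqueued"),
        # so split on the LAST colon only.
        name, sep, value = line.rpartition(":")
        value = value.strip()
        if sep and value.isdigit():
            stats[name.strip()] = int(value)
    return stats


def crc_error_ratio(stats):
    """Fraction of received frames that failed the CRC check (assumed model)."""
    good = stats.get("Good Rx Frames", 0)
    bad = stats.get("Rx CRC Errors", 0)
    total = good + bad
    return bad / total if total else 0.0


# A few lines copied from the GbE statistics in this post.
sample = """NIC statistics:
     Good Rx Frames: 59616
     Rx CRC Errors: 1484
     Rx Fragments: 407
     Rx DMA chan 0: misqueued: 0
"""

stats = parse_ethtool_stats(sample)
print(f"CRC error ratio: {crc_error_ratio(stats):.2%}")
```

Running it once per link speed and comparing the ratios (or the delta between two snapshots) should make the GbE-only nature of the errors easier to report.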
I read on some forums that the traces from my PHY to the microprocessor could be too long for GbE speeds. Because I’m connecting the PHY through a header connector to my base board, and then to the microprocessor, they could indeed be too long.
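A rough way to put numbers on the trace-length worry is sketched here, assuming the AR8035 is wired over RGMII (125 MHz clock with data on both edges at 1000 Mbps, 25 MHz at 100 Mbps). The propagation delay of 6.7 ps/mm is only a typical FR4 ballpark, the 30 mm mismatch is an invented example, and treating the 100 Mbps data as valid for a full clock period is an assumption too, so the result is an order-of-magnitude estimate and not a timing analysis.

```python
# RGMII reference clock per link speed (MHz).
RGMII_CLOCK_MHZ = {1000: 125.0, 100: 25.0, 10: 2.5}


def rgmii_clock_period_ns(link_mbps):
    """Clock period in nanoseconds for the given link speed."""
    return 1000.0 / RGMII_CLOCK_MHZ[link_mbps]


def data_window_ns(link_mbps):
    """Nominal time each data nibble is held on the bus.

    At 1000 Mbps data changes on both clock edges, so the window is half a
    period; at 10/100 Mbps it is assumed to last a full period.
    """
    period = rgmii_clock_period_ns(link_mbps)
    return period / 2 if link_mbps == 1000 else period


def trace_delay_ns(length_mm, ps_per_mm=6.7):
    """Propagation delay of a trace; ps_per_mm is an assumed FR4 ballpark."""
    return length_mm * ps_per_mm / 1000.0


# Hypothetical example: 30 mm of length mismatch between clock and data.
mismatch_mm = 30
for speed in (1000, 100):
    skew = trace_delay_ns(mismatch_mm)
    window = data_window_ns(speed)
    print(f"{speed} Mbps: skew {skew:.2f} ns is {skew / window:.1%} of the data window")
```

The same mismatch eats ten times more of the timing budget at GbE than at 100 Mbps, which would be consistent with the board behaving at 100 Mbps only.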
Do you think that may be the cause?
Andy
Hi Erik:
I did the test with my GbE adapter on my home network and same results as the office: poor performance, very slow, almost unusable.
Further testing with “ethtool”, I forced the connection to 100Mbps as follows (still with the GbE adapter):
    [ 1281.435683] cpsw 4a100000.ethernet eth0: Link is Up - 100Mbps/Full - flow control rx/tx
    [ 1281.435788] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
Did a few tests, and it worked great, just as if I had connected the board to a 100Mbps adapter.
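The exact ethtool invocation is not recorded in this post, so the following is only a sketch of two plausible ways to hold the link at 100 Mbps: forcing speed and duplex with autonegotiation off, or keeping autonegotiation on and advertising only 100baseT/Full (mask 0x008). The helper names are hypothetical; the sketch only builds the command lines and does not run them.

```python
def force_link_cmd(iface="eth0", speed=100, duplex="full"):
    """Build an ethtool command that forces speed/duplex with autoneg off."""
    return ["ethtool", "-s", iface,
            "speed", str(speed), "duplex", duplex, "autoneg", "off"]


def advertise_only_cmd(iface="eth0", mask=0x008):
    """Build an ethtool command that limits the advertised link modes.

    0x008 is the ethtool mask for 100baseT/Full; autonegotiation stays on,
    so flow control can still be negotiated with the link partner.
    """
    return ["ethtool", "-s", iface, "advertise", f"0x{mask:03x}"]


# Print the commands instead of executing them; run them with sudo on the board.
print("sudo " + " ".join(force_link_cmd()))
print("sudo " + " ".join(advertise_only_cmd()))
```

The second variant is usually the gentler one for testing, since the link partner still sees a normal autonegotiation.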
I really think this has something to do with GbE but not sure how. Maybe, as you suggested, my cable has an important role. I’m testing with a cable almost identical to this one: https://www.ebay.com/itm/AWM-E212689-STYLE-2854-32-AWG-30V-2-FEET-CAT-5E-B0816F281J-CABLE-BRAND-NEW-/202639361149 (except mine is black, but has same markings on the cable: AWM E212689 STYLE 2854 80°C 30V CABLE).
Anyway, cabling at office allows 1000Mbps speeds with my USB GbE adapter, at least with my laptop PC, with no problems. Connecting that same cable to my test board produces poor results with some kernel panics from time to time, as stated before.
Maybe USB GbE is more forgiving of link issues than the AR8035 chipset/driver. I don’t really know. Do you think my board’s Ethernet connector choice may have something to do with it (Molex 93626-3508)? As you said… there are no apparent routing problems or board design issues.
Next test will be to connect the GbE USB-to-Ethernet directly to the USB host on my test board and see if the network connection works at full 1000Mbps speed directly connected to my office network.
Will keep you posted. Thank you!
Andy
Hi Erik:
For my tests, I’m using a USB-to-Ethernet adapter that came with my laptop PC. In fact, I have two of these adapters: one that came with my personal laptop, and one that came with my office laptop (both ASUS brand).
And so far, I’ve performed the following tests:
1. Used my personal laptop’s USB-to-Ethernet adapter to share the laptop’s Wi-Fi connection (to my home network): it works very well with no problems at all (at least no noticeable issues). I can download things at max speed, upgrade things with apt, perform network speed tests, communicate with external servers, ping, etc.
2. Used my office laptop’s USB-to-Ethernet adapter to share laptop’s Wi-Fi connection (to my office network): it has very poor performance, quite unusable, only a few Kb/s with many errors and timeouts.
3. Board directly connected to a GbE port on my office network switch: same as 2 but with some Kernel panics from time to time.
I noticed that my personal laptop’s USB-to-Ethernet adapter is based on AX88772B chipset (10/100 Mbps), and that my office laptop one is based on a Realtek GbE chipset.
So the next obvious thing to do was to test my AX88772B USB-to-Ethernet adapter on my office network to share the Wi-Fi through my office laptop, and guess what: it worked just fine, just like in case 1.
In both cases where a GbE port is used (2 & 3) the network performance is very poor and unusable (with kernel panics in the case of 3). So it seems to have something to do with the Ethernet speed (or adapter chipset) rather than the network itself (I’m guessing here).
Also, because my (small) office router is very basic and doesn’t have traffic capture capabilities, I’ve been unable to find the packet that is causing the kernel panic seen in case 3.
To that end, I enabled tcpdump on the target board to see if I could find some more information of the offending packet, but the trace looks pretty normal to me. Almost all packets are 1400 bytes in length (max) which relates to my MTU I guess.
I’m attaching the file in case you want to take a look. Board’s IP address was 192.168.0.175.
After the last packet (1373) in **capture.cap** file I got a kernel panic, but unfortunately, that offending packet did not get captured, at least not entirely as Wireshark complains about an incomplete packet on the capture file.
Used a tool to repair the capture file and a new last packet appeared (1374, incomplete) but its headers look normal to me too. It doesn’t seem like an overly large packet.
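For anyone who wants to check a cut-off capture without a repair tool, here is a minimal sketch that walks a classic pcap file (24-byte global header, then a 16-byte header per packet) and reports whether the last record is truncated. It assumes the classic pcap format that tcpdump writes by default, not pcapng, and the demo bytes are synthetic, not taken from my capture.

```python
import struct

PCAP_GLOBAL_HDR = 24   # magic, version, timezone, sigfigs, snaplen, linktype
PCAP_RECORD_HDR = 16   # ts_sec, ts_frac, incl_len, orig_len


def scan_pcap(data):
    """Return (lengths of complete packets, truncated?) for classic pcap bytes."""
    if len(data) < PCAP_GLOBAL_HDR:
        raise ValueError("file shorter than the pcap global header")
    magic = data[:4]
    if magic in (b"\xd4\xc3\xb2\xa1", b"\x4d\x3c\xb2\xa1"):
        endian = "<"   # little-endian, microsecond or nanosecond timestamps
    elif magic in (b"\xa1\xb2\xc3\xd4", b"\xa1\xb2\x3c\x4d"):
        endian = ">"   # big-endian
    else:
        raise ValueError("not a classic pcap file")

    lengths = []
    pos = PCAP_GLOBAL_HDR
    while pos < len(data):
        if pos + PCAP_RECORD_HDR > len(data):
            return lengths, True            # record header itself is cut off
        _sec, _frac, incl_len, _orig_len = struct.unpack_from(
            endian + "IIII", data, pos)
        pos += PCAP_RECORD_HDR
        if pos + incl_len > len(data):
            return lengths, True            # packet payload is cut off
        lengths.append(incl_len)
        pos += incl_len
    return lengths, False


# Synthetic demo: two complete 4-byte packets, then one cut short.
header = struct.pack("<IHHiIII", 0xA1B2C3D4, 2, 4, 0, 0, 65535, 1)
packet = struct.pack("<IIII", 0, 0, 4, 4) + b"abcd"
cut_off = struct.pack("<IIII", 0, 0, 10, 10) + b"xyz"
lengths, truncated = scan_pcap(header + packet + packet + cut_off)
print(len(lengths), "complete packets, truncated:", truncated)
```

On a real file it would be called as `scan_pcap(open("capture.cap", "rb").read())`; the largest value in the returned list also shows quickly whether any oversized frame made it into the capture.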
I’m very puzzled to be honest. Do you think GbE has something to do with this? I don’t really think the USB-to-Ethernet adapters’ chipsets have anything to do with it, as there may be many chipsets in routers/switches out there. It’s hard for me to believe that the board will fail depending on the router/switch chipset.
I have a RED board lying around and haven’t done the test with it yet. Will try to do that ASAP. Maybe there are HW problems with my custom-made Ethernet board and/or its routing. I’m attaching the Eagle BRD file just in case you want to take a look.
Thanks for any help.
Andy
Edit: I was unable to upload the files. Please download them from here https://www.dropbox.com/s/rorc9lsafqae4mv/eth.zip?dl=1
Thank you Erik for the valuable information & suggestions. Indeed, it appears to be a software issue. I’ve been performing some tests on other networks and it works very well with no performance issues and/or kernel panics. I even performed some speed tests and I got the maximum speed for my current ISP service.
Will perform more tests with Wireshark capturing packet information on the network with issues and try to narrow down which packet is causing the problem (or type of packets). The network that causes issues is my company network. I would guess that there are some kind of jumbo packets and/or VLANs or some other “company specific” packet that may cause the problem. My home network works like a charm.
I’m using latest IoT image available from Beagleboard.org:
Linux beaglebone 4.14.71-ti-r80 #1 SMP PREEMPT Fri Oct 5 23:50:11 UTC 2018 armv7l GNU/Linux
Will keep you all posted on my findings.
Regards,
Andy
Thank you Greg! Hope to see that reference board soon 🙂
Hey Greg, thank you for the reply.
10/100 Mbps Ethernet connectivity should be a must, with a PHY IC easily accessible through common distributors like Mouser and Digikey… a Microchip PHY maybe… they’re quite common and well supported on Linux.
Also I’m currently developing some audio applications so it would be great to see some (multichannel?) audio subsystem on the reference board if possible. I might be asking too much, but you asked 🙂
On the other hand, do you have a suggested price point for the SiP @ 100 units? I assume it should be cheaper than current OSD335X offerings.
Thank you, and I can’t wait to start prototyping with this new promising SiP.
Andy
Hey Erik! That makes perfect sense 🙂
Just for completeness… where is that documented?
Thank you so much,
Andy
Thank you Greg!
I’m trying to figure out what the BBB device tree “defaults” are regarding MAC0 and see if they look OK for my LAN8710 chipset in MII mode. I can see that the pin mux defaults to MII mode (in file am335x-bone-common.dtsi):
    cpsw_default: cpsw_default {
        pinctrl-single,pins = <
            /* Slave 1 */
            0x108 (PIN_INPUT | MUX_MODE0)    /* mii1_col.mii1_col */
            0x10c (PIN_INPUT | MUX_MODE0)    /* mii1_crs.mii1_crs */
            AM33XX_IOPAD(0x910, PIN_INPUT_PULLUP | MUX_MODE0)    /* mii1_rxerr.mii1_rxerr */
            AM33XX_IOPAD(0x914, PIN_OUTPUT_PULLDOWN | MUX_MODE0)    /* mii1_txen.mii1_txen */
            AM33XX_IOPAD(0x918, PIN_INPUT_PULLUP | MUX_MODE0)    /* mii1_rxdv.mii1_rxdv */
            AM33XX_IOPAD(0x91c, PIN_OUTPUT_PULLDOWN | MUX_MODE0)    /* mii1_txd3.mii1_txd3 */
            AM33XX_IOPAD(0x920, PIN_OUTPUT_PULLDOWN | MUX_MODE0)    /* mii1_txd2.mii1_txd2 */
            AM33XX_IOPAD(0x924, PIN_OUTPUT_PULLDOWN | MUX_MODE0)    /* mii1_txd1.mii1_txd1 */
            AM33XX_IOPAD(0x928, PIN_OUTPUT_PULLDOWN | MUX_MODE0)    /* mii1_txd0.mii1_txd0 */
            AM33XX_IOPAD(0x92c, PIN_INPUT_PULLUP | MUX_MODE0)    /* mii1_txclk.mii1_txclk */
            AM33XX_IOPAD(0x930, PIN_INPUT_PULLUP | MUX_MODE0)    /* mii1_rxclk.mii1_rxclk */
            AM33XX_IOPAD(0x934, PIN_INPUT_PULLUP | MUX_MODE0)    /* mii1_rxd3.mii1_rxd3 */
            AM33XX_IOPAD(0x938, PIN_INPUT_PULLUP | MUX_MODE0)    /* mii1_rxd2.mii1_rxd2 */
            AM33XX_IOPAD(0x93c, PIN_INPUT_PULLUP | MUX_MODE0)    /* mii1_rxd1.mii1_rxd1 */
            AM33XX_IOPAD(0x940, PIN_INPUT_PULLUP | MUX_MODE0)    /* mii1_rxd0.mii1_rxd0 */
        >;
    };

    cpsw_sleep: cpsw_sleep {
        pinctrl-single,pins = <
            /* Slave 1 reset value */
            0x108 (PIN_INPUT_PULLDOWN | MUX_MODE7)
            0x10c (PIN_INPUT_PULLDOWN | MUX_MODE7)
            AM33XX_IOPAD(0x910, PIN_INPUT_PULLDOWN | MUX_MODE7)
            AM33XX_IOPAD(0x914, PIN_INPUT_PULLDOWN | MUX_MODE7)
            AM33XX_IOPAD(0x918, PIN_INPUT_PULLDOWN | MUX_MODE7)
            AM33XX_IOPAD(0x91c, PIN_INPUT_PULLDOWN | MUX_MODE7)
            AM33XX_IOPAD(0x920, PIN_INPUT_PULLDOWN | MUX_MODE7)
            AM33XX_IOPAD(0x924, PIN_INPUT_PULLDOWN | MUX_MODE7)
            AM33XX_IOPAD(0x928, PIN_INPUT_PULLDOWN | MUX_MODE7)
            AM33XX_IOPAD(0x92c, PIN_INPUT_PULLDOWN | MUX_MODE7)
            AM33XX_IOPAD(0x930, PIN_INPUT_PULLDOWN | MUX_MODE7)
            AM33XX_IOPAD(0x934, PIN_INPUT_PULLDOWN | MUX_MODE7)
            AM33XX_IOPAD(0x938, PIN_INPUT_PULLDOWN | MUX_MODE7)
            AM33XX_IOPAD(0x93c, PIN_INPUT_PULLDOWN | MUX_MODE7)
            AM33XX_IOPAD(0x940, PIN_INPUT_PULLDOWN | MUX_MODE7)
        >;
    };

    davinci_mdio_default: davinci_mdio_default {
        pinctrl-single,pins = <
            /* MDIO */
            AM33XX_IOPAD(0x948, PIN_INPUT_PULLUP | SLEWCTRL_FAST | MUX_MODE0)    /* mdio_data.mdio_data */
            AM33XX_IOPAD(0x94c, PIN_OUTPUT_PULLUP | MUX_MODE0)    /* mdio_clk.mdio_clk */
        >;
    };

    davinci_mdio_sleep: davinci_mdio_sleep {
        pinctrl-single,pins = <
            /* MDIO reset value */
            AM33XX_IOPAD(0x948, PIN_INPUT_PULLDOWN | MUX_MODE7)
            AM33XX_IOPAD(0x94c, PIN_INPUT_PULLDOWN | MUX_MODE7)
        >;
    };
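The listing mixes raw offsets (0x108, 0x10c) with AM33XX_IOPAD(...) entries, which looks inconsistent at first. As far as I can tell, in the 4.14 headers AM33XX_IOPAD(pa, val) just subtracts 0x800 from the pad address, so both styles describe the same kind of offset. The sketch below reproduces that arithmetic under that assumption; the 0x44E10000 control module base is taken from the AM335x TRM memory map, and the function names are mine.

```python
CONTROL_MODULE_BASE = 0x44E10000  # AM335x control module (assumed from the TRM)


def am33xx_iopad_offset(pa):
    """Offset that pinctrl-single sees for AM33XX_IOPAD(pa, ...).

    Mirrors the assumed macro body: ((pa) & 0xffff) - 0x0800.
    """
    return (pa & 0xFFFF) - 0x0800


def pad_register_address(pa):
    """Physical address of the conf_* pad register for a given pad address."""
    return CONTROL_MODULE_BASE + (pa & 0xFFFF)


# 0x908/0x90c are the AM33XX_IOPAD-style equivalents of the raw 0x108/0x10c.
for pa in (0x908, 0x90C, 0x910, 0x948):
    print(f"AM33XX_IOPAD(0x{pa:x}) -> offset 0x{am33xx_iopad_offset(pa):x}, "
          f"register 0x{pad_register_address(pa):08x}")
```

So the two raw entries for mii1_col and mii1_crs sit right before mii1_rxerr (0x110), and the register addresses can be read back with devmem2 to confirm what the kernel actually programmed.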
Indeed, that pinmux looks just fine for MAC0 in MII mode. Also the MAC section is configured as:
    &cpsw_emac0 {
        phy_id = <&davinci_mdio>, <0>;
        phy-mode = "mii";
    };

    &mac {
        pinctrl-names = "default", "sleep";
        pinctrl-0 = <&cpsw_default>;
        pinctrl-1 = <&cpsw_sleep>;
        slaves = <1>;
        status = "okay";
    };

    &davinci_mdio {
        pinctrl-names = "default", "sleep";
        pinctrl-0 = <&davinci_mdio_default>;
        pinctrl-1 = <&davinci_mdio_sleep>;
        status = "okay";
    };
Which also looks just fine, except that I noticed that the RED board device tree has the following “phy_id”:
    phy_id = <&davinci_mdio>, <4>;
I can’t find any documentation regarding what that “<0>” or “<4>” actually means. I looked for any reference to that number in the kernel’s device tree bindings documentation but found none. Why are they different, and what does that number refer to?
Thank you!
Hi Eshtaartha:
Going back to the LAN8710… I designed a test board using that Ethernet chipset, based mostly on the Ethernet part of the “SBC Reference Design” (https://octavosystems.com/docs/osd3358-bas-sbc-reference-design-schematic).
My board is a custom design so I have my own device tree, and I’m wondering where I can find the “SBC Reference Design” device tree source (like the GitHub repo with the RED board’s device tree) to take a look at what I should put in mine. It looks difficult to figure that out by myself, especially if it may already be up there.
Thank you!
Hi Neeraj:
I reworked my RTC circuit replacing the crystal and supporting capacitors and now it oscillates with a good 32 kHz sine wave (as can be seen on my oscilloscope). I really don’t know what could have caused the old crystal to stop working at some point.
With the RTC up and running with the external oscillator, I did a quick test again with “sudo hwclock -f /dev/rtc0 --debug” and it got a tick as expected, so running the “sudo poweroff” command turned my board off just fine with no kernel panics (finally!).
So, I think we can close this topic now, and I hope this whole process will help someone else facing the same problem.
Thanks again for all the help and support.
Best regards,
Andy
I’m not sure what I did differently this time, but I modified RTC_OSC_REG with value 0x40 again to select the internal oscillator.
I did a quick test with “sudo hwclock -f /dev/rtc0 --debug” and I got a clock tick (this command failed with the external oscillator), so the RTC is working with the internal oscillator with no problems.
After that check I did a “sudo poweroff” and now the board was powered off just fine with no kernel panics. I can see my board’s power indication LED being turned off.
So it is confirmed, it is a hardware issue with my RTC oscillator circuit. Will try to fix that and see what happens.
Neeraj:
I tested the “devmem2” tool as suggested and it indeed says that the value of register RTC_OSC_REG (address 0x44E3E054) is 0x48, which means that the RTC source should be external and it should be enabled. The DC voltage level on the OSC1_IN pin is about 0.9V at power up.
If I write this register with value 0x10 (disable the oscillator and apply high impedance to the output) the voltage level on OSC1_IN jumps up to a DC level of about 1.8V.
FYI, I booted with the EEPROM programmed as a RED board with no special DTB file. Just the latest IoT BB image as-is, in order to rule out DTB issues. So, to summarize, after power up, the registers of interest have the following values:
RTC_OSC_REG = 0x48
RTC_CTRL_REG = 0x01
RTC_STATUS_REG = 0x02
RTC_SYSCONFIG = 0x03
RTC_PMIC = 0x10011
I also tried changing RTC_OSC_REG value to 0x40 (internal oscillator selected and enabled) and a read back confirmed the value change (as I wrote both KICK registers with unlock values), but still got the same kernel panic message. Maybe the kernel needs more than just an on-the-fly register change.
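To keep the three RTC_OSC_REG values used in this thread straight (0x48, 0x40 and 0x10), here is a minimal decoding sketch. The bit positions are inferred from those values and from my reading of the AM335x TRM (bit 6 enables the 32 kHz clock, bit 4 puts the oscillator in high impedance, bit 3 selects the external source), so please treat the names and positions as assumptions to double-check against the TRM.

```python
# Assumed RTC_OSC_REG bit layout (inferred, see the note above).
EN_32KCLK = 1 << 6        # 32 kHz clock enabled
OSC32K_GZ = 1 << 4        # oscillator disabled, output in high impedance
SEL_32KCLK_SRC = 1 << 3   # 1 = external oscillator, 0 = internal


def decode_rtc_osc_reg(value):
    """Decode the fields of RTC_OSC_REG that matter for this problem."""
    return {
        "clock_enabled": bool(value & EN_32KCLK),
        "oscillator_high_z": bool(value & OSC32K_GZ),
        "source": "external" if value & SEL_32KCLK_SRC else "internal",
    }


# Values seen in this thread: after boot, manual override, TRM reset default.
for value in (0x48, 0x40, 0x10):
    print(f"0x{value:02x} -> {decode_rtc_osc_reg(value)}")
```

The value read with devmem2 at 0x44E3E054 can be pasted straight into `decode_rtc_osc_reg()` to see which source the RTC is currently configured for.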
I noticed that, according to the AM335x TRM, the default reset value of RTC_OSC_REG is 0x10 (RTC disabled and high impedance), so there must be some place where this register is being changed to 0x48. Just for completeness, can you point me to the exact place where this happens? I would guess it is the rtc32k_enable() function in u-boot’s “board/ti/am335x/board.c”, or is it the “omap_rtc” Linux driver?
Bottom line, it seems like I do have a hardware issue at hand. Will try removing RTC circuit components and put new ones to see if that solves the problem. I’ll keep you posted with my findings.
Thank you and happy new year BTW!
Neeraj:
I really appreciate all the help.
I did more testing and to my surprise, the oscillator is not oscillating at all. I’m 100% sure I measured it in the past with a good 32kHz oscillation but now it does not oscillate. I can only see a DC voltage of about 1.5V on OSC1_IN pin.
The BB image I’m now testing (latest IoT image) is different from the one I tested when I saw the oscillator working. The problem is that I don’t remember which one I had on my SD card at that time. When I tested back on those days, the EEPROM was blank and my u-boot had some patches to ignore EEPROM checks. Now, the EEPROM has been programmed with the same values to be recognized as a RED board (as specified here: https://octavosystems.com/app_notes/osd335x-eeprom-during-boot/#_Toc382081431) and I’m testing the BB image “as is” (except that I’m putting my board’s DTB file on uEnv.txt).
So, I have a couple of questions:
1) Is the 32 kHz oscillator supposed to start oscillating at board power-on?
2) If not, who’s responsible to initialize and turn it on? u-boot? the Linux kernel somewhere?
I inspected one of RCN’s patches and I noticed he modified some RTC stuff. In particular, when my u-boot runs, I can see the “RTC 32KCLK Source: External” message (as in https://rcn-ee.com/repos/git/u-boot-patches/v2018.03/0001-am335x_evm-uEnv.txt-bootz-n-fixes.patch @ line 267). But to be honest, that doesn’t say if the clock is enabled (turned on) or not.
Also, I put debug strings on the function “rtc32k_enable()” (line 219 of the same patch) to see if the RTC was being initialized or not, but I never saw those lines during u-boot start, so I assume the function didn’t get called.
It seems that the function is only called if macro CONFIG_SPL_AM33XX_ENABLE_RTC32K_OSC is defined. I compiled u-boot with “make ARCH=arm CROSS_COMPILE=${CC} am335x_evm_defconfig” which does not define that macro. Although, the generated .config file contains a line “CONFIG_SPL_AM33XX_ENABLE_RTC32K_OSC=y”, so I’m not sure if my u-boot is being compiled with that macro turned on or not.
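As far as I understand Kconfig, a defconfig file only needs to list options that differ from their defaults, so the generated .config is the file that tells what the build actually uses. A minimal sketch for checking a single option in that file follows; `kconfig_value` is a hypothetical helper and the sample text is illustrative, not my real .config.

```python
def kconfig_value(config_text, option):
    """Return the value of a Kconfig option from .config text.

    Returns "y"/"m"/a literal value when set, "n" when explicitly
    "is not set", and None when the option does not appear at all.
    """
    for line in config_text.splitlines():
        line = line.strip()
        if line == f"# {option} is not set":
            return "n"
        if line.startswith(option + "="):
            return line.split("=", 1)[1]
    return None


# Illustrative .config fragment (the second option name is made up).
sample_config = """\
CONFIG_SPL_AM33XX_ENABLE_RTC32K_OSC=y
# CONFIG_SOME_OTHER_OPTION is not set
"""

print(kconfig_value(sample_config, "CONFIG_SPL_AM33XX_ENABLE_RTC32K_OSC"))
print(kconfig_value(sample_config, "CONFIG_SOME_OTHER_OPTION"))
```

If the option shows up as `y` in .config but the debug strings still never print, the next thing to check would be whether the code path containing rtc32k_enable() is built for the stage (SPL or full u-boot) where I was looking at the console output.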
Anyway, I don’t understand why the RED board doesn’t need any of that and the RTC works just fine by default (that’s why I asked for the DTB file to check that there is no difference with mine on that regard). What really puzzles me is that the oscillator once worked but now it doesn’t at all.
So if you could answer the two questions above, it would shed more light on where to look next. I will check the documentation you provided too, and also see if I can compile the rtc_omap driver and debug that part.
Once again, thanks for all the help.