Working custom design works great for 6 months – now dead

  • Author
    Posts
    • #10567
      Matthew Sommerfield (msommerfield)
      Participant

      We have a custom design that uses the OSD3358-512M-BSM in a configuration that is somewhere between a PocketBeagle and a BBBW. It uses a 4 GB Kingston eMMC (EMMC04G-M627-X02U) as the storage device. We have a handful of these boards produced and running, and have primarily been using an SD card for application development, since the source and build files for the application exceed 4 GB. The final application with Linux will be about 2 GB.

      We have been running the device 24/7 for many months. One morning I noticed the system was shut down. The board had power, but was completely non-functional. I power cycled the power supply and it started to boot. It crashed during Linux load, and I later discovered that the SD card had been filled by some logging, which likely triggered the shutdown. Thinking maybe the SD card had failed, I removed it so the board would boot from the eMMC. It has booted successfully many times from the eMMC, so I expected it to do so again. It started to, but then failed in U-Boot. I don’t have that exact log, but I do remember the last entry being a “Card did not respond to voltage select!” message.

      The board is now dead. When I apply power or press the reset button, the power rail LED flashes for a very brief instant and nothing else happens.

      I have provided the U-Boot log below from a board that is still functional and runs off the eMMC, and I now notice the voltage select error there as well. Please advise what might be wrong with the design/application and how I may be able to recover the dead board.

      Matt

      Attachments:
    • #10596
      Neeraj Dantu
      Moderator

      Matt,

      It looks like there are 2 separate issues here:

      1. Boot log messages that do not indicate a fault: According to Beagle, these are normal informational messages. Please see https://forum.digikey.com/t/question-for-card-did-not-respond-to-voltage-select/2763

      2. Failed board:

      This may be caused by a number of things. Note that eMMC and SD cards are only good for a limited number of write cycles. After that, they may fail randomly. We recommend you maintain a RAM-based filesystem for logging and push the logs to the SD card/eMMC only once in a while to reduce wear. For a crash like the one you described, reflashing the eMMC with a new image could be necessary, as data could be corrupted.
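One way to approximate the "log in RAM, write to flash occasionally" advice from inside an application is sketched below. It is only an illustration under stated assumptions: it uses an in-process memory buffer from Python's standard `logging.handlers` module rather than an actual RAM-based filesystem, and the function name, logger name, and size limits are hypothetical choices, not anything taken from the board's software. The size-capped rotating file also keeps logging from filling the card.

```python
import logging
import logging.handlers
import os


def make_buffered_logger(log_path, capacity=200, max_bytes=64 * 1024, backups=2):
    """Return (logger, memory_handler); records stay in RAM until flushed."""
    directory = os.path.dirname(log_path)
    if directory:
        os.makedirs(directory, exist_ok=True)
    # Size-capped file on flash: old data rotates out instead of filling the card.
    file_handler = logging.handlers.RotatingFileHandler(
        log_path, maxBytes=max_bytes, backupCount=backups
    )
    file_handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(message)s")
    )
    # Records are held in memory and written to the file only when the buffer
    # holds `capacity` records or a record at ERROR level or above arrives.
    memory_handler = logging.handlers.MemoryHandler(
        capacity=capacity, flushLevel=logging.ERROR, target=file_handler
    )
    logger = logging.getLogger("app.buffered")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.propagate = False
    logger.handlers = [memory_handler]
    return logger, memory_handler
```

Calling `memory_handler.flush()` from a periodic timer would bound how much log data can be lost on a sudden power-off; the trade-off is that anything still in the buffer at that moment never reaches the card.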

      From the power up behavior you described, a short is a likely cause. A good way to get to the bottom of this is to test the power up behavior of the failed board to determine which rail/pin is faulty. Please take a look at the power application note for power up sequence to verify: https://octavosystems.com/app_notes/osd335x-sm-power-application-note/.

      Best,

      Neeraj

    • #10597
      Matthew Sommerfield (msommerfield)
      Participant

      Neeraj,

      Thanks for the prompt response!

      The voltage select issue makes sense. It looks like a red herring with respect to the second issue.

      The dead board was 100% working for many months. No changes were made, nor was it exposed to any potential shorts, errant connections, etc. between working and not working. It won’t boot from SD or eMMC. I checked all the voltage lines and they are not shorted. The only output I get on any of the lines is the LED on the 3.3 V line lighting for a barely perceptible amount of time. We have reviewed the power-up procedures/guides, and here are a few factors that may be in play:

      – We had trouble with random resets with just power on the 5 VIN line and found a forum post that suggested we connect USB VIN and VIN to resolve it. It did, and those lines are connected in our setup.
      – At the moment this board failed, it was powered from a laptop on the USB line AND a VIN source. We had many months like this with no issue, but maybe it caused one now. All the other boards (with much less use) have already been disconnected from the PC USB power to prevent being dual sourced.
      – As our reference design started from a PocketBeagle, we do not have the clamping circuit.

      Could an eMMC fail in a way that prevents the Octavo chip from coming up at all? I get nothing on the debug port in terms of data.

      Hopefully this information can help us find the root cause.

      Thanks!

      Matt

    • #10605
      Neeraj Dantu
      Moderator

      Matt,

      Powering from both VIN and USB should be okay, as the PMIC decides which power input to use. Take a look at section 8.3.9 in the PMIC datasheet (https://www.ti.com/lit/ds/symlink/tps65217.pdf).

      Plotting the power-up sequence of the failed part will allow you to determine which power rail is failing. My suspicion is that a short on an IO pin can cause a failure on VDDSHV_3V3, and that can lead to the PMIC shutting down after it tries to bring the rail up. So, you don’t necessarily have to have a short on the power rail for the power-up sequence to fail. This would align with the theory that an eMMC failure leading to an IO short is the cause of your boot failure.

      Please also check the PMIC – SoC interface signals for shorts. They are listed here: https://octavosystems.com/app_notes/osd335x-design-tutorial/bare-minimum-boot/power-management/

      Best,

      Neeraj

    • #11333
      Matthew Sommerfield (msommerfield)
      Participant

      So I have some updates to this issue.

      We had the eMMC pulled from the board that died. I powered it up and saw the same issue: the voltage LED flashes for a barely perceptible amount of time, then goes off. Then I held the reset button for a few seconds, and when I released it, the board came to life! I put the SD card in and it booted fine. This worked for a few cycles, then the board died again.

      We had a second board experience a similar issue. It was running great for many months, then no more. I set it aside, moved on to other working boards, then miraculously, it came alive again with no changes. It ran without issue for a few months, then died again. This time, I had the debug terminal logging and captured the following as it died:

      [108176.172460] musb-hdrc musb-hdrc.1: VBUS_ERROR in a_wait_bcon (88, <AValid), retry #3, port1 0008010c

      I did a power cycle, and the following happened:

      U-Boot SPL 2019.04-dirty (Mar 10 2020 - 21:03:25 -0400)
      Trying to boot from MMC2
      Loading Environment from EXT4... OK

      U-Boot 2019.04-dirty (Mar 10 2020 - 21:03:25 -0400)

      CPU : AM335X-GP rev 2.1
      I2C: ready
      DRAM: 512 MiB

      That’s it. After a couple of power cycles, I got nothing again.

      I am hoping the final data dump helps provide some insight into what may be happening here. We have some boards in the field and more to ship, and want to make sure we have a robust product.

      Thanks for your help!

      Matt

    • #11335
      Neeraj Dantu
      Moderator

      Hey Matt,

      It looks like the fault is intermittent. Most likely this is a mechanical issue. Please look for any hardware connectivity issues between the SiP pins and the PCB. Cold solder joints, for example, can cause unreliable connections and could get progressively worse with time. We recommend doing an X-ray on the boards to determine whether there is a connectivity issue.

      Please take a look at sections 9.1, 9.2 and 9.3 of the datasheet (https://octavosystems.com/docs/osd335x-sm-datasheet/) to make sure you have the correct landing pad sizes and reflow profiles.

      Please let us know when you have an update.

      Best,

      Neeraj

    • #12353
      Matthew Sommerfield (msommerfield)
      Participant

      So an update on this…

      The board boots fine now. I think there was an issue with the power cable connector pin crimp, but that seems to be a solid connection now, so a mechanical root cause for the failure to boot seems correct.

      It now seems to have migrated to an issue where it boots and runs fine for a couple of hours, then completely shuts down. Before it does, however, it sends the following message over the debug port:

      [ 6335.602430] musb-hdrc musb-hdrc.1: VBUS_ERROR in a_wait_vrise (88, <AValid), retry #3, port1 0003010c

      At this point, it powers off and I need to power cycle the power supply and then it will reboot fine.

      Can you help me understand what this means and what might be causing it?

      Matt

    • #12358
      Neeraj Dantu
      Moderator

      Matt,

      From https://forum.beagleboard.org/t/uart-to-usb-adapter-causes-usb-port-to-fail/2115, it could be that you are drawing too much power from the USB port, resulting in a shutdown. Can you check your USB devices, and maybe remove them to see if you get the same error?
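When comparing runs with and without USB devices attached, it can help to pull the VBUS_ERROR events out of a captured debug-port log so the uptime at each event is easy to see. The sketch below is a minimal example: the regular expression is inferred only from the message format quoted in this thread, not from kernel documentation, and the function and field names are hypothetical.

```python
import re

# Pattern inferred from the log lines quoted in this thread (an assumption,
# not an official format description).
VBUS_ERROR_RE = re.compile(
    r"\[\s*(?P<uptime>\d+\.\d+)\]\s+musb-hdrc\s+(?P<device>\S+):\s+"
    r"VBUS_ERROR in (?P<state>\w+) \((?P<status_hex>[0-9a-fA-F]+), "
    r"(?P<vbus>[^),]+)\)?, retry #(?P<retry>\d+), port1 (?P<port1>[0-9a-fA-F]+)"
)


def find_vbus_errors(lines):
    """Return one dict per VBUS_ERROR line found in an iterable of log lines."""
    events = []
    for line in lines:
        match = VBUS_ERROR_RE.search(line)
        if match:
            events.append(
                {
                    "uptime_s": float(match.group("uptime")),  # seconds since boot
                    "state": match.group("state"),             # e.g. a_wait_bcon
                    "retry": int(match.group("retry")),
                    "port1": match.group("port1"),
                }
            )
    return events
```

Passing an open log file to `find_vbus_errors` works directly, since a file object iterates line by line.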

      Best,
      Neeraj
