DDR Memory Organization

    • #13548
      Stephen Beckwith
      Participant

        Greetings, We are utilizing an OSD32MP15x module for one of our products.  We have prototyped most of the code on the STM32MP157-DK2 system. In analyzing the differences in performance between the 800MHz DK2 and 650MHz OSD platform, we noticed some inconsistencies that we are trying to explain.  We know that the DK2 kit uses a x16 DDR interface (assuming at 533MHz, though we have no specifics on the CL latencies).  Assuming that the OSD runs the DDR interface at the same clock rate, does the OSD use a x32 DDR interface hookup?  Also, what are the CL latencies of the RAM devices being used?

        Thank You for your reply, we are grateful.

        Regards,

        Stephen Beckwith

      • #13551
        Neeraj Dantu
        Moderator

          Stephen,

          The OSD32MP15x-512M uses x16 DDR. The DDR configuration file can be found here: https://github.com/octavosystems/OSD32MP1-RED-Device-tree/blob/main/tf-a-v2.4-r0/osd32mp1_ddr_1x4Gb.dtsi. It was generated using ST’s CubeMX tool and can be used with OpenSTLinux. Please take a look at https://github.com/octavosystems/meta-octavo-osd32mp1 for a Yocto-compatible meta layer for the OSD32MP1-RED.

          Best,

          Neeraj

        • #13552
          Aedan Cullen
          Participant

            Information about latency:

            The DRAM device used on the STM32MP157C-DK2 and STM32MP157F-DK2 boards is Micron MT41K256M16TW-107:P. This is an 1866MT/s-grade device being run at 1066 speed. With the provided configurations, both the DK2 boards and Octavo’s boards use conservative 8-8-8 (CL=8) timings at 533MHz, which produces an absolute CAS latency of 15 ns.
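The 15 ns figure follows directly from the cycle count and the clock period. As a quick check, here is a minimal sketch; the function names are made up for illustration, and the bandwidth number is only the theoretical peak for a given bus width, not a measured value.

```python
def cas_latency_ns(cl_cycles: int, clock_mhz: float) -> float:
    """Absolute CAS latency in ns: CL cycles multiplied by the clock period."""
    return cl_cycles * 1000.0 / clock_mhz


def peak_bandwidth_mb_s(clock_mhz: float, bus_width_bits: int) -> float:
    """Theoretical peak DDR bandwidth: two transfers per clock, width/8 bytes each."""
    return 2 * clock_mhz * bus_width_bits / 8


# CL=8 at a 533 MHz DDR clock, on a x16 interface
print(f"CAS latency: {cas_latency_ns(8, 533):.1f} ns")
print(f"x16 peak bandwidth: {peak_bandwidth_mb_s(533, 16):.0f} MB/s")
```

With identical CL, clock, and bus width on both platforms, both numbers come out the same for the DK2 and the OSD32MP1.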

            (If you diff the DDR configuration file for the DK2 boards against Octavo’s configuration, you can see that the only differences are some PHY tuning parameters. ST AN5168 has more details on how this works if you’re interested.)

            Considering this, I wouldn’t expect to see huge memory performance differences, so it could be quite interesting to find the root cause of your performance issue if you suspected it to be memory-related. Were you seeing worse or better performance on the 650MHz OSD32MP1?

          • #13554
            Stephen Beckwith
            Participant

              Neeraj;

              Thank you for your reply. That’s what I was looking for. So the OSD32 uses the same x16 interface as the STM32MP-DK2. Combined with the information from Aedan on the latency, the situation now gets “stickier”. . . .

              Per your comment Aedan:

              – We are not seeing a performance decrease; rather, we are seeing a performance increase on the OSD32, which is clocked ~20% slower than the DK2. I was able to “force” the DK2 to run at 800MHz (verified by checking /sys/devices/system/cpu/cpufreq/policy0/cpuinfo_cur_freq). Initial testing revealed that the DK2 was throttling between 400 and 800 MHz, whereas the OSD seems to be “fixed” at 650MHz:

              cat /sys/devices/system/cpu/cpufreq/policy0/cpuinfo_cur_freq

              650000
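A single read of cpuinfo_cur_freq only shows the frequency at that instant. To see how long a throttling CPU actually spent at each frequency over a test run, the cpufreq statistics can be averaged. The sketch below assumes the kernel exposes a `stats/time_in_state` file under the same policy0 directory (this needs cpufreq statistics support and may not be present on every image) and that each line has the form `<frequency_kHz> <time>`. The function names are hypothetical.

```python
def average_freq_khz(time_in_state_text: str) -> float:
    """Time-weighted average frequency from cpufreq 'time_in_state' content.

    Each usable line is '<frequency_khz> <time_spent>'; other lines are skipped.
    """
    total_time = 0
    weighted_sum = 0
    for line in time_in_state_text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue
        freq_khz, time_spent = int(parts[0]), int(parts[1])
        weighted_sum += freq_khz * time_spent
        total_time += time_spent
    if total_time == 0:
        raise ValueError("no residency data found")
    return weighted_sum / total_time


def read_average_freq_khz(
    path: str = "/sys/devices/system/cpu/cpufreq/policy0/stats/time_in_state",
) -> float:
    """Read the (assumed) sysfs file and return the average frequency in kHz."""
    with open(path) as f:
        return average_freq_khz(f.read())
```

Calling `read_average_freq_khz()` on each board after the same test run would show whether the DK2 really averaged 800MHz or spent part of the capture at 400MHz.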

              For the same test conditions on both platforms we are seeing:

              DK2 Kit:  Median CPU load:  20.9800%

              OSD32:  Median CPU load:  18.7000%

              This is based upon a 30-minute capture of sysstat data. It goes against “logic” that a system that is clocked ~20% slower would actually show an increase in performance (less load). I would have expected the OSD32 to be ~25% when you “do the math”. A slower CPU clock would mean things take longer and would put more “pressure” on the system, not less.
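The “math” here can be spelled out. This is a minimal sketch that assumes the workload is purely CPU-clock-bound, so load scales inversely with the CPU frequency; that assumption is exactly what the measurements call into question. The function name is made up for illustration.

```python
def expected_load(load_pct: float, from_mhz: float, to_mhz: float) -> float:
    """Scale a CPU load percentage, assuming work is purely CPU-clock-bound."""
    return load_pct * from_mhz / to_mhz


dk2_load = 20.98      # median CPU load measured on the DK2 at 800 MHz
osd_measured = 18.70  # median CPU load measured on the OSD32 at 650 MHz

osd_expected = expected_load(dk2_load, from_mhz=800, to_mhz=650)
print(f"expected {osd_expected:.1f}%, measured {osd_measured:.1f}%")
```

Under that assumption the OSD32 would be expected to sit near 25.8%, roughly 7 percentage points above what was measured.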

              A x32 memory interface with better timing might be a good explanation for this. However, if you are saying that the OSD has the same x16 interface, then I’m at a loss as to why this is the case.

              Given that the OSD is “all inside” the chip, and doesn’t have to deal with the DDR->Phy->bondout->pin->PCB->DDR pin->DDR Device path, I would think you “could” run this interface quicker. Also, why not use a x32 interface?

              Any thoughts?  Any suggestions as to where to look for differences?

              Thank you for your speedy reply.

              Regards,

              Stephen Beckwith

            • #13557
              Neeraj Dantu
              Moderator

                Stephen,

                This is unexpected behavior. We would expect to see performance similar to a discrete MP1 implementation.

                CPU load depends on which applications are running. There are some differences between the DK2 and the RED board in terms of package configuration. Please take a look at the “top” output to see what the difference is. Also, please review your whole clock tree (not just the CPU clock). See https://wiki.st.com/stm32mpu/wiki/Clock_overview for how to view clk_summary.
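One way to compare the clock trees is to save the clk_summary text from each board and list the clocks whose rates differ. The sketch below is only an illustration: the column layout of clk_summary varies between kernel versions, so the position of the rate column (`rate_col`) is an assumption that should be checked against the header of your own dump. The function names and the clock names and values in the sample are made up.

```python
from typing import Dict, Tuple


def parse_clk_rates(summary_text: str, rate_col: int = 4) -> Dict[str, int]:
    """Map clock name -> rate in Hz from clk_summary-style text.

    rate_col is the whitespace-separated column holding the rate (assumed).
    Header and separator lines are skipped because that column is not numeric.
    """
    rates: Dict[str, int] = {}
    for line in summary_text.splitlines():
        tokens = line.split()
        if len(tokens) <= rate_col or not tokens[rate_col].isdigit():
            continue
        rates[tokens[0]] = int(tokens[rate_col])
    return rates


def diff_clk_rates(a: Dict[str, int], b: Dict[str, int]) -> Dict[str, Tuple[int, int]]:
    """Clocks present in both dumps whose rates differ."""
    return {n: (a[n], b[n]) for n in sorted(a.keys() & b.keys()) if a[n] != b[n]}


# Made-up sample rows: name, enable, prepare, protect, rate, accuracy, phase, duty
board_a = "   clk_a  1 1 0 650000000 0 0 50000\n   clk_b  1 1 0 266500000 0 0 50000\n"
board_b = "   clk_a  1 1 0 800000000 0 0 50000\n   clk_b  1 1 0 266500000 0 0 50000\n"
print(diff_clk_rates(parse_clk_rates(board_a), parse_clk_rates(board_b)))
```

Any bus or peripheral clock that shows up in the result, beyond the expected CPU clock difference, would be a candidate explanation for the load difference.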

                Best,

                Neeraj

              • #13561
                Stephen Beckwith
                Participant

                  Neeraj;

                  Thank you for your feedback. The DK2 system has a few other applications loaded, but “pidstat” shows they are either not running or consuming only a fraction of a percent of CPU time. I will be doing a “deep dive” next week on the data I’ve collected using the “sysstat” utilities. Thanks for the pointer on the clocks; I will take a closer look.

                  Thank You.

                   

                  Regards,

                  Stephen Beckwith

                • #13549
                  Aedan Cullen
                  Participant

                    According to previous info from Octavo, the OSD32MP157C uses the same DRAM device as the STM32MP157C-DK2 and STM32MP157F-DK2: Micron MT41K256M16TW-107:P. The clock is indeed 533MHz, and the interface is x16. Both the DK2 boards and Octavo’s board support appear to use 8-8-8 timings (CL=8), which puts the absolute CAS latency at 15 ns. Since the SiP should maintain this same DDR choice as the DK2, I wouldn’t expect to see huge memory performance differences.

                    DK2 configuration: https://github.com/u-boot/u-boot/blob/master/arch/arm/dts/stm32mp15-ddr3-1x4Gb-1066-binG.dtsi

                    OSD32MP1 configuration: https://github.com/octavosystems/OSD32MP1-RED-Device-tree/blob/main/u-boot-v2020.10-r0/stm32mp15-osd32mp1-ddr3-1x4Gb.dtsi

                    You’ll notice the two are identical with the exception of some PHY tuning parameters near the end. For more information on the details, see ST AN5168.
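To make that comparison repeatable, the two files can be diffed by setting name rather than by line. This is a rough sketch that assumes the settings in both files are written as `#define NAME VALUE` lines; the function names are hypothetical, and the names and values in the sample are placeholders, not real register settings.

```python
import re
from typing import Dict, Optional, Tuple

DEFINE_RE = re.compile(r"^\s*#define\s+(\w+)\s+(.+?)\s*$")


def parse_defines(dtsi_text: str) -> Dict[str, str]:
    """Collect '#define NAME VALUE' pairs from the text of a .dtsi file."""
    defines: Dict[str, str] = {}
    for line in dtsi_text.splitlines():
        match = DEFINE_RE.match(line)
        if match:
            defines[match.group(1)] = match.group(2)
    return defines


def diff_defines(
    a: Dict[str, str], b: Dict[str, str]
) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
    """Names whose values differ, or that appear in only one file (None)."""
    names = sorted(set(a) | set(b))
    return {n: (a.get(n), b.get(n)) for n in names if a.get(n) != b.get(n)}


# Placeholder content standing in for the two downloaded files
dk2_text = "#define REG_A 0x1\n#define PHY_TUNE_B 0x2\n"
osd_text = "#define REG_A 0x1\n#define PHY_TUNE_B 0x3\n"
print(diff_defines(parse_defines(dk2_text), parse_defines(osd_text)))
```

Run against the two real files, the output should contain only the PHY tuning entries if the configurations otherwise match.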

                     
