DDR Memory Organization
Greetings, We are utilizing an OSD32MP15x module for one of our products. We have prototyped most of the code on the STM32MP157-DK2 system. In analyzing the differences in performance between the 800MHz DK2 and 650MHz OSD platform, we noticed some inconsistencies that we are trying to explain. We know that the DK2 kit uses a x16 DDR interface (assuming at 533MHz, though we have no specifics on the CL latencies). Assuming that the OSD runs the DDR interface at the same clock rate, does the OSD use a x32 DDR interface hookup? Also, what are the CL latencies of the RAM devices being used?
Thank You for your reply, we are grateful.
Regards,
Stephen Beckwith
Stephen,
OSD32MP15x-512M uses x16 DDR. The DDR configuration file can be found here: https://github.com/octavosystems/OSD32MP1-RED-Device-tree/blob/main/tf-a-v2.4-r0/osd32mp1_ddr_1x4Gb.dtsi. This was generated using ST’s CubeMX tool and can be used with OpenSTLinux. Please take a look at https://github.com/octavosystems/meta-octavo-osd32mp1 for a Yocto-compatible meta layer for the OSD32MP1-RED.
Best,
Neeraj
Information about latency:
The DRAM device used on the STM32MP157C-DK2 and STM32MP157F-DK2 boards is Micron MT41K256M16TW-107:P. This is an 1866MT/s-grade device being run at 1066 speed. With the provided configurations, both the DK2 boards and Octavo’s boards use conservative 8-8-8 (CL=8) timings at 533MHz, which produces an absolute CAS latency of 15 ns.
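The 15 ns figure follows directly from the cycle count and the DDR clock. A minimal sketch of that arithmetic, assuming the 533 MHz clock and CL=8 quoted above (the function name is only illustrative):

```python
def cas_latency_ns(cl_cycles: int, clock_mhz: float) -> float:
    """Absolute CAS latency in ns: number of cycles divided by the clock frequency."""
    return cl_cycles / clock_mhz * 1000.0

# CL=8 at a 533 MHz DDR clock, as quoted in the post above.
latency = cas_latency_ns(8, 533.0)
print(f"CL=8 @ 533 MHz -> {latency:.1f} ns")
```

The same function can be used to compare other CL/clock pairs if the configuration is ever changed.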
(If you diff the DDR configuration file for the DK2 boards against Octavo’s configuration, you can see that the only differences are some PHY tuning parameters. ST AN5168 has more details on how this works if you’re interested.)
Considering this, I wouldn’t expect to see huge memory performance differences, so it could be quite interesting to find the root cause of your performance issue if you suspected it to be memory-related. Were you seeing worse or better performance on the 650MHz OSD32MP1?
Neeraj;
Thank you for your reply. That’s what I was looking for. So the OSD32 uses the same x16 interface as the STM32MP-DK2. Combined with the information from Aedan on the latency, the situation now gets “stickier”. . . .
Per your comment Aedan:
– We are not seeing a performance decrease; rather, we are seeing a performance increase on the OSD32, which is clocked ~20% slower than the DK2. I was able to “force” the DK2 to run at 800MHz (verified by checking /sys/devices/system/cpu/cpufreq/policy0/cpuinfo_cur_freq). Initial testing revealed that the DK2 was throttling between 400 and 800 MHz, whereas the OSD seems to be “fixed” at 650MHz:
cat /sys/devices/system/cpu/cpufreq/policy0/cpuinfo_cur_freq
650000
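A single `cat` only shows one instant. To see whether a board is really pinned at one frequency or is moving between operating points during a test, the same sysfs file can be polled and tallied. This is a rough sketch, not part of the original test setup: the sample count and interval are arbitrary, and reading `cpuinfo_cur_freq` typically requires root.

```python
import time
from collections import Counter
from pathlib import Path

# Path taken from the post above; reading cpuinfo_cur_freq usually needs root.
FREQ_FILE = Path("/sys/devices/system/cpu/cpufreq/policy0/cpuinfo_cur_freq")

def read_khz(path=FREQ_FILE):
    """Return the current CPU frequency in kHz as reported by cpufreq."""
    return int(Path(path).read_text().strip())

def sample(read, count, interval_s):
    """Call read() `count` times and tally how often each frequency was seen."""
    seen = Counter()
    for _ in range(count):
        seen[read()] += 1
        time.sleep(interval_s)
    return seen

if __name__ == "__main__":
    try:
        histogram = sample(read_khz, count=20, interval_s=0.05)
    except OSError as err:
        print(f"could not read {FREQ_FILE}: {err}")
    else:
        for khz, hits in sorted(histogram.items()):
            print(f"{khz / 1000:.0f} MHz: {hits} samples")
```

Running it alongside the workload on both boards would show whether the DK2 stays at 800 MHz for the whole capture.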
For the same test conditions on both platforms we are seeing:
DK2 Kit: Median CPU load: 20.9800%
OSD32: Median CPU load: 18.7000%
This is based upon a 30-minute capture of sysstat data. It goes against “logic” that a system clocked ~20% slower would actually show an increase in performance (less load). I would have expected the OSD32 load to be ~25% when you “do the math”. A slower CPU clock would mean things take longer and would put more “pressure” on the system, not less.
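The “math” here can be written out explicitly. The sketch below assumes the workload is purely CPU-bound, needs the same number of CPU cycles on both boards, and that the DK2 really ran at 800 MHz for the whole capture; none of these is confirmed in the thread.

```python
def expected_load_pct(load_pct: float, f_ref_mhz: float, f_new_mhz: float) -> float:
    """Scale a CPU load figure by the clock ratio, assuming a fixed cycle count."""
    return load_pct * f_ref_mhz / f_new_mhz

dk2_measured = 20.98   # median load on the DK2 at 800 MHz (from the post)
osd_measured = 18.70   # median load on the OSD32 at 650 MHz (from the post)

osd_predicted = expected_load_pct(dk2_measured, 800.0, 650.0)
print(f"predicted OSD32 load: {osd_predicted:.1f}%")
print(f"measured  OSD32 load: {osd_measured:.1f}%")
print(f"gap: {osd_predicted - osd_measured:.1f} percentage points")
```

If any of the assumptions does not hold (for example, if the DK2 spent part of the capture at 400 MHz), the prediction changes accordingly.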
A x32 memory interface with better timing might be a good explanation for this, however, if you are saying that the OSD has the same x16 interface, then I’m somewhat beside myself as to why this is the case.
Given that the OSD is “all inside” the chip, and doesn’t have to deal with the DDR->Phy->bondout->pin->PCB->DDR pin->DDR Device path, I would think you “could” run this interface quicker. Also, why not use a x32 interface?
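For scale, the theoretical peak bandwidth of the two bus widths can be compared directly. This is a back-of-the-envelope sketch assuming 1066 MT/s (a 533 MHz clock, two transfers per cycle); the x32 case is hypothetical and shown only for comparison, and sustained throughput is always lower than these peaks.

```python
def peak_bandwidth_mb_s(transfers_mt_s: float, bus_width_bits: int) -> float:
    """Theoretical peak bandwidth in MB/s: transfers per second times bytes per transfer."""
    return transfers_mt_s * bus_width_bits / 8

x16 = peak_bandwidth_mb_s(1066, 16)   # the configuration discussed in this thread
x32 = peak_bandwidth_mb_s(1066, 32)   # hypothetical, for comparison only
print(f"x16: {x16:.0f} MB/s, x32: {x32:.0f} MB/s")
```

A peak-bandwidth figure alone says nothing about CPU load; it only bounds how fast memory traffic could move.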
Any thoughts? Any suggestions as to where to look for differences?
Thank you for your speedy reply.
Regards,
Stephen Beckwith
Stephen,
This is unexpected behavior. We would expect to see similar performance to a discrete MP1 implementation.
CPU load depends on which applications are running. There are some differences between the DK2 and the RED board in terms of package configuration. Please take a look at the “top” output to see what the difference is. Also review your clock tree (not just the CPU clock). See https://wiki.st.com/stm32mpu/wiki/Clock_overview for how to view clk_summary.
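One way to review the clock tree is to capture `/sys/kernel/debug/clk/clk_summary` on each board (debugfs must be mounted, and root is normally required) and compare the two dumps by clock name. The sketch below makes no assumption about which column holds the rate, since the column layout varies between kernel versions; it simply reports clocks whose columns differ. The sample dumps and the clock names in them are made up for illustration and are not real captures.

```python
def parse_clk_summary(text):
    """Map clock name -> list of remaining columns, skipping the header block."""
    lines = text.splitlines()
    # Skip everything up to and including the dashed separator, if present.
    for i, line in enumerate(lines):
        if line.lstrip().startswith("---"):
            lines = lines[i + 1:]
            break
    clocks = {}
    for line in lines:
        fields = line.split()
        if fields:
            clocks[fields[0]] = fields[1:]
    return clocks

def diff_clocks(a, b):
    """Return {name: (columns_in_a, columns_in_b)} for clocks that differ."""
    return {name: (a.get(name), b.get(name))
            for name in sorted(set(a) | set(b))
            if a.get(name) != b.get(name)}

# Illustrative, made-up dumps (NOT real captures from either board).
dk2_dump = """\
   clock    enable_cnt  prepare_cnt  rate
-----------------------------------------
 ck_hse     1  1  24000000
    pll1_p  1  1  800000000
"""
osd_dump = """\
   clock    enable_cnt  prepare_cnt  rate
-----------------------------------------
 ck_hse     1  1  24000000
    pll1_p  1  1  650000000
"""

for name, (dk2, osd) in diff_clocks(parse_clk_summary(dk2_dump),
                                    parse_clk_summary(osd_dump)).items():
    print(f"{name}: DK2={dk2} OSD={osd}")
```

On real dumps, enable and prepare counts will also show up as differences, so it may help to compare only the column that holds the rate once its position is known for the kernel in use.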
Best,
Neeraj
Neeraj;
Thank you for your feedback. The DK2 system has a few other applications loaded, but “pidstat” shows they are either not running or consuming only a fractional percentage of CPU time. I will be doing a “deep dive” next week on the data I’ve collected using the “sysstat” utilities. Thanks for the pointer on the clocks; I will take a closer look-see.
Thank You.
Regards,
Stephen Beckwith
According to previous info from Octavo, the OSD32MP157C uses the same DRAM device as the STM32MP157C-DK2 and STM32MP157F-DK2: Micron MT41K256M16TW-107:P. The clock is indeed 533MHz, and the interface is x16. Both the DK2 boards and Octavo’s board support appear to use 8-8-8 timings (CL=8), which puts the absolute CAS latency at 15 ns. Since the SiP should maintain this same DDR choice as the DK2, I wouldn’t expect to see huge memory performance differences.
DK2 configuration: https://github.com/u-boot/u-boot/blob/master/arch/arm/dts/stm32mp15-ddr3-1x4Gb-1066-binG.dtsi
OSD32MP1 configuration: https://github.com/octavosystems/OSD32MP1-RED-Device-tree/blob/main/u-boot-v2020.10-r0/stm32mp15-osd32mp1-ddr3-1x4Gb.dtsi
You’ll notice the two are identical with the exception of some PHY tuning parameters near the end. For more information on the details, see ST AN5168.
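A quick way to check this comparison yourself is to extract the `#define NAME VALUE` lines from both files and list the names whose values differ. The following is a minimal sketch that assumes the two .dtsi files consist mainly of such `#define` lines; the inline samples use placeholder names and values and are not copied from either configuration.

```python
import re

DEFINE_RE = re.compile(r"^\s*#define\s+(\w+)\s+(.+?)\s*$")

def parse_defines(text):
    """Collect `#define NAME VALUE` lines into a dict of NAME -> VALUE."""
    defines = {}
    for line in text.splitlines():
        match = DEFINE_RE.match(line)
        if match:
            defines[match.group(1)] = match.group(2)
    return defines

def diff_defines(a, b):
    """Return {name: (value_in_a, value_in_b)} for every name whose value differs."""
    return {name: (a.get(name), b.get(name))
            for name in sorted(set(a) | set(b))
            if a.get(name) != b.get(name)}

# Placeholder content for illustration only (values are NOT from the real files).
dk2_text = """
#define DDR_MSTR 0x00041401
#define DDR_DX0DQTR 0xFFFFFFFF
"""
osd_text = """
#define DDR_MSTR 0x00041401
#define DDR_DX0DQTR 0x11111111
"""

for name, (dk2, osd) in diff_defines(parse_defines(dk2_text),
                                     parse_defines(osd_text)).items():
    print(f"{name}: DK2={dk2} OSD={osd}")
```

To use it on the real files, read each downloaded .dtsi with `open(path).read()` and pass the text to `parse_defines`.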