OSDMP157C: DCMI overruns triggered by SPI5 transfers when DMA is enabled
Hi,
We’re using DCMI (80MHz pixel clock, 12-bit) to connect a custom sensor (2048×256) and we can’t afford to miss any frames.
The DCMI is assigned to CA7 cores, stock Linux driver stm32-dcmi is used.
SPI5, assigned to CA7 cores as well, is used to send data to a mipi-dbi TFT screen (15MHz, 160×80).
Everything is working like a charm when SPI5 uses interrupt mode.
However, we’d like to use DMA mode for SPI5 to reduce CPU consumption: unfortunately, when using DMA a screen refresh triggers DCMI overrun errors.
I tried to change DCMI’s DMA priority to very high: no improvement.
Any advice/suggestion?
As I’m a bit desperate, I tried SPI6 instead of SPI5, because this SPI instance seems to be closer to the CA7 cores than SPI5:
– SPI6 in Interrupt mode: TFT screen is working
– SPI6 in DMA mode: TFT screen not working and MDMA driver reports BSE (Block Size Error).
How can I solve this transfer error?
Thanks in advance,
Sylvain.
Please, find below dmesg output that shows the BSE error:
[ 4927.476761] spi_stm32 5c001000.spi: cpol=0 cpha=0 lsb_first=0 cs_high=0
[ 4927.476777] spi_stm32 5c001000.spi: stm32_spi_can_dma: true
[ 4927.476817] spi_stm32 5c001000.spi: stm32_spi_can_dma: true
[ 4927.476833] spi_stm32 5c001000.spi: transfer communication mode set to 1
[ 4927.476848] spi_stm32 5c001000.spi: data frame of 16-bit, data packet of 2 data frames
[ 4927.476862] spi_stm32 5c001000.spi: speed set to 8328125Hz
[ 4927.476876] spi_stm32 5c001000.spi: transfer of 25600 bytes (12800 data frames)
[ 4927.476888] spi_stm32 5c001000.spi: dma enabled
[ 4927.476904] spi_stm32 5c001000.spi: Tx DMA config buswidth=2, maxburst=1
[ 4927.476934] dma dma0chan8: hwdesc:  0xd8000000
[ 4927.476948] dma dma0chan8: CTCR:    0x02000042
[ 4927.476961] dma dma0chan8: CBNDTR:  0x00006400
[ 4927.476974] dma dma0chan8: CSAR:    0xd8048000
[ 4927.476987] dma dma0chan8: CDAR:    0x5c001020
[ 4927.477000] dma dma0chan8: CBRUR:   0x00000000
[ 4927.477013] dma dma0chan8: CLAR:    0x00000000
[ 4927.477026] dma dma0chan8: CTBR:    0x00000023
[ 4927.477038] dma dma0chan8: CMAR:    0x00000000
[ 4927.477051] dma dma0chan8: CMDR:    0x00000000
[ 4927.477051]
[ 4927.477075] dma dma0chan8: vchan fbf730a5: issued
[ 4927.477090] dma dma0chan8: CCR:     0x00000006
[ 4927.477104] dma dma0chan8: CTCR:    0x02000042
[ 4927.477117] dma dma0chan8: CBNDTR:  0x00006400
[ 4927.477130] dma dma0chan8: CSAR:    0xd8048000
[ 4927.477143] dma dma0chan8: CDAR:    0x5c001020
[ 4927.477156] dma dma0chan8: CBRUR:   0x00000000
[ 4927.477169] dma dma0chan8: CLAR:    0x00000000
[ 4927.477182] dma dma0chan8: CTBR:    0x00000023
[ 4927.477194] dma dma0chan8: CMAR:    0x00000000
[ 4927.477207] dma dma0chan8: CMDR:    0x00000000
[ 4927.477221] dma dma0chan8: vchan fbf730a5: started
[ 4927.477235] spi_stm32 5c001000.spi: enable controller
[ 4927.477259] dma dma0chan8: Transfer Err: stat=0x00000880
[ 4927.734857] st7735s spi1.0: SPI transfer timed out
[ 4927.738283] spi_stm32 5c001000.spi: stm32_spi_can_dma: true
[ 4927.738307] spi_stm32 5c001000.spi: disable controller
[ 4927.738354] spi_master spi1: failed to transfer one message from queue
Do you have readouts of those MDMA registers from a successful 25600-byte transfer on SPI5?
Aedan, below is the dmesg output from a successful 25600-byte transfer on SPI5 (DMA):
[  291.136104] spi_stm32 44009000.spi: cpol=0 cpha=0 lsb_first=0 cs_high=0
[  291.136119] spi_stm32 44009000.spi: stm32_spi_can_dma: true
[  291.136158] spi_stm32 44009000.spi: stm32_spi_can_dma: true
[  291.136174] spi_stm32 44009000.spi: transfer communication mode set to 1
[  291.136189] spi_stm32 44009000.spi: data frame of 16-bit, data packet of 2 data frames
[  291.136202] spi_stm32 44009000.spi: speed set to 13054870Hz
[  291.136216] spi_stm32 44009000.spi: transfer of 25600 bytes (12800 data frames)
[  291.136228] spi_stm32 44009000.spi: dma enabled
[  291.136244] spi_stm32 44009000.spi: Tx DMA config buswidth=2, maxburst=1
[  291.136282] stm32-dma 48000000.dma-controller: vchan a624546f: txd d3e6b206[316]: submitted
[  291.136302] dma dma1chan4: vchan a624546f: issued
[  291.136318] dma dma1chan4: SCR:   0x00002c56
[  291.136332] dma dma1chan4: NDTR:  0x00003200
[  291.136345] dma dma1chan4: SPAR:  0x44009020
[  291.136358] dma dma1chan4: SM0AR: 0xd8048000
[  291.136370] dma dma1chan4: SM1AR: 0xd8048000
[  291.136383] dma dma1chan4: SFCR:  0x00000021
[  291.136397] dma dma1chan4: vchan a624546f: started
[  291.136410] spi_stm32 44009000.spi: enable controller
[  291.152230] spi_stm32 44009000.spi: disable controller
[  291.152336] spi_stm32 44009000.spi: stm32_spi_can_dma: true
[  291.152363] spi_stm32 44009000.spi: disable controller
Here’s the next thing I’d try:
In your device tree node for SPI6, set the dmas property as follows:
dmas = <&mdma1 34 0x0 0x40008 0x0 0x0 0x0>,
       <&mdma1 35 0x0 0x10040002 0x0 0x0 0x0>;
Compared to stm32mp151.dtsi, which has 0x40002 as the third parameter in the TX DMA, the value 0x10040002 should enable block transfer mode. If this doesn’t work (we might need to tweak more things for the block transfer to complete successfully), send the error/register dump from dmesg again. (Unfortunately I don’t have a similar SPI display lying around to actually test on.)
Full thought process, starting with what I think caused the overruns with your original SPI5 configuration:
– DCMI is always going to be using one of the standard DMA controllers, DMA1 or DMA2 (selected by the DMAMUX) [1].
– SPI5 will also be automatically assigned a stream by the DMAMUX driver in the same way [2]. Most likely, not all eight streams of DMA1 are occupied yet, and so both DCMI and SPI5 end up on the same DMA1 controller.
– DMA1/DMA2 only arbitrate between streams once a request is completely finished [3]. That is, DMA1 will wait until your 26 KB display frame is completed before possibly servicing the DCMI, even if the DCMI stream has a higher priority.
– Probably, the display frame transfer takes long enough over SPI5 that a DCMI frame is guaranteed to start before the display frame transfer is completed. Check whether this is true with whatever framerate you use for your image sensor.
– So then the DCMI FIFO is guaranteed to overflow anytime you transfer a display frame, regardless of priority.
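As a rough check of the timing argument in the list above, here is a small sketch that compares the duration of one display frame on SPI5 with the duration of one sensor frame. The byte count and SPI clock come from the dmesg log, and the sensor geometry and pixel clock from the first post. The assumptions are mine: one pixel per pixel clock cycle, no blanking intervals, and no gaps between SPI frames, so the real sensor frame period will be somewhat longer.

```python
# Rough timing check; inputs come from the thread, simplifications are assumptions.
SPI_BYTES = 25600                 # one display frame (160 x 80 x 16 bit), from the dmesg log
SPI_CLOCK_HZ = 13_054_870         # SPI5 clock reported by spi_stm32 in the log
SENSOR_WIDTH, SENSOR_HEIGHT = 2048, 256
PIXEL_CLOCK_HZ = 80_000_000       # DCMI pixel clock from the first post


def spi_frame_time_s(n_bytes: int, clock_hz: int) -> float:
    """Time to shift n_bytes out over SPI, ignoring any inter-frame gaps."""
    return n_bytes * 8 / clock_hz


def sensor_frame_time_s(width: int, height: int, pixel_clock_hz: int) -> float:
    """Active time of one sensor frame, assuming one pixel per clock and no blanking."""
    return width * height / pixel_clock_hz


spi_s = spi_frame_time_s(SPI_BYTES, SPI_CLOCK_HZ)
sensor_s = sensor_frame_time_s(SENSOR_WIDTH, SENSOR_HEIGHT, PIXEL_CLOCK_HZ)

print(f"SPI display frame: {spi_s * 1e3:.2f} ms")
print(f"Sensor frame:      {sensor_s * 1e3:.2f} ms")
print(f"Ratio:             {spi_s / sensor_s:.1f}")
```

Under these assumptions the display transfer takes roughly 15.7 ms, which agrees with the 15.8 ms between "enable controller" and "disable controller" in the SPI5 log, while one sensor frame takes about 6.6 ms. If the sensor streams frames back to back, more than two sensor frames arrive during a single display refresh.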
So we either need SPI display frame transfers to be much faster, or we need to use a separate DMA controller. Incidentally, your SPI6 configuration achieves exactly that and uses MDMA instead [1]. (The separate DMA controller is what we need, not proximity to the ARM cores [4].)
MDMA breaks because TRGM[1:0] is left as 00 in CTCR (as if to configure a buffer transfer) but the data length is set in CBNDTR (for a block transfer). The TLEN (for buffer length) is left as the default one byte, which can’t be packed into a 16-bit write to the peripheral, so you get the BSE error. The MDMA driver seems to deal with the data length registers properly (you must use a block transfer above 128 bytes), but it doesn’t automatically switch to a block transfer in CTCR. In the device tree property above, I’m setting TRGM[1:0] in CTCR to 01 instead of 00. More debugging might be required after this, but you definitely need the MDMA to be in block transfer mode.
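To make the register reasoning above easier to follow, here is a minimal sketch that decodes the values seen in this thread. The bit positions are an assumption based on my reading of the CTCR and CESR macros in the Linux stm32-mdma.c driver, so check them against the reference manual before relying on them. The helper names (decode, describe_ctcr) are made up for this example.

```python
# Field layout: (shift, width). Assumed from the stm32-mdma.c register macros,
# not verified on hardware.
CTCR_FIELDS = {
    "SINC": (0, 2), "DINC": (2, 2), "SSIZE": (4, 2), "DSIZE": (6, 2),
    "SBURST": (12, 3), "DBURST": (15, 3), "TLEN": (18, 7),
    "PKE": (25, 1), "TRGM": (28, 2), "SWRM": (30, 1),
}
CESR_FIELDS = {
    "TEA": (0, 7), "TED": (7, 1), "TELD": (8, 1),
    "TEMD": (9, 1), "ASE": (10, 1), "BSE": (11, 1),
}
TRGM_NAMES = {0: "buffer", 1: "block", 2: "repeated block", 3: "linked list"}


def decode(value, fields):
    """Split a 32-bit register value into its named bit fields."""
    return {name: (value >> shift) & ((1 << width) - 1)
            for name, (shift, width) in fields.items()}


def describe_ctcr(value):
    """Summarise the fields that matter for the BSE discussion."""
    f = decode(value, CTCR_FIELDS)
    return (f"0x{value:08x}: TRGM={f['TRGM']} ({TRGM_NAMES[f['TRGM']]} transfer), "
            f"buffer length={f['TLEN'] + 1} byte(s), "
            f"destination size={8 << f['DSIZE']}-bit")


print(describe_ctcr(0x02000042))   # CTCR from the failing SPI6 dump
print(describe_ctcr(0x10040002))   # config cell proposed for the SPI6 TX dmas entry
print(decode(0x00000880, CESR_FIELDS))  # "Transfer Err: stat=0x00000880"
```

With this layout, the failing CTCR value decodes to TRGM=0 with a one-byte buffer length and a 16-bit destination size, the proposed config cell decodes to TRGM=1, and the error status has the BSE bit set.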
[1] per stm32mp151.dtsi.
[2] line 110 of stm32-dmamux.c.
[3] section 2.2.3 of ST AN4031. F7, H7, and MP1 use the same DMA IP as far as I know.
[4] The A7 cores are not involved during DMA transfers between memory and a peripheral.
Octavo, is there some sort of allow-list that the forum has for trusted users so that I could post links without being caught by the spam filter? 🙂
Aedan, thank you for your very detailed answer, it’s fantastic. You are right about every point!
I cannot speed up the SPI display transfers, so I need to use a separate DMA controller; however, to avoid re-spinning the board, I’d like to continue using SPI5 and to use DMA2 for DCMI only.
The question is HOW to get DCMI to use DMA2?
I enabled DMA for I2C1 in the hope that DCMI would end up using DMA2… and it did, everything is working as expected: no more overruns!
New /sys/kernel/debug/dmaengine/summary output:
dma0 (58000000.dma-controller): number of channels: 32
 dma0chan0    | 48000000.dma-controller:ch0
 dma0chan1    | 48000000.dma-controller:ch1
 dma0chan2    | 48000000.dma-controller:ch2
 dma0chan3    | 48000000.dma-controller:ch3
 dma0chan4    | 48000000.dma-controller:ch4
 dma0chan5    | 48000000.dma-controller:ch5
 dma0chan6    | 48000000.dma-controller:ch6
 dma0chan7    | 48000000.dma-controller:ch7
 dma0chan8    | 48001000.dma-controller:ch0
 dma0chan9    | 48001000.dma-controller:ch1
 dma0chan10   | 48001000.dma-controller:ch2
 dma0chan11   | 48001000.dma-controller:ch3
 dma0chan12   | 48001000.dma-controller:ch4
 dma0chan13   | 48001000.dma-controller:ch5
 dma0chan14   | 48001000.dma-controller:ch6
 dma0chan15   | 48001000.dma-controller:ch7
dma1 (48000000.dma-controller): number of channels: 8
 dma1chan0    | 4000e000.serial:rx (via router: 48002000.dma-router)
 dma1chan1    | 4000e000.serial:tx (via router: 48002000.dma-router)
 dma1chan2    | 4000b000.spi:tx (via router: 48002000.dma-router)
 dma1chan3    | 4000b000.spi:rx (via router: 48002000.dma-router)
 dma1chan4    | 44009000.spi:tx (via router: 48002000.dma-router)
 dma1chan5    | 44009000.spi:rx (via router: 48002000.dma-router)
 dma1chan6    | 40012000.i2c:tx (via router: 48002000.dma-router)
 dma1chan7    | 40012000.i2c:rx (via router: 48002000.dma-router)
dma2 (48001000.dma-controller): number of channels: 8
 dma2chan1    | 4c006000.dcmi:tx
This is really good news! But I don’t like the hackish method I used to get DCMI to use DMA2… What is the recommended method?
I tried to add dmas = <&dma2 0 0 0x400 0xe0000001>; in my device tree node for DCMI to override dmas = <&dmamux1 75 0x400 0xe0000001>; from stm32mp151.dtsi.
According to /sys/kernel/debug/dmaengine/summary, DCMI then uses the DMA2 controller, but it does not work…
Thank you in advance,
Sylvain.
Awesome, I’m glad you got it working. I think the reason you can’t directly set &dma2 in the device tree is that DMAMUX configuration is needed to route the request signal from DCMI to DMA2 within the SoC, and that remains unconfigured if the device tree only references the DMA controller directly. You could potentially try changing only the SPI TX DMA to &dma2 to see if that’ll still work since the CPU is starting the transfer – I didn’t yet look closely to see whether it should.
In the stm32-dmamux.c driver, it definitely seems to be an oversight that there is no provision for manually choosing a controller. They just take the 16 total channels (eight from DMA1 followed by eight from DMA2) and assign them sequentially as you saw. I think ST may not have anticipated/tested DMA use for simultaneous video streams like this. The cleanest approach I can think of right now would be to improve the stm32-dmamux.c driver to add (optional) manual multiplexing control in the device tree.
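To illustrate why enabling DMA on I2C1 pushed DCMI onto DMA2, here is a simplified model of the sequential assignment described above. It is not the driver code: it assumes the DMAMUX always hands out the lowest free channel out of 16, in the order in which clients request them, and the request order below is a guess based on the dmaengine summary. On real hardware the exact channel numbers depend on probe order and on channels being released again, which this model ignores (the summary shows DCMI on DMA2 channel 1, not channel 0).

```python
CHANNELS_PER_CONTROLLER = 8
CONTROLLERS = ["DMA1", "DMA2"]   # the two controllers behind the DMAMUX


def assign_channels(requests):
    """Hand out channels in request order, always picking the lowest free index."""
    in_use = [False] * (CHANNELS_PER_CONTROLLER * len(CONTROLLERS))
    placement = {}
    for name in requests:
        index = in_use.index(False)   # raises ValueError once all 16 are taken
        in_use[index] = True
        placement[name] = (CONTROLLERS[index // CHANNELS_PER_CONTROLLER],
                           index % CHANNELS_PER_CONTROLLER)
    return placement


# Hypothetical request order, loosely following the summary output.
base = ["uart:rx", "uart:tx", "spi2:tx", "spi2:rx", "spi5:tx", "spi5:rx"]

before = assign_channels(base + ["dcmi"])
after = assign_channels(base + ["i2c1:tx", "i2c1:rx", "dcmi"])

print("without I2C1 DMA:", before["dcmi"], "| SPI5 TX:", before["spi5:tx"])
print("with I2C1 DMA:   ", after["dcmi"], "| SPI5 TX:", after["spi5:tx"])
```

In this model DCMI shares DMA1 with SPI5 until two more clients claim the last two DMA1 channels, which is the effect the I2C1 workaround relies on. It also shows why the workaround is fragile: any change in which peripherals use DMA, or in the order they request channels, can move DCMI back onto DMA1.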