And a newer tester that uses PropSPI v2 and now measures the interval between two block completions, so it also accounts for overhead at the RPi.
Update: Oops, now corrected for newer API.
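For anyone wanting to try the same measurement idea on the RPi side, here is a minimal sketch of timing the interval between consecutive block completions rather than the duration of a single transfer. It is not the tester attached above. `transfer_block` is a hypothetical stand-in for whatever call performs one full block transfer, and the function names are my own.

```python
import time

def measure_block_intervals(transfer_block, n_blocks):
    """Return the seconds elapsed between consecutive block completions.

    transfer_block is a hypothetical callable that performs one complete
    block transfer and returns when it is done.
    """
    completions = []
    for _ in range(n_blocks):
        transfer_block()
        # Timestamp taken after each completion, so the difference between
        # two timestamps includes any overhead between blocks.
        completions.append(time.perf_counter())
    return [b - a for a, b in zip(completions, completions[1:])]

def summarize(intervals):
    """Return (min, max, mean) of a non-empty list of intervals."""
    return min(intervals), max(intervals), sum(intervals) / len(intervals)
```

Because each timestamp is taken at a completion, the gap between blocks is counted along with the transfer itself, which is the point of measuring it this way.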
I tested the SPI driver code you provided. I also checked the timing on a logic analyzer, with captures attached here, and measured the time using the my_shift_in() function (mentioned in comment #20). What I found is that the time varies from buffer to buffer in both scenarios. Is there any way to get a steady time for each buffer? Does it depend on the frequency?
The screenshots below show the same waveform using the SPI slave driver code. Note that the time varies between 4 and 6 milliseconds.
The screenshots below show the same waveform using the my_shift_in() function. Variation: 1 to 3 milliseconds.
It's the RPi, as the master, that controls the CLK. If the master stops clocking, the slave can't do anything about it. You'll need to look more closely at the clock spacings to see why some blocks take longer than others.
Also, your CS looks odd doing so much toggling. EDIT: Actually, it looks like maybe all long block times coincide with the crazy CS behaviour.
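One way to look more closely at the clock spacings is to export the CLK edge timestamps from the logic analyzer and flag any spacing much longer than the typical one. A rough sketch, assuming the export is a sorted list of edge times in seconds. The function name and the `factor` threshold are my own choices, not something from the driver.

```python
def find_clock_gaps(edge_times, factor=3.0):
    """Return (index, spacing) pairs where the clock paused.

    edge_times: sorted CLK edge timestamps (assumed exported from the
    logic analyzer). A spacing counts as a gap when it exceeds
    factor times the median spacing.
    """
    spacings = [b - a for a, b in zip(edge_times, edge_times[1:])]
    if not spacings:
        return []
    # Upper-middle element of the sorted spacings, used as the nominal period.
    nominal = sorted(spacings)[len(spacings) // 2]
    return [(i, s) for i, s in enumerate(spacings) if s > factor * nominal]
```

Comparing the reported gap positions against the CS edges in the same capture should show whether the long blocks really do line up with the CS toggling.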
The good news is that the inter-block gap looks consistent, although it is itself far too long to get the required bandwidth.
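To put a number on how much the inter-block gap costs, the effective rate is the block size divided by the block time plus the gap. A small sketch with made-up values (the block size and times below are illustrative only, not taken from the captures):

```python
def effective_throughput(block_bytes, block_time_s, gap_time_s):
    """Bytes per second once the idle gap between blocks is included."""
    return block_bytes / (block_time_s + gap_time_s)

# Hypothetical example: a 1024-byte block clocked out in 4 ms,
# followed by a 4 ms gap before the next block starts.
rate = effective_throughput(1024, 0.004, 0.004)
```

With those example values the gap halves the rate compared with back-to-back blocks, so shrinking it matters as much as speeding up the clock.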