PNut appears faster than FastSpin in certain cases
Rayman
Posts: 14,646
I'm not sure how this can be, but PNut appears to be slightly faster than FastSpin in certain cases.
Seen this twice now.
One was an FSRW test of reading in a BMP (fsrw_test2.spin2). PNut appeared to be just a hair faster.
And now, I'm loading up hyperram from uSD in a loop like this:
repeat i from 1 to 480
  sd.pread(pBuffer, 640)
  WriteHyperRamRow(pBuffer, 640, i)   '(pData, nBytes, Row) - copy nBytes from pData to row "Row" on hyperRam. nBytes cannot exceed 1024 (?)
  'waitus(50)
  WaitReady()                         'need to either wait for the write to complete, or at least about 50 us, so sd doesn't overwrite pBuffer before the write is done
Was working fine in FastSpin.
But with PNut, I had to insert a delay or use the WaitReady() function, or else the drivers stomp on each other (with FastSpin, neither was needed).
I have no idea how this is possible, though...
Everything I know says FastSpin should always be faster...
Comments
Hmm... My fsrw_test2 code doesn't work with -O2. I think it's fine anyway.
So it's acting more like setting parameters for a write command?
I added a WaitReady() function to make it blocking (a rough sketch follows the list below). Successive writes behave like this:
Write1 - write info is entered into the command buffer and the call returns; the write starts immediately.
Write2 - the previous write is still not done but the command buffer is available: overwrite the command buffer with this write's info and return.
Write3 - the command buffer is still full: wait until it becomes available, then write to the command buffer and return.
Write4 - etc.
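For illustration only, here's a minimal sketch of the blocking idea, assuming the driver cog clears a shared command long once it has finished the queued write (the variable name cmd and the mailbox layout are assumptions, not the actual driver's):

VAR
  long  cmd                    ' hypothetical mailbox long: nonzero while a write command is pending

PUB WaitReady()
  ' spin until the driver cog has consumed the command and finished the write,
  ' so the caller can safely reuse pBuffer for the next sd.pread()
  repeat while cmd <> 0

With something like that in place, the loop above can call WaitReady() right after WriteHyperRamRow() instead of guessing at a fixed waitus(50).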
Would it also be possible to disable any PASM optimization for such ORG - END blocks?
There may be a reason that I wrote the PASM code not in the fastest way, and the compiler/optimizer cannot know what I had in mind.
Andy
Yes, that's already done on GitHub.