STREAMER - I beg you example !
Ramon
Posts: 484
in Propeller 2
Hi!
I beg you all to please help me to figure out how to code this :
I have these two LONGs (*):
data byte $01, $08, $80, $07
          $00, $FF, $F0, $1F
And I want to serially send a 1-bit stream to pin number 32 in a loop:
repeat
  0000 0001 0000 1000 1000 0000 0000 0111 ' Send to pin 32
  0000 0000 1111 1111 1111 0000 0001 1111 ' Send to pin 32
Sorry, I am not able to write this simple test code just from reading the docs without any example.
I was looking at the streamer, but actually I don't know if there is any other way much more efficient (like LUT / LOOKUP, or whatever ...)
Extra1 : What would be the fastest possible bitrate?
Extra2 : What would be the fastest possible bitrate if we need to transmit 1520 bytes?
Comments
You say "sending" and also "into pin 32". Do you want the streamer outputting or inputting? And is this clocked SPI?
Chip's flash_loader.spin2 is a good demo of using the streamer for both burning the EEPROM and then booting from it.
Hi Evanh
Thanks. I want to output to pin 32. It's not SPI, but it will need a fixed frequency (which frequency is not important, the higher the better).
Be warned that you are dealing with an extremely stupid guy, so reading the 276 lines of flash_loader.spin2 (even if it has comments) is as useful to me as reading the current docs (a wonderful technical document, but without a single simple example).
It's not a coherent source of info but there was a long conversation with Mike (msrobots) a couple years back that appears to be similar in idea - https://forums.parallax.com/discussion/170216/ringbuffer-was-streamer-questions-how-to-sync/p1
My own experimenting was very messy but I do still have the sources that I could dig up.
EDIT: I used 32-bit data width there and it later morphed into hyperRAM testing at 8-bit data width. But there's nothing stopping it being 1-bit wide instead.
EDIT2: It was all Pasm2 back then. Do you prefer Spin (with Pasm where needed) examples?
EDIT3: I think what Mike was doing is negotiate a starting time between the two props then blindly bursting. It worked well for me in a single Prop2 between two cogs, via the physical pins, but between two Propellers might be more difficult. It would depend on all propellers seeing the same external clock on the XI pin.
EDIT4: Ha, I told a lie about it being Pasm only. It looks like Mike was using Eric's extended fastspin.
Thank you for your help Evanh!
Wow! Your streamer example is 689 lines of code, and msrobots' example is even more complex, doing simultaneous RX/TX on different cogs and using three file objects.
Please, I beg again, could it be possible to have a **SIMPLE** code example? (Something really simple like the pseudo code in the first post.)
Here is a very simple example in Spin. Be aware that the streamer can output the bits at up to the full sysclock frequency. I don't know if you have a scope fast enough to verify this, so I have set the frequency to 1/32 sysclock (~5.6 MHz @ 180 MHz).
Andy
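To put numbers on the rates mentioned above and on the 1520-byte question from the first post, here is a small host-side Python sketch (not P2 code). It assumes a contiguous bitstream with no gaps or setup overhead; the function names are made up for illustration.

```python
def bit_rate_hz(sysclk_hz, divider):
    """Streamer bit rate when one bit is sent every `divider` sysclocks."""
    return sysclk_hz / divider


def transfer_time_us(n_bytes, rate_hz):
    """Time in microseconds to shift out n_bytes at 1 bit per transfer.

    Assumption: the stream is contiguous, with no gaps between longs.
    """
    return n_bytes * 8 / rate_hz * 1e6


SYSCLK_HZ = 180_000_000  # clock used in the example above

slow = bit_rate_hz(SYSCLK_HZ, 32)  # the 1/32 sysclock setting
fast = bit_rate_hz(SYSCLK_HZ, 1)   # one bit per sysclock

print("1/32 sysclock:", slow, "bits/s")
print("1520 bytes at 1/32 sysclock:", round(transfer_time_us(1520, slow), 1), "us")
print("1520 bytes at full sysclock:", round(transfer_time_us(1520, fast), 1), "us")
```

At 180 MHz this works out to roughly 68 microseconds for 1520 bytes at one bit per sysclock, and roughly 2.2 milliseconds at the 1/32 setting.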
Ariba, thank you so much !!
I will start with that code, and try to make some debugs.
NOTE to myself: This thread seems to match, and it has also interesting details about how the streamer compares with smartpin serial synchronous input / output modes -> https://forums.parallax.com/discussion/171086/faster-spi-bus-transfers/p1
(It's hard to get my head around all these concepts. I think that the docs should have included some simple code examples and graphs!)
_clkfreq = 10_000_000
should help with seeing the data with a scope. You can up the clock rate after you've got the sequencing sorted.
EDIT:
_clkfreq = 4_000_000
is about as low as allowed in Spin auto-config. I have handcrafted 2 MHz for this sort of testing before.
Ariba's example is using what Chip calls "Immediate mode". It spits out whatever data is in the S operand of XINIT/XCONT/XZERO. When you look at the hardware manual for the streamer, you'll see five groups of outputting modes. The two immediate modes use the S operand but the remaining three modes all use hubRAM as the data source, with the fifth mode being specifically for video output.
Hi Evanh,
Yes, thank you. That is exactly what I am doing at the moment (lowering the clock as much as possible to be able to do a simple quick LED test first, at around 1 Hz, using my eyes as the scope). 4 MHz is around the lowest I can get. Unfortunately that is still too fast for a simple LED-eye-scope test.
Also, I am not sure about the relationship between the NCO and the system frequency. According to the instruction set, the allowed values are between #0 and #511. I guess that the higher the value, the longer the delay (as we will need to wait until it reaches zero). Is that correct?
I am using the following test code with flexprop 5.2 and Retroblade2 (that's why I also limit debug to 115_200)
Okay, an NCO (numerically-controlled oscillator) is not a delay timer in the simple sense. It's a little fancier because it can produce a fractional average rate, albeit as a dither.
There are two parameters: One is the update period; for SETXFRQ this is simply the sysclock itself. The second parameter is an amount to accumulate on each update. So, an amount of, say, 200 will, on each sysclock, add 200 to the accumulator inside the NCO.
The output of the NCO is bit31 of its accumulator. Adding 200 each sysclock will take over 10.7 million sysclocks to trigger one streamer cycle. In your case, a single bit is output on pin 36 per streamer cycle.
So, making the NCO generate a streamer trigger on every sysclock needs the maximum value of $8000_0000. Other examples are provided in the hardware doc.
BTW: You're not far off one bit per second with setxfrq #511 and clkfreq of 4 MHz. setxfrq #268 will give you a bit every two seconds.
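The figures above can be checked with a small host-side Python model (not P2 code). It is a simplified sketch that assumes one streamer transfer fires each time the running sum crosses a multiple of $8000_0000, which matches the statement that $8000_0000 gives one transfer per sysclock. The function names are illustrative only.

```python
NCO_WRAP = 1 << 31  # $8000_0000: one transfer per sysclock


def clocks_per_bit(xfrq):
    """Average number of sysclocks between streamer transfers."""
    return NCO_WRAP / xfrq


def seconds_per_bit(xfrq, sysclk_hz):
    """Average time between transfers at a given sysclock."""
    return clocks_per_bit(xfrq) / sysclk_hz


def transfer_clocks(xfrq, n_clocks):
    """Simulate the accumulator; return the clock numbers where a transfer fires.

    Simplified model (an assumption): add xfrq every sysclock and fire
    whenever the sum reaches NCO_WRAP, keeping the remainder.
    """
    acc = 0
    fired = []
    for clock in range(1, n_clocks + 1):
        acc += xfrq
        if acc >= NCO_WRAP:
            acc -= NCO_WRAP
            fired.append(clock)
    return fired


print(round(clocks_per_bit(200)))                 # a little over 10.7 million
print(round(seconds_per_bit(511, 4_000_000), 2))  # about one second per bit
print(round(seconds_per_bit(268, 4_000_000), 2))  # about two seconds per bit

# A value that does not divide evenly shows the dither: the gaps are
# 3, 3 and 2 sysclocks, averaging 8/3.
print(transfer_clocks(0x3000_0000, 8))  # [3, 6, 8]
```

Note the direction: a larger SETXFRQ value gives a faster bit rate, not a longer delay.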
Right, the next thing is large immediates (##number) in pasm2. They do work. You've just chopped out too much of the listing using grep like that.
The assemblers generate a prefixing instruction, called AUGD or AUGS, that sets up hidden register(s) with the extra bits needed to complete a full 32-bit immediate number in the code.
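As a rough illustration of that split, here is a host-side Python sketch (not P2 code). It only models how a 32-bit value divides into the 9 bits that fit in the instruction's own immediate field (hence the #0..#511 range) and the remaining upper 23 bits carried by the prefix. It does not reproduce the actual opcode encoding, and the helper names are made up.

```python
def split_immediate(value):
    """Split a 32-bit immediate into (upper 23 bits, lower 9 bits)."""
    value &= 0xFFFFFFFF
    return value >> 9, value & 0x1FF


def join_immediate(upper23, lower9):
    """Recombine the two parts into the full 32-bit value."""
    return (upper23 << 9) | lower9


def needs_prefix(value):
    """True when the value does not fit in the 9-bit immediate field."""
    return (value & 0xFFFFFFFF) > 0x1FF


for value in (511, 0x0400_0000, 0x8000_0000):
    upper, lower = split_immediate(value)
    assert join_immediate(upper, lower) == value
    print(hex(value), "prefix needed:", needs_prefix(value),
          "upper:", hex(upper), "lower:", hex(lower))
```

So setxfrq #511 assembles to a single instruction, while any ## value is assembled as a prefix plus the instruction.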
You also might prefer seeing the most significant bit first:
(It was not a single grep, but the many tests I did. I removed the commands and merged the output to avoid wasting internet bandwidth.)
The P2 instruction set document doesn't say that this instruction needs AUGD.
But yes, the complete listing shows AUGD preceding SETXFRQ.
I have just done a quick test with the scope.
I tried zillions of combinations and was not able to get a simple LED toggling.
I thought that I hadn't set up my scope properly, but a simple test with pintoggle (at 4 MHz sysclock) showed a 1 MHz frequency.
And I was able to detect any low (or high) frequency with a simple pintoggle, but was not able to emulate the same with the streamer:
I modified the immediate values to make it square wave:
Maybe the key is in the md value. I will look at it later.
Thanks for your detailed comments !
Or use a lower setxfrq and higher clkfreq. Has anyone ever used setxfrq #1?
That's because those prefixes can be applied to almost all instructions. Not ALTx.
Oh, it's the waitms(). One second is far too short if each bit time is greater than a second. It'll just keep resetting the XINIT as it loops.
As an aside, you would have seen something if you had three consecutive streamer commands. The XINIT is started immediately, and the XCONT is buffered immediately. If a second XCONT came after that, it would have blocked, stalling the cog, waiting for the buffer to free up.
Ha, I just noticed that the assembled machine code for Ariba's
setxfrq ##$80000000 / 32
is wrong. Or at least not what was intended. It has been treated as a signed number when the intent was unsigned.
setxfrq ##$8000_0000 >> 5
should work correctly. If not then this:
setxfrq ##$0400_0000
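The difference is easy to see with a host-side Python sketch (not P2 code) that evaluates the same expression both ways in 32 bits. The helper names are made up, and how a particular assembler rounds signed division is not modelled beyond truncation, which does not matter here because the division is exact.

```python
MASK32 = 0xFFFFFFFF


def to_signed32(value):
    """Reinterpret a 32-bit pattern as a signed two's-complement number."""
    value &= MASK32
    return value - (1 << 32) if value & 0x8000_0000 else value


def signed_div32(value, divisor):
    """Divide as a signed 32-bit number, truncating toward zero."""
    signed = to_signed32(value)
    quotient = abs(signed) // divisor
    if signed < 0:
        quotient = -quotient
    return quotient & MASK32


def unsigned_div32(value, divisor):
    """Divide as an unsigned 32-bit number."""
    return (value & MASK32) // divisor


print(hex(signed_div32(0x8000_0000, 32)))    # 0xfc000000 (not intended)
print(hex(unsigned_div32(0x8000_0000, 32)))  # 0x4000000  (intended)
print(hex(0x8000_0000 >> 5))                 # 0x4000000  (same as unsigned)
```

Since $8000_0000 has its top bit set, a signed evaluation sees it as a negative number, and the quotient comes out as $FC00_0000 instead of $0400_0000.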
Thank you all !!
Evanh, yes, you are right. The instruction setxfrq ##$80000000 / 32 is being treated as signed, and the correct one should be setxfrq ##$8000_0000 >> 5.
I also needed to slightly modify Ariba's code to make it work. I moved the xinit instruction outside the loop and kept only xcont inside the loop. I was able to toggle pin 36 (1 Hz) with this code:
For what you're doing now, it actually doesn't need the XINIT line at all.
XINIT just guarantees an immediate restart of the streamer - Including the NCO I suspect. But when it's not doing anything it will happily startup from an XCONT.
I remember that I tried many combinations and it only worked this way.
Well, now the big question: How to make it work to read from an array of bytes (instead of using immediate values)?
I've just loaded up your example and removed the XINIT line. It's working fine.
Three options: You can feed more S operands from cogRAM, or reference lutRAM with a map in S, or setup a RDFAST from hubRAM using the FIFO.
The FIFO option allows larger data blocks and is more autonomous, but the data must be at consecutive addresses within the hubRAM blocks. The hubRAM data can also reference lutRAM.
PS: More than one block requires the cog to manage the FIFO with FBLOCK instructions.
Agree, for the first sentence only.
I have done more tests. There seems to be something wrong with XINIT.
XINIT inside the REPEAT prevents buffering of streamer commands because part of restarting feature is to empty the command buffer. It just endlessly restarts in a tight loop.
LutRAM and RDFAST are not really usable within Spin2, so the easiest way is to read the array values into a local variable and send each one via the streamer's immediate mode:
(This code also shows that you can use unsigned divide with $8000_0000 for calculating the streamer freq.)
For sure Spin2 limits the max. possible streamer frequency to a few MHz (repeat loop freq. * 32) if you need a contiguous bitstream.
Andy
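Before sending array data this way, it can help to check on the host which bit pattern a given long would produce. The Python sketch below (not P2 code) packs the four bytes from the first post into a long in two ways and prints the bits in both shift directions. Which direction the streamer actually uses in a given mode is not settled in this thread, so the sketch shows both rather than assuming one; the function names are illustrative.

```python
def pack_first_byte_high(data):
    """Pack bytes so the first byte lands in the top 8 bits of the long."""
    value = 0
    for b in data:
        value = (value << 8) | (b & 0xFF)
    return value


def pack_little_endian(data):
    """Pack bytes as a little-endian long (first byte in the low 8 bits)."""
    return int.from_bytes(bytes(data), "little")


def bits_msb_first(value, width=32):
    """Bits in the order they leave if bit 31 is sent first."""
    return format(value, "0{}b".format(width))


def bits_lsb_first(value, width=32):
    """Bits in the order they leave if bit 0 is sent first."""
    return bits_msb_first(value, width)[::-1]


data = [0x01, 0x08, 0x80, 0x07]  # first long from the original post

high = pack_first_byte_high(data)
little = pack_little_endian(data)

print(hex(high), bits_msb_first(high))      # matches the pattern asked for
print(hex(little), bits_lsb_first(little))  # bytes in order, bits reversed
```

With the first byte in the top bits and the most significant bit sent first, the output matches the pattern requested in the first post. A little-endian long sent bit 0 first keeps the bytes in memory order but reverses the bits inside each byte.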
I suspect the problem with the single XINIT test would've been the whole program terminated on you. Thereby cancelling the streamer actions. This works:
I have just asked Eric on a separate thread (FlexProp thread).
XINIT seems fine. The problem is how FlexProp handles an empty in-line PASM block (org .. end) inside a repeat loop.
It seems to be a FlexProp bug; PNut v35k does not have this issue.
Yep, I intentionally removed the empty ORG/END to avoid that.
I just tested this mode variation and I'm not seeing any difference in effect. The bit order stays the same. I've also tried at 2-bit width and there's no effect there either. Both bit order and 2-bit word order seem unchanged by the "a" bit being set. Not sure what the story is right now.
I think I used it once before a long time back with hubRAM transfers. But that was 32-bit word width. EDIT: Hmm, or maybe not. It only applies to 4-bit or less word size.