FlexSpin and Interrupts
msrobots
Posts: 3,709
in Propeller 2
I can use interrupts in PASM code with PropTool and FlexSpin.
But I cannot use interrupts in FlexSpin Spin2, BASIC or C.
So, a question for @ersmith (or @Wuerfel_21, who understands his code): why does FlexSpin not support interrupts, and can that be added?
I really prefer FlexSpin for my projects, but I would like to run some Spin (or BASIC) routines in one of the available interrupts. That would help a lot in my current projects.
curious,
Mike
Comments
Filesystem support will need to be worked over for sure. The fast smartpin code I wrote assumes no interrupts and, as is, will break badly if a block transfer is interrupted. It gets a lot of its speed from unbroken SPI clocking.
That said, if I ever sort out using the streamer then that should be able to operate alongside interrupts.
I think interrupts break the FIFO and so don't work with hub exec type code.
But, any pasm driver cogs can use interrupts…
Fortunately it'll never occur, unless the interrupt code itself relies on Hub execution.
Once it's started and in sync with the Hub "pace", Hub exec can "serve" the ALU at Sysclk rate, but the fastest instructions can only "consume" it at Sysclk/2 rate.
In the case of Hub execution, the FIFO interface can be seen as always "in advance" relative to the rate at which the ALU is able to "consume" it.
So, from the standpoint of Hub execution, keeping interrupt routine code within the Cog register and/or LUT memory spaces is a win-win situation.
Hope it helps a bit.
Henrique
An IRQ will be a branch. Any branch into hubRAM automatically invokes the FIFO. Interrupts shouldn't be an issue living anywhere. The limitation on the use of the FIFO is explicitly attempting to use it for two jobs at once. eg: If hubexec is using it then the streamer can't without crashing the program.
You're right, as usual.
It was my fault; related to English not being my mother tongue, but still a fault...
I understood the "break the fifo" part of @Rayman's post in terms of breaking the rhythm (aka breaking the pace), not in terms of any conflicting/crashing situation.
I was thinking of how the beginning of servicing the interrupt routine would force the FIFO to be redirected to another Hub address in order to "grab" the interrupt routine's first instruction; this normally consumes cycles until realignment with the Hub rotation is achieved.
And it'll occur once more at the end (return) of the interrupt code, when the FIFO needs to be redirected again in order to "grab" the next instruction of the code that was interrupted.
Both situations (entering the interrupt service routine and returning from it) cause FIFO realignment with the Hub rotation, hence a timing penalty.
Sorry for the misinterpretation.
Ok, I remembered wrong...
I did find this post from @ersmith :
Flexprop's high level languages absolutely do not support ISRs, the generated code is not interrupt safe (e.g. it uses the CORDIC). If you need to use ISRs, I recommend putting them in a separate cog from the one running the HLL.
Simple (no CORDIC, etc) handwritten ISRs should actually work. You'll need to PUSHA/POPA any registers you clobber. Also, if FCACHE is enabled y
There's a tracking issue for real IRQ support (including high-level handlers), main issue is that the mutable state of a cog is really quite large.
PNut/PropTool compiles to bytecode, which is then interpreted by a short interpreter running in cog #0. In the case of an IRQ, it's the interpreter that is interrupted, not the program, and the interpreter's state can be much better defined than that of an unknown big piece of compiled code.
Making FlexSpin generated code be re-entrant (and hence safe to use both inside and outside of interrupts) would be an enormous amount of work -- the code generator was never designed with that in mind (it was originally written for P1, where interrupts were not ever an issue). Off the top of my head there would be the following problems:
(1) We would have to give up CORDIC instructions, or else protect them from interrupts (delaying the interrupts)
(2) Lots of COG registers would have to be saved/restored: all of the arg* registers, result* registers, and var* registers
(3) Various other internal state would have to be saved/restored (the _muldiva registers, RESULT registers, and so on)
(4) FCACHE would have to be handled correctly
(5) We'd have to give up using the REP instruction in the code generator
All of this would cost quite a bit in performance, and it's by no means trivial to implement.
Bring on the 1 MB 16-Cog Prop2 I say.
Let's get the current P2 documented and develop a full suite of useful drivers before we send Chip off on another 14-year adventure, shall we?
Oh, that's not a new design I'm referring to. It only takes money for OnSemi to re-spin the Prop2 onto 130 nm silicon.
Sure, the digital stuff is all parametric, but what about redesigning all the analog parts for a different process? Is that just a matter of manual layout, or would Chip's schematics have to change as well?
Schematic, no. Although Chip will want to fix the crystal oscillator, sysclock source select, and ADC flaws.
Treehouse will need to fit the primitives to the new process, yes. More money of course. Chip's involvement is minimal though.
And/or a "P2 Plus" with (PS)RAM interface added to the HUB and (a lot of) (PS)RAMs glued on top
But then
This is the highest-priority task for the Propeller family. This chip can do a lot of things, but you have to know how to write the code for it. I have been playing with P2s for more than a year, and I have P1 experience, which helped a lot, so "cog", "hub", SPIN, the overall programming model with 8 parallel processors with limited internal RAM, etc., were not new to me. Still, there are a lot of things I don't know, and the descriptions are hard to find and scattered across the current documentation files. My current question is: how often can I call the CORDIC from one cog? Do I have to wait 55 clocks, get the result and then start a new operation, or can I set up a loop that calls the CORDIC every 8 clocks and receives the result of the call made 7 calls earlier? Experiment will tell, but not everyone wants to experiment, and a set of examples plus a list of things that can and cannot be done would save the time spent on such experiments.
Not for me, thanks. That would need a fully implemented hardware cache (expect hubRAM to halve in size) so as to handle burst reads/writes and reduce the resulting read-modify-write thrashing. Latency will still be high when compared to native DRAM.
Not exactly. We only had to give up pipelined use of the CORDIC, e.g. starting multiple Q.... instructions before GETQX/Y. Using only a single CORDIC operation at a time is not disturbed by interrupts if they don't use the CORDIC themselves.
But yes, you have to be careful what's inside the ISR. A simple, handwritten ASM ISR that interrupts the compiled main program isn't a problem. Using compiled ISRs would be difficult for reasons (2) to (4), I agree. (5) (REP) does no harm as long as you're aware that it delays/stalls the ISR execution.
This works fully automagically. You can start up to 7 CORDIC operations before the first GETQX/Y (as long as you don't use interrupts which can mess up the timing). See example here
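A rough way to see where a figure like 7 could come from, using only the numbers quoted in this thread (about 55 clocks of CORDIC latency, one issue slot per cog every 8 clocks). The constant and function names below are made up for illustration, and this is a back-of-envelope model, not a description of how the hardware actually buffers results:

```python
import math

# Figures quoted in this thread; treat them as assumptions of the model.
CORDIC_LATENCY_CLOCKS = 55  # clocks from issuing a command to its result
ISSUE_INTERVAL_CLOCKS = 8   # a cog can issue one CORDIC command every 8 clocks


def max_ops_in_flight(latency=CORDIC_LATENCY_CLOCKS,
                      interval=ISSUE_INTERVAL_CLOCKS):
    """Number of commands that can be issued before the first result is due.

    Commands go out at t = 0, interval, 2*interval, ... and the first
    result is ready at t = latency, so we count issue times below latency.
    """
    return math.ceil(latency / interval)


print(max_ops_in_flight())  # issue times 0, 8, ..., 48 all fall before clock 55
```

With these two numbers the model gives 7 commands in flight, which at least agrees with the figure mentioned above.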
Using the CORDIC solver non-pipelined, e.g. single operation at a time only, is fine even when using interrupts. I've tested that and it works. See discussion here
I'm pretty sure the state of non-piped CORDIC can be saved/restored using a sequence such as:
I found I can call the CORDIC fast. Time to try this.
I always had in mind that the CORDIC is in most cases too slow (55 clocks, while reading from HUB takes up to 17 clocks and from LUT 3 clocks). Now I have tested pipelined operation: it computes 12 sines in 166 clocks with a partially unrolled loop. This is about 14 clocks per sample, and it should drop further when there is more to compute in one loop. The asymptotic value is 8 clocks per operation. Maybe this means a 6-op FM synth with 16-voice polyphony (something like a DX7, but with perfect sine waves and DACs) can fit in one cog this way. The CORDIC can also calculate all the exponential/logarithmic stuff needed for envelopes.
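The numbers above can be sanity-checked with a small idealised model: the first result arrives after the 55-clock latency, and each further operation adds one 8-clock issue interval. This is only a sketch; the function names are hypothetical, and it ignores loop overhead and any other instructions the cog has to execute:

```python
# Figures quoted in this thread; treat them as assumptions of the model.
CORDIC_LATENCY_CLOCKS = 55
ISSUE_INTERVAL_CLOCKS = 8


def pipelined_clocks(n_ops, latency=CORDIC_LATENCY_CLOCKS,
                     interval=ISSUE_INTERVAL_CLOCKS):
    """Idealised clocks from the first command until the last result is ready."""
    if n_ops <= 0:
        return 0
    return latency + (n_ops - 1) * interval


def clocks_per_op(n_ops, latency=CORDIC_LATENCY_CLOCKS,
                  interval=ISSUE_INTERVAL_CLOCKS):
    """Average cost per operation; approaches `interval` as n_ops grows."""
    return pipelined_clocks(n_ops, latency, interval) / n_ops


for n in (1, 12, 96, 1000):
    print(n, pipelined_clocks(n), round(clocks_per_op(n), 2))
```

For 12 operations the model gives 143 clocks, so the measured 166 would leave 23 clocks that the model does not account for. For 96 operations (6 operators times 16 voices) it gives 815 clocks per output sample, close to the 8-clocks-per-operation asymptote.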
A test loop:
In practice, it's actually hard to fit the required numerical shuffling into just four instructions, especially for long buffers. Using a 16-clock interval (every second pipeline slot) produces better pacing.
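To put rough numbers on that trade-off, here is a sketch that assumes 2 clocks per instruction (the Sysclk/2 rate mentioned earlier in the thread for the fastest instructions) and the same 55-clock latency. The helper names are hypothetical, and real loops will contain instructions that take longer:

```python
# Assumptions taken from figures mentioned in this thread.
CLOCKS_PER_INSTRUCTION = 2   # fastest instructions only
CORDIC_LATENCY_CLOCKS = 55


def instructions_per_slot(interval_clocks,
                          clocks_per_instruction=CLOCKS_PER_INSTRUCTION):
    """How many 2-clock instructions fit between two CORDIC issue points."""
    return interval_clocks // clocks_per_instruction


def pipelined_clocks(n_ops, interval_clocks, latency=CORDIC_LATENCY_CLOCKS):
    """Idealised clocks until the last of n_ops results is ready."""
    if n_ops <= 0:
        return 0
    return latency + (n_ops - 1) * interval_clocks


for interval in (8, 16):
    print(interval, instructions_per_slot(interval), pipelined_clocks(12, interval))
```

Under these assumptions an 8-clock interval leaves room for 4 instructions per operation and a 16-clock interval for 8, at the cost of 231 instead of 143 idealised clocks for 12 operations.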
Sorry, I should have been more clear. The compiler would have to give up generating CORDIC instructions, at least if we wanted to use high level languages for ISRs (which the OP was asking about). Although perhaps something like Ada's qrotate hack would allow us to save and restore the CORDIC state; we'd have to do some testing.
What happens if you want to use the Q register in your ISR, and you need to save the current SETQ state? E.g., if it is currently being used for MUXQ operations when the interrupt hits.
Is there a good way to read the current Q so you can restore it afterwards with another SETQ?
Maybe QROTATE #0,#0 then a GETQY D and later SETQ D? But this adds a lot of ISR entry overhead and may have to compete with other CORDIC operations in progress at the time of the ISR.
EDIT: actually it doesn't seem like that would work, as it uses 0 for Y if the SETQ prefix is not used prior to the QROTATE operation, not some existing Q from an earlier SETQ.
you can save Q by doing
but the compiler never generates code that relies on a stale Q register, so whatever (in fact, does it ever use MUXQ? Certainly not for the compiled code)
(Nevermind; Ada beat me to it.)
Very nice.