Thanks Chip, looks good. The period and pin select are all going to be in the Smartpins, aren't they? They can't be two-clock instructions, can they?
No, they are each a separate animal that is configured over its DIR signal. It's impossible for a cog, through instruction execution, to create a back-to-back transition on a DIR signal, but the MSGOUT instruction does this, and the smart pin receives it and decodes it while holding its pass-through DIR steady. Note that the smart pin hasn't been coded yet. This is coming after the cog gets scoured clean.
If it got to a point where the interrupts were confusing, yeah. Maybe. But these are distinct purposes, each with a hardware vector, and frankly, will be needed to make a COG do what a COG can do. Edit: This design dispenses with "in COG" special functions. WAITVID Great! WAITVID was cool for video, but everyone really wanted it to go bi-directional.
Without these interrupts, or a tasker, these COGS actually have some limits P1 COGS do not.
What sense does that make?
For me, I was thinking about the monitor, and on chip type development I did on "hot", and replicating that would take a lot of COGS. That made the 16 look more like 6 or maybe even 4 COGS. Nice, but a far cry from where it was.
The other thing is heat. All the polling and background activity in "hot" was quite expensive! When the decision was made to cut some features, like that video system and the in COG math, it just didn't seem to make any sense to be left at such a reduced performance level.
So my one question is, "are we headed toward hot 2.0?" Seriously. Tell me we aren't, and this is all good.
This chip will be warm to the touch when it's really busy, but not "hot". The package we are using is good for dissipating 2W with help from its bottom plate. I'm thinking 2W is an all-out figure.
I was initially uncomfortable about adding interrupts to P2.
I should know better and trust the "Verilog Wizard" to come up with an elegant and simple-to-use solution!
<hangs head in shame>
Once again, nice work Chip!
P.S. Yee Hah, multitasking is back....
I was thinking about one, not specifically meant to the timer, but
Due to code space constraints, interrupt service routines should be as generic as they can be. Then, if one wants to change only the ILOCK functionality of an already programmed interrupt without knowing all the details, especially while the interrupt service routine is running (e.g., to let a lower-priority interrupt occur inside the service routine of a higher-priority one), the ability to change only the ILOCK status could be an advantage.
Forgot about those Evan. Yes. Without that going on, making sure the COGS can perform is likely safe. Still, when it's all scrubbed... maybe some analysis can be done.
Chip, make sure the implicit IFREE only occurs if the interrupt for the associated address is enabled. I can imagine someone using only a timer interrupt and accidentally triggering an IFREE in the interrupt routine because they JMPed to a streamer- or pin-related address.
Actually, do you really need three pairs of addresses? Couldn't you just do:
LINK $1F2, $1F3 (for streamer)
LINK $1F2, $1F4 (for pins)
LINK $1F2, $1F5 (for timer)
Then you only need to look at $1F2.
Excellent idea about qualifying the automatic IFREE with that interrupt's enable.
We can't use just $1F2 because there might be a 1ms timer interrupt that has a huge service routine that will need to be interrupted by high-priority pin and streamer interrupts. If ILOCK was automatically asserted on every interrupt, we could get away with only $1F2, but it would preclude a lot of interrupt use cases.
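To make the nesting argument concrete, here is a rough Python model (not P2 code, just a sketch). It assumes only that each interrupt entry saves the interrupted address into a link register and that the routine later returns through that same register. The function name, register names, and addresses are made up for illustration.

```python
def run_nested(shared_register):
    """Model a timer routine that gets interrupted by a pin interrupt.

    Hypothetical model: interrupt entry stores the interrupted address in a
    link register; interrupt exit jumps back through that register.
    Returns (pin_return_address, timer_return_address).
    """
    MAINLINE_PC = 100    # where mainline code was when the timer fired
    TIMER_ISR_PC = 500   # where the timer routine was when the pin fired

    timer_reg = "link" if shared_register else "timer_link"
    pin_reg = "link" if shared_register else "pin_link"

    regs = {}
    regs[timer_reg] = MAINLINE_PC   # timer interrupt taken in mainline
    regs[pin_reg] = TIMER_ISR_PC    # pin interrupt taken inside timer routine

    pin_return = regs[pin_reg]      # pin routine returns first
    timer_return = regs[timer_reg]  # then the timer routine returns
    return pin_return, timer_return


print(run_nested(shared_register=True))   # timer return address is clobbered
print(run_nested(shared_register=False))  # both return addresses survive
```

With one shared register the timer routine "returns" into itself instead of mainline, which is the failure that separate registers per interrupt source avoid.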
The other thing is heat. All the polling and background activity in "hot" was quite expensive! When the decision was made to cut some features, like that video system and the in COG math, it just didn't seem to make any sense to be left at such a reduced performance level.
It was the fast pipeline, the big fast instructions (some instructions were really big, I think), and the big fast buses muxing so many paths that were the big heat generators.
That's right. There was no signal gating, anywhere. In this new design, only the parts of the ALU that are being used see any input signals change. Switching is probably 10% here, compared to what it was in P2-Hot.
Oh yeah. I had already forgotten that one interrupt could interrupt another. I'm not thrilled about losing two registers per interrupt type, but I guess this is no worse than the 6 registers we gave up for CTRA/B on P1.
Remember HUBEXEC. COG space is still a premium, but spillover is now a simple, fast option.
And if you don't use those interrupts, there's no register loss. I put the timer one up highest because that's going to be the most common one, with the less-common ones building downwards.
Excellent idea about qualifying the automatic IFREE with that interrupt's enable. Is this 'Automatic' a user option? I can think of use cases where cog non-int code wants to steal a little time from an INT to finish something important, but does not want to kill the INT totally. A delayed IFREE would be a means to do that. An automatic IFREE that cannot be disabled could bite.
Jmg is thinking about a sort of handover to a task in a multitasking OS, where the targeted task is guaranteed to jump to the highest priority and finish up the processing while still holding the IRQ mask.
- Interrupt routines can be protected from interruption by the L mode bits, cancelling the need for hold-off counting. When they exit, IFREE always occurs.
MCUs with one level of INT priority have an implicit hold-over, that runs in parallel with any SW Lock/free.
Once an INT fires, the INT HW State machine waits until a RETI, before other INTs are actioned.
SW Enable/Disable are OR'd with that Logic, to give INT action.
You may have meant that action, rather than a physical change of a user IFREE flag?
How many clock cycles must a changing pin value last before being detected as "any edge"?
Is this value latched anywhere?
IIRC the Smart Pins were to include Debounce, where users can define exactly that.
The fastest such polling can go is usually 2 SysCLKs, one to sample the Pin, and another to compare previous sample. Often one more Sync state is added, to give some noise filtering, meaning a pin has to be low for 2 sysclks.
The problem is: if the any-edge option is selected and the signal has a short active region, then without latching the level that caused the interrupt, one can't be sure of the real value that caused it.
I.e., reading the pin value inside the interrupt routine can be meaningless.
Excellent idea about qualifying the automatic IFREE with that interrupt's enable. Is this 'Automatic' a user option? I can think of use cases where cog non-int code wants to steal a little time from an INT to finish something important, but does not want to kill the INT totally. A delayed IFREE would be a means to do that. An automatic IFREE that cannot be disabled could bite.
Another question:
How many clock cycles must a changing pin value last before being detected as "any edge"?
Is this value latched anywhere?
Yanomani
If the pin went from high to low, or vice-versa, that's "any edge". I was always registering two cycles of a pin, so that I'd have something to look at: a one-clock-ago state and a two-clocks-ago state. I just compare those two for 0-to-1, 1-to-0, or difference ("any edge").
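The two-register comparison described above can be sketched in Python as follows. This is only a software model under stated assumptions: the function name, the mode strings, and the choice to start both registers at the first sampled level are all made up for illustration, and the real logic lives in hardware.

```python
def detect_edges(samples, mode="any"):
    """Return the clock indices at which an edge is flagged.

    Keeps a one-clock-ago and a two-clocks-ago copy of the pin and compares
    them each clock, as in the description above. `mode` is "any", "pos",
    or "neg" (hypothetical names for this sketch).
    """
    one_ago = two_ago = samples[0]  # assumption: registers start at first level
    hits = []
    for clk, level in enumerate(samples):
        rose = two_ago == 0 and one_ago == 1   # 0-to-1
        fell = two_ago == 1 and one_ago == 0   # 1-to-0
        if (mode == "pos" and rose) or (mode == "neg" and fell) \
                or (mode == "any" and (rose or fell)):
            hits.append(clk)
        two_ago, one_ago = one_ago, level      # shift the new sample in
    return hits


print(detect_edges([0, 0, 1, 1, 0, 0], "any"))  # [3, 5]
print(detect_edges([0, 1, 0, 0], "any"))        # [2, 3]
```

In this model a one-clock pulse flags two edges on consecutive clocks, and by the time the first one is flagged the pin has already returned to its old level, which is the situation the latching question is about.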
Jmg is thinking about a sort of handover to a task in a multitasking OS, where the targeted task is guaranteed to jump to the highest priority and finish up the processing while still holding the IRQ mask.
Some driver writers like to own the IRQ.
Well, that would imply that after that special OS routine completes, then the effective return-from-interrupt would execute, in order to get back to the original interrupted code. Wouldn't the ILOCK'd interrupt handler call the OS routine, somehow, and then do the JMP $1Fx instruction, causing an automatic IFREE?
- Interrupt routines can be protected from interruption by the L mode bits, cancelling the need for hold-off counting. When they exit, IFREE always occurs.
MCUs with one level of INT priority have an implicit hold-over, that runs in parallel with any SW Lock/free.
Once an INT fires, the INT HW State machine waits until a RETI, before other INTs are actioned.
SW Enable/Disable are OR'd with that Logic, to give INT action.
You may have meant that action, rather than a physical change of a user IFREE flag?
With automatic initial ILOCK and terminal IFREE, wouldn't this be about the same?
We've all been ruminating on the same things, I think.
No, they are each a separate animal ...
Bugger, I was thinking all along you were going to use the available mass of counters in Smartpins for generating IRQ interval.
2W peak seems entirely appropriate. Cool beans. Or gates, really.
Right now, these are atomic and discrete. We should keep it that way. We've got 16 COGS. Surely there is a way to ensure that priority task happens.
A qualifier count, to filter out unwanted edge transitions.
Then my almost intuitive sketch describes this situation.
Yanomani
How about this:
INTMODE D/# - set interrupt mode
INTPER D/# - set period for timer interrupt
INTPIN D/# - set pin for edge interrupt
IFREE - allow interrupt
ILOCK - don't allow interrupt until IFREE
Interrupt mode settings: %LSS_LPP_LT
%LSS = streamer interrupt (issues 'LINK $1F0,$1F1 WC,WZ')
    L: 1=ILOCK on interrupt
    SS: 0x=disable, 10=rollover, 11=block wrap
%LPP = pin edge interrupt (issues 'LINK $1F2,$1F3 WC,WZ')
    L: 1=ILOCK on interrupt
    PP: 00=disable, 01=any edge, 10=pos edge, 11=neg edge
%LT = timer interrupt (issues 'LINK $1F4,$1F5 WC,WZ')
    L: 1=ILOCK on interrupt
    T: 0=disable, 1=enable
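For anyone who wants to play with the field layout, here is a small Python sketch that unpacks a mode value. It assumes the eight bits are packed exactly in the order written in %LSS_LPP_LT, with the streamer L bit as the most significant bit; the post does not spell out the bit positions, so treat that as a guess. The function name and dictionary keys are made up for illustration.

```python
def decode_intmode(mode):
    """Split an 8-bit %LSS_LPP_LT value into its three interrupt fields.

    Assumption: bit 7 = streamer L, bits 6..5 = SS, bit 4 = pin L,
    bits 3..2 = PP, bit 1 = timer L, bit 0 = T.
    """
    streamer_events = {0: "disable", 1: "disable", 2: "rollover", 3: "block wrap"}
    pin_events = {0: "disable", 1: "any edge", 2: "pos edge", 3: "neg edge"}
    return {
        "streamer": {"ilock": bool((mode >> 7) & 1),
                     "event": streamer_events[(mode >> 5) & 3]},
        "pin": {"ilock": bool((mode >> 4) & 1),
                "event": pin_events[(mode >> 2) & 3]},
        "timer": {"ilock": bool((mode >> 1) & 1),
                  "enabled": bool(mode & 1)},
    }


# Streamer off, pin interrupt on positive edge with ILOCK, timer on without ILOCK.
print(decode_intmode(0b000_110_01))
```

That example value would describe a slow, interruptible timer routine alongside a pin routine that locks out other interrupts while it runs.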
Whenever a JMP/LINK to $1F0/$1F2/$1F4 occurs (return from interrupt), IFREE happens automatically.
This system takes about the same amount of logic, but has these advantages:
- multiple concurrent interrupts are allowed on a first-come basis, but in the event of simultaneous occurrence, they are prioritized as: streamer, pin-edge, timer.
- Timer period and pin number can be changed on the fly.
- Interrupt routines can be protected from interruption by the L mode bits, cancelling the need for hold-off counting. When they exit, IFREE always occurs.
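The arbitration in the list above could be modeled roughly like this in Python. It is a sketch only: it assumes a single lock flag that blocks every interrupt while ILOCK is in effect, and it uses the stated order (streamer, pin-edge, timer) to break ties. The names are made up for illustration.

```python
PRIORITY = ("streamer", "pin", "timer")  # highest first, per the list above


def next_interrupt(pending, locked):
    """Pick which pending interrupt to service next, or None.

    pending: set of source names currently requesting service
    locked:  True while ILOCK is in effect (assumed to block all sources)
    """
    if locked:
        return None
    for source in PRIORITY:
        if source in pending:
            return source
    return None


print(next_interrupt({"timer", "pin"}, locked=False))  # pin
print(next_interrupt({"pin"}, locked=True))            # None
```

Once the protected routine returns and the automatic IFREE happens, calling the function again with `locked=False` would pick up whatever is still pending.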
This would take an hour to change. I like the idea of being able to have slow housekeeping interrupts with fast event-driven interrupts that are protected from interruption. This could allow a whole console computer with video, keyboard, mouse, and RTC to operate from one cog, with mainline code not needing to worry about any keyboard/mouse/RTC details - they would all be taken care of in the background.
WOW Chip,
This is fantastic. I can have a timer interrupt in case a pin level does not change, and they have separate vectors. If we don't use interrupts, those vectors are free for cog use.
We should be able to run some nice UARTs at high speed, while still having time to perform other housekeeping etc. while we are assembling the characters.
This is certainly worth 2.5% LEs per cog.
My expectation is that we can use the smart pins to output some bits at special timing, so we can effectively have a simple serial stream being sent fast. This will make for simple UART transmission without cog control for shifting out the bits (cog software can append start and stop bits if required).