We cannot expect the P2 to do everything. Some things just need to be done in software. If there are glitches then either the design is wrong or the software must work around it. This is what we did on the P1, and it is what is done on every other micro.
Actually, nope. Other MCUs often have such sampling conditioning, and it is not a 'design is wrong' issue at all, but a natural consequence of cross-domain or dual-pin sampling.
In the USB example, because clocks are not locked, there is a narrow, but finite, window where a single clock sample, unconditioned, may give a false positive.
Likewise, an ESD spike can false-trigger a single-sample pin.
The actual sampling aperture of a single D-FF will be < 1ns.
This is precisely why you have to "qualify" your sample. If you have a great hardware lump then it will do the processing required. But if you are building the hardware as a software representation (i.e. typical bit-banging) then you need to qualify all signals.
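For anyone wondering what "qualifying" a sample could look like in software, here is a minimal sketch in Python. It is only an illustration, not P2 code: the function name `qualify` and the choice of n consecutive matching samples are assumptions of mine, and a real bit-banged driver would do the equivalent in PASM.

```python
def qualify(samples, n=3, initial=0):
    """Return the qualified level after each raw sample.

    A new level is accepted only once it has been seen on n consecutive
    samples; shorter runs (glitches, ESD spikes) are ignored.
    The name and the n-in-a-row rule are illustrative assumptions.
    """
    level = initial        # last accepted (qualified) level
    run_value = initial    # value of the current run of identical samples
    run_length = 0         # how many samples in a row had run_value
    out = []
    for s in samples:
        if s == run_value:
            run_length += 1
        else:
            run_value = s
            run_length = 1
        if run_value != level and run_length >= n:
            level = run_value
        out.append(level)
    return out


if __name__ == "__main__":
    raw = [0, 0, 1, 0, 0, 1, 1, 1, 0, 1, 1]
    print(qualify(raw, n=3))
```

The single-sample 1 at index 2 never reaches the output; the level only changes after three 1s in a row. The cost is latency: the qualified signal lags the raw one by n samples.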
The P1 and the P2 are, by design, software-configurable devices - that is their beauty. If Chip adds hardware blocks for all the peripherals desired, then the P1/P2 will no longer be a P1/P2, and then the P2 will compete with all the other micros out there.
If Chip adds hardware blocks for all the peripherals desired, then the P1/P2 will no longer be a P1/P2, and then the P2 will compete with all the other micros out there.
I'm not following the logic here.
The P2 already competes with other Micros.
Sampling flip-flops are a very long way from "hardware blocks for all the peripherals desired", so that point seems contrived.
Cluso is right. The Prop does not compete with anything. For sure there are weigh-ups when deciding to use a Prop but those are not commercial factors. And I don't foresee the Prop2 ever breaking into commercially minded thinking either.
The point Cluso makes about bit-bashing vs peripheral blocks is just how far away they are from each other. There's nothing contrived at all.
Is it possible to have an instruction that starts timers in 2-3 different cogs in sync?
TIMSYNC COGs ---- a 32-bit value specifying timers (16 cogs x 2 timers)
0 = don't change anything, 1 = restart that timer in sync
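To make the proposed operand concrete, here is a rough sketch of how such a 32-bit mask might be built. TIMSYNC is only a suggestion in this thread, and the post does not say which bit belongs to which timer, so the layout used here (bit = cog * 2 + timer) and the name `timsync_mask` are purely assumptions.

```python
NUM_COGS = 16
TIMERS_PER_COG = 2


def timsync_mask(timers):
    """Build a 32-bit mask with one bit per (cog, timer) pair.

    Assumed layout (not specified in the proposal): bit = cog * 2 + timer.
    A 1 bit means "restart this timer in sync", 0 means "leave it alone".
    """
    mask = 0
    for cog, timer in timers:
        if not 0 <= cog < NUM_COGS:
            raise ValueError(f"cog {cog} out of range")
        if not 0 <= timer < TIMERS_PER_COG:
            raise ValueError(f"timer {timer} out of range")
        mask |= 1 << (cog * TIMERS_PER_COG + timer)
    return mask


if __name__ == "__main__":
    # Restart timer 0 of cog 0, timer 1 of cog 1 and timer 1 of cog 15.
    print(f"{timsync_mask([(0, 0), (1, 1), (15, 1)]):#010x}")
```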
One approach with the existing design might be to use a LOCK change event as your global sync signal. You would still need to account for slight timing discrepancies, depending on whether you are polling the LOCK change event or using an ISR.
On the other hand, if the primary idea is to slave the timing of multiple cogs to a "master" timer, you could use the LOCK as a synchronization barrier and a single timer on one of the cogs to set/reset the barrier LOCK. All of the waiting cogs would then unblock, do their thing, then wait on the LOCK again.
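The barrier idea can be modelled on a PC with threads, which may help in seeing the control flow. This is only an analogy: Python's `threading.Barrier` stands in for the LOCK, the main thread plays the cog with the "master" timer, and `run_barrier_demo` is a made-up name. Real cogs would not get this behaviour for free from a LOCK.

```python
import threading


def run_barrier_demo(n_workers=3, rounds=3):
    """Model worker cogs that wait on a barrier released by a master.

    Returns a dict mapping worker id to the list of rounds it ran.
    """
    # One extra party for the master, which stands in for the timer cog.
    barrier = threading.Barrier(n_workers + 1)
    log = {w: [] for w in range(n_workers)}

    def worker(wid):
        for r in range(rounds):
            barrier.wait(timeout=5)   # block until the master "ticks"
            log[wid].append(r)        # do this round's work

    threads = [threading.Thread(target=worker, args=(w,))
               for w in range(n_workers)]
    for t in threads:
        t.start()
    for _ in range(rounds):
        barrier.wait(timeout=5)       # master tick: release all waiters
    for t in threads:
        t.join()
    return log


if __name__ == "__main__":
    print(run_barrier_demo())
```

Because every party has to arrive before anyone is released, a worker cannot start round k+1 until the master has ticked again, which is the "unblock, do their thing, then wait again" pattern described above.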
The single-stepping seems to work fine, but we need a mechanism for breaking asynchronously, so that we can stop the cog and poll the execution point. Any ideas on how that should work? I think it needs to be another cog or a pin event.
The Prop does not compete with anything. For sure there are weigh-ups when deciding to use a Prop but those are not commercial factors. And I don't foresee the Prop2 ever breaking into commercially minded thinking either.
Wow, oh dear, - I hope Parallax have not wasted their time, given you seem to believe there is no commercial footprint for P2 designs !!
        June                  July                 August
Su Mo Tu We Th Fr Sa  Su Mo Tu We Th Fr Sa  Su Mo Tu We Th Fr Sa
   -- -- -- -- -- --            1  2  3  4                     1
-- -- -- -- -- -- --   5  6  7  8  9 10 11   2  3  4  5  6  7  8
-- -- -- -- -- -- --  12 13 14 15 16 17 18   9 10 -- -- -- -- --
21 22 23 24 25 26 27  19 20 21 22 23 24 25  -- -- -- -- -- -- --
28 29 30              26 27 28 29 30 31     -- -- -- -- -- -- --
                                            -- --
I thought interrupts were done, but people continue to suggest changes.
I hope the P2 instruction set is completed.
Is P2 Day almost here? I'm starting to wonder/wander again.
Could we get a project update from Parallax please?
I didn't say the Prop has no commercial footprint. Only that it has no commercial competitor. Just like the Stamp.
Chip and Ken have stated the Prop was not and is not a commercial endeavour - not in those words but certainly in the same vein. Ken has stated once or twice that the market Parallax fills is educational. It's a whole package deal; the processor used may be important for what's being taught, but it's only a small part of the whole package.
Don't get me wrong, I'm here because I like the architectural design of the Propeller. As are many others. And I'll be sticking around for that same reason. And I very much like the Prop2 as well.
The Xmos parts have similarities to the Prop, and I think recent developments have moved Prop2 towards the Xmoses. However, for the average school, I don't think the technical comparisons make many inroads.
Is there any provider of educational material using an Xmos part for there to be competition?
The business model of Parallax is to provide means to educate people. The business model of XMOS is to sell semiconductors. That makes a huge difference. And as I lost a lot of money and lifetime following a company with a business model to sell semiconductors I prefer to work with a company that educates people to enable them to also use semiconductors that will suddenly see EOL.
Parallax's business model is the same as that of XMOS, Microchip, Atmel or any other for-profit company: to make money.
Their focus may be geared towards the edu market, however they are not the same as something like the RPi Foundation, which is an actual charity.
They did make an attempt with Parallax Semi to move into the larger business/industrial sector at one time, which also goes against your point.
That failed, and they have decided to stick with the edu, hobby/low-volume specialty niche.
I have no doubt that Ken would be thrilled to have one of the big automakers call them up for a demo of the P2.
If that ever does happen, I expect you'd see the change of direction and pace that such an influx of $ would bring.
Yes, of course Parallax is a for-profit business, just targeted at packaged education deals is all. Even the hobby aspect is not a big earner - I believe Ken has said exactly this.
The label "commercial" in this context is as a chip design for selling as just another micro-controller. JMG was suggesting that the Prop is competing with other micro-controllers commercially. I tried to lay out how the Prop came to be, because it most certainly was not from trying to be a commercially competitive design. The architecture is just too far off from the norm for that ... someone casually checking out the price will probably not look again because they're comparing with feature-rich PIC32s and the like. Eight cores doesn't mean a lot when you only want one to manage all the spec'd I/O. And 32kB of main memory is a serious cap. The Hub is cool how it works but it is also a limitation.
There may well be some volume-ish commercial customers, I wouldn't know ... maybe that's what triggered the attempt at "Parallax Semi". Doesn't change what I've said as being on the mark though.
I've got the whole chip (minus the smart pins) compiling on the Cyclone V -A9 device now. It's using 60% of the FPGA logic.
A single-cog Prop2 compiles in 4 minutes and has an Fmax of ~120MHz on the Cyclone IV. It's about 105MHz on the Cyclone V, which always seems to be slower than the Cyclone IV.
Here's the crazy thing, though: When I do a full-chip compile with 16 cogs on the Cyclone V -A9 device, the critical paths become flop-to-flop interconnect delays, with no logic in-between. These paths connect the hub RAMs' inputs and the CORDIC's results. I think on the ASIC, this wiring delay won't be such a problem. These paths lower the full-chip Fmax to ~82MHz on the -A9. We'll probably just run it at 80MHz on the FPGA, then. I'm sure we could go to 100MHz, too, without any problems, given likely workbench temperatures.
I have paved the way for adding the block-r/w instructions and will add them next. The big impediment to implementation is out of the way, so it should be easy.
Here is a big question for you guys:
We are planning on this:
16 cogs w/ 512KB hub RAM
If things turn out overly-big silicon-wise, and we need to reduce the chip size, which of the following would be better?
16 cogs w/256KB hub RAM, lots of cogs and less RAM
-or-
8 cogs w/512KB hub RAM, fewer cogs and faster memory access
Hopefully, we can get it all in there. With 16 of these new cogs, we are using about 60% of the logic that only 8 of the P2-Hot cogs would have required.
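For comparing the options, the arithmetic behind those figures can be spelled out. This is just back-of-envelope Python using the numbers quoted above; "hub RAM per cog" is only a ratio (the hub is shared), and the function names are made up for illustration.

```python
# Figures quoted in the post: (number of cogs, hub RAM in KB).
OPTIONS = {
    "16 cogs, 512KB": (16, 512),
    "16 cogs, 256KB": (16, 256),
    "8 cogs, 512KB": (8, 512),
}


def hub_kb_per_cog(cogs, hub_kb):
    """Hub RAM divided evenly by cog count (a ratio, not a partition)."""
    return hub_kb / cogs


def relative_cog_size(new_cogs=16, old_cogs=8, logic_fraction=0.60):
    """Logic of one new cog relative to one P2-Hot cog.

    If new_cogs new cogs use logic_fraction of the logic that old_cogs
    P2-Hot cogs would have needed, one new cog is
    logic_fraction * old_cogs / new_cogs of one P2-Hot cog.
    """
    return logic_fraction * old_cogs / new_cogs


if __name__ == "__main__":
    for name, (cogs, kb) in OPTIONS.items():
        print(f"{name}: {hub_kb_per_cog(cogs, kb):.0f}KB of hub RAM per cog")
    print(f"one new cog is roughly {relative_cog_size():.0%} of a P2-Hot cog")
```

So, taking the 60% figure at face value, each new cog comes out at roughly 30% of the logic of a P2-Hot cog.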
This is a pretty solid argument. And from what I understand of the new interrupt scheme and hubex (which would see better performance in jumps, btw), 8 cogs are pretty effective. Then when you also consider hub<->pin throughput, fast P2-to-P2 comms would be a viable solution for those who need more than 8 cogs.
We've made the COGS really useful again. Seems like this dilemma comes up at some level of COG utility.
I also vote, in advance of the question, to be conservative with the Smart Pins. There will be some things we need them to do, so that has to get done. But if it comes down to the pins trumping COGS and RAM? Make them a little less smart.
In other words, once the COGS and RAM are a lock, let's make sure they stay locked and live with what's left for Smart Pins.
As long as it's not gonna end up hot, and we don't end up doing this again, I'm good either way any of this goes, as long as it means we get a chip this go around.
Comments
One design question on Timers in COG.
Is it possible to have an instruction that starts timers in 2-3 different cogs in sync?
TIMSYNC COGs ---- a 32-bit value specifying timers (16 cogs x 2 timers)
0 = don't change anything, 1 = restart that timer in sync
Given that, his question could be read as, "Is there a way to start timers in different COGS in sync?"
INTERRUPT #%xxcccciii
where
xx = don't care (for P2)
cccc = cog number
iii = interrupt vector
Causes an interrupt to vector %iii of cog %cccc
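To show how the fields of that proposed operand pack together, here is a small sketch in Python. INTERRUPT is only a suggestion in this thread, not an existing P2 instruction; the helper names are made up, and the only thing taken from the post is the %xxcccciii layout (2 don't-care bits, 4-bit cog number, 3-bit vector).

```python
def encode_interrupt(cog, vector):
    """Pack cog (0-15) and vector (0-7) as %xxcccciii, with xx left as 0."""
    if not 0 <= cog <= 15:
        raise ValueError("cog must be 0..15")
    if not 0 <= vector <= 7:
        raise ValueError("vector must be 0..7")
    return (cog << 3) | vector


def decode_interrupt(value):
    """Return (cog, vector) from a 9-bit operand, ignoring the xx bits."""
    cog = (value >> 3) & 0xF
    vector = value & 0x7
    return cog, vector


if __name__ == "__main__":
    operand = encode_interrupt(cog=5, vector=3)
    print(f"%{operand:09b}", decode_interrupt(operand))
```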
Seems there isn't too much.
Docs, ROM, fuses, and the boot process.
We know we get one before the Smart Pin work begins.
Maybe we don't have too long now.
http://www.xmos.com/
Is there any provider of educational material using an Xmos part for there to be competition?
https://www.xmos.com/contact/education?secure=1
http://www.xcore.com/forum/index.php
They also provide scholarships and support young (12-14) students:
http://www.xmos.com/news/press/17555
That's plainly incorrect.
Parallax's business model is the same as that of XMOS, Microchip, Atmel or any other for-profit company: to make money.
Their focus may be geared towards the edu market, however they are not the same as something like the RPi Foundation, which is an actual charity.
They did make an attempt with Parallax Semi to move into the larger business/industrial sector at one time, which also goes against your point.
That failed, and they have decided to stick with the edu, hobby/low-volume specialty niche.
I have no doubt that Ken would be thrilled to have one of the big automakers call them up for a demo of the P2.
If that ever does happen, I expect you'd see the change of direction and pace that such an influx of $ would bring.
John Abshier
8 cogs should also simplify the cog<>pin muxing
Currently 512KB would be a minimum. There are now lots of chips with this RAM plus additional Flash. Two years ago 128KB was possibly acceptable.
However, with a more powerful P2, I tend to think more than 8 cogs would be extremely useful.
I presume it's not possible to go down to the next lower geometry given the time the P2 has taken???
I'm itching to see how well HubExec works with the 16-way egg beater latencies. But I'd not want to sacrifice the RAM just to see that.
Internal routing looks easier with 12 than with 16 anyway.