Interesting. The versions with fewer COGs would actually have lower latency in accessing the hub. That will mean that code written for the 16 COG version might not work correctly on a version with fewer COGs. Is that correct?
Above, Chip says this "These variants all use the 16-slot egg-beater, so there's really no need to constrain cogs to a power of two. We could have three cogs, if we wanted. Keeping the eggbeater as is keeps things simple and objects' timing consistent."
However, with fewer COGs you could have a choice of 16 slots or fewer, by simply aliasing slots.
i.e. an option (default) of 16 keeps timing simpler, but with fewer COGs you can easily allocate 2 or 4 of those 16 slots to each cog.
If yes (lower latency) then code may act differently, but could run a lot faster.
If no (same latency - wasted hub cycles for missing cogs) then code acts the same, but lacks significant hub speed improvements.
No, the egg-beater will still be 16 clock cycles, regardless of the number of cogs.
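To put rough numbers on the latency question, here is a minimal Python sketch of the rotation. It assumes a simplified model in which cog `c` is connected to hub slice `(c + t) mod slots` on clock `t`; the function names and the model itself are illustrative assumptions, not taken from the actual Verilog.

```python
def hub_wait_cycles(cog: int, slice_index: int, clock: int, slots: int = 16) -> int:
    """Clocks a cog waits until the rotation connects it to slice_index.

    Assumed model: on clock t, cog c sees slice (c + t) % slots.
    """
    return (slice_index - cog - clock) % slots


def worst_case_wait(slots: int) -> int:
    """Longest wait over all target slices, for cog 0 starting at clock 0."""
    return max(hub_wait_cycles(0, s, 0, slots) for s in range(slots))


def average_wait(slots: int) -> float:
    """Mean wait over all target slices, for cog 0 starting at clock 0."""
    return sum(hub_wait_cycles(0, s, 0, slots) for s in range(slots)) / slots


if __name__ == "__main__":
    # The wait depends only on the slot count, not on how many cogs exist.
    for slots in (16, 8, 4):
        print(slots, worst_case_wait(slots), average_wait(slots))
```

In this model a chip that keeps the 16-slot rotation has the same worst case (15 clocks) whether it has 4 cogs or 16, which is the "same timing" argument. Shrinking the rotation to 4 slots would cut the worst case to 3 clocks, which is the "faster but different" argument.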
BTW
Did you think about asking OnSemi about on-chip FLASH?
Did you think about asking OnSemi about on-chip oscillators and accuracy? (Microchip are claiming 0.25% accuracy for the factory-trimmed 16MHz PIC16F1454)
It was stated earlier that versions with fewer cogs would still use the 16-cycle eggbeater structure. So there is no latency advantage to having fewer cogs. I'm not sure why this is, but it is probably just to reduce the re-design effort, and it also makes the timing the same as a full 16-cog version that just uses fewer cogs.
Hopefully this discussion on different versions doesn't impact the schedule for the initial 16-cog version. I fear that discussions like this, such as the LUT-sharing discussion, are slowing down progress.
As always, the BIG print giveth (typ only) and the fine print taketh away...
"Note 1: The USB specifications indicate low-speed USB devices should have a USB transmit frequency
tolerance of ± 1.5%. If this setting is selected, it is recommended to either keep the application at room
temperature, to use active clock tuning from a 32.768 kHz crystal, or employ manual adjustments to
the OSCTUNE register to maintain the HFINTOSC within the ± 1.5% tolerance range."
"The USB specifications full-speed USB devices should have a USB transmit frequency tolerance of
± 0.25%. In order to meet this specification, the firmware must enable (at runtime) the active clock
tuning feature associated with the HFINTOSC."
ie the same as Silabs EFM8UB1, which locks to the 1ms USB frame rate.
I see the Microchip part cannot lock at LS USB.
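For a sense of scale, the quoted tolerances can be turned into absolute frequency windows. This is a small Python sketch; the 12 Mbit/s and 1.5 Mbit/s figures are the standard full-speed and low-speed USB signalling rates, and the helper names are made up for illustration.

```python
FULL_SPEED_HZ = 12_000_000  # full-speed USB signalling rate
LOW_SPEED_HZ = 1_500_000    # low-speed USB signalling rate


def tolerance_hz(nominal_hz: float, percent: float) -> float:
    """Allowed deviation in Hz for a +/- percent tolerance."""
    return nominal_hz * percent / 100


def within_tolerance(measured_hz: float, nominal_hz: float, percent: float) -> bool:
    """True if measured_hz lies inside nominal_hz +/- percent."""
    return abs(measured_hz - nominal_hz) <= tolerance_hz(nominal_hz, percent)


if __name__ == "__main__":
    print(tolerance_hz(FULL_SPEED_HZ, 0.25))  # full-speed window, Hz either side
    print(tolerance_hz(LOW_SPEED_HZ, 1.5))    # low-speed window, Hz either side
```

So full speed leaves about ±30 kHz around 12 MHz, which is why the datasheet leans on active clock tuning rather than on the untrimmed oscillator alone.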
Now you've got me very interested Chip, I must say. I would skip DIP altogether, as it is so easy to provide a DIP adaptor PCB, and a 38-pin TSSOP version seems to make sense to me, but is the 28-pin you mentioned TSSOP? Otherwise what you have there looks great. Now where is the pre-order link I can click........
If I may pipe up here for a moment....
This issue and many others were discussed, and it seemed that giving the programmer the option to choose Turbo speed for higher performance, or remain at legacy speed for compatibility might be nice. Chip was going to consider that.
I'm not sure where that is at just now.
Cheers,
Peter (pjv)
I've been thinking about this all day. I've come to the conclusion that it is best to optimize for cog count, which is the same as hub RAM slot count. This would make timing faster for smaller chips, while saving a ton of logic. It takes a lot of flops and muxes to pretend there are 16 slots when there aren't.
I've been looking through all the eggbeater-related circuits, figuring out what needs to be parameterized, so that they can optionally be made smaller. All these facets can be controlled from parameter definitions in the top Verilog file. That way, we could dial up any combination of cogs, hub RAM, smart pins and CORDIC that we want.
Reducing eggbeater metrics to model the smaller family variants will save enough logic that we can get more out of the smaller FPGA boards.
Because this is LSB mapped, this still needs to be a power-of-2, right ?
ie choices of 16,8,4
Good question. Probably not. Just have to reconfigure the HubRAM blocks to match the number of Cogs so that the switch covers all addressable space per Hub rotation. The switch logic requirements should be reasonably squareish as a result.
It'll be odd to think there could be decimal symmetry ... maybe better not to contemplate such evils ...
We're getting ahead of ourselves though aren't we?
Second thoughts, you're right JMG, dunno why I thought otherwise. The address mapping between rotation and block would be messy not mapping to a power of two.
To make the hardware most functionally efficient, we need to stick with powers of two. If we were going to stay with 16 slots, in every case, then we could have any number of cogs 1..16. We are going to tighten things up, though.
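A rough illustration of why powers of two are attractive, as a Python sketch. It assumes the "LSB mapped" scheme mentioned above means the low bits of the long address pick the hub slice and the remaining bits pick the row inside that slice; `split_address` and `is_power_of_two` are hypothetical helper names.

```python
def is_power_of_two(n: int) -> bool:
    """True for 1, 2, 4, 8, 16, ..."""
    return n > 0 and (n & (n - 1)) == 0


def split_address(long_addr: int, slots: int) -> tuple[int, int]:
    """Split a long address into (slice, row) using only a mask and a shift.

    Assumed mapping: slice = low bits, row = remaining high bits.
    """
    if not is_power_of_two(slots):
        # A count like 12 would need a real divide/modulo instead of wiring.
        raise ValueError("slot count must be a power of two for mask/shift mapping")
    bits = slots.bit_length() - 1
    return long_addr & (slots - 1), long_addr >> bits


if __name__ == "__main__":
    print(split_address(0x25, 16))
    print(split_address(0x25, 8))
```

With 16, 8 or 4 slots the split is just bit selection, which costs nothing in hardware. With something like 12 slots the same split would need a divider or a lookup, which is the "messy" mapping mentioned above.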
4, 8 and 16 cogs make sense and are, of course, powers of two.
Only other possible number making any sense would likely be 12, but this would require a hub of 16 slots with 4 unused. Only sense here would be if reducing to 12 gave a space advantage.
2 cogs wouldn't make much sense unless there was a huge hub ram, and a number of smart pins. Think of a multi P2 system where the 2 cog would be running a big program.
Otherwise, as a part of a dual P2 on the one die with a "shared dual port ram" between them. Possibly a much later design if the P2 family really takes off.
DIP28 and DIP40 would be awesome...now I can finally get excited about the P2. Without DIPs the P2 would probably be too expensive for me. I cannot afford $50 boards! As a side note I buy/use pic32 DIP28 all the time.
Don't anticipate DIP. There is a ground plate under the chip so probably only SMT chips!
I don't see a need for DIP; unlike in the 1990s, we now have OSH Park and OSH Stencils.
It lets you create small little boards for under $5 each.
Not hard to place by hand parts that have 0.8-0.65mm pin spacing when you use a mylar paste stencil.
And that bottom GND pad can be soldered by hand from the back side if you just put a big-enough hole in the PCB to accommodate the soldering iron tip. I think a 0.100" plated-through hole would be perfect, right under the middle of the chip.
I would assume the 4-cog version will not run as hot, and a 16-GPIO, 20-pin narrow DIP just for breadboard fun could not hurt,
even if it uses a 32-GPIO die. Just bury the extra pins, as using them for inter-cog communication would lead to incompatibility when moving up to a larger SMD part.
If no (same latency - wasted hub cycles for missing cogs) then code acts the same, but lacks significant hub speed improvements.
Damned if you do, damned if you don't !!!
Did the test chip get on the May Shuttle ?
With 640K you can rule the world.
With 512K you can, too, though you may have to leave out Antarctica and Greenland, or perhaps make them virtual.
A discussion for a later date when the Prop2 deluxe edition is done.
Try this...(because I really don't know:) the delta_latency is a constant. The system clock is a variable.
Therefore...
Just to state the obvious... those interested in modifying the Verilog sources down the road just got a huge helping hand:)
Even if the whole family gets shot by Ken.
Regards,
Rich
Some musings:
1) on smaller cog counts, a parameter to COGINIT determining if that cog's access is 1:16 or 1:num_cogs
2) SOIC-28 please...
3) DIP28 & DIP40 would be fantastic for education
4) fewer Vdd / Vio on TQFP-52 would allow TQFP-48 or TQFP-44
I can see using the SOIC-28 a LOT
Would it be possible to fit 8KB-32KB of OTP (or even better, flash), especially on the smaller parts?
Question...
Can a non smart pin still do ADC and pull-up/pull down ?
No, it would just be a digital I/O pin, controlled by DIR and OUT and returning IN.
Guess just important for volume users.
Personally, I think I'd just always use the big chip.