Internal ADC and other features will be available on each I/O pin.
Particularly, an internal DAC per pin also. And I remember Chip discussing with someone on strengthening the drive currents for direct VGA drive. This can benefit digital bus performance also. The I/O blocks are a beast.
I wouldn't call it the smartest idea to run the logic directly out to electrically harsh environments though. E.g., around your average polar fleece.
Right now, I am recoding portions of the Verilog source so that datapath elements will be properly inferred by the synthesizer, rather than pedestrian adder cells, and such. Without fast adders and multipliers, the critical paths become very long and the chip, if fabricated, would only operate at maybe 60MHz. We think 100MHz should be achievable (worst case process, voltage, and temp), but not the originally-hoped-for 160MHz.
There are two sine lookup ROMs per counter so that a Goertzel algorithm can be performed on a delta-sigma bitstream coming out of an I/O pin in ADC mode. This is for measuring energy and phase at the CTR's frequency (PHS += FRQ, using the 9 MSBs of PHS for phase). The point of this is to be able to output a sine wave on one pin (via a pin DAC), then measure the returning phase and energy on another pin (in ADC mode). Also, each CTR can operate as a biphase function generator, outputting sine, triangle, sawtooth, or square wave with 9-bit amplitude control, 9-bit offset control, and 9-bit relative phase control, with 31-bit frequency control (or 32-bit, if you count backwards, where FRQ => $8000_0000).
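For anyone who wants to play with the idea on a PC, here is a rough Python sketch of the measurement described above. It is not the Prop 2 hardware: it uses a plain quadrature correlation (a single-bin DFT, which yields the same quantity a Goertzel filter would) driven by a phase accumulator, PHS += FRQ, whose 9 MSBs index a sine table. The 32-bit accumulator width, the first-order `delta_sigma` modulator, and the function names are all assumptions made for illustration.

```python
import math

PHASE_BITS = 9                       # 9 MSBs of PHS select the table entry (from the post)
TABLE_SIZE = 1 << PHASE_BITS
SINE = [math.sin(2 * math.pi * i / TABLE_SIZE) for i in range(TABLE_SIZE)]


def delta_sigma(samples):
    """Hypothetical first-order delta-sigma modulator: floats in (-1, 1) -> bits."""
    integ = 0.0
    bits = []
    for x in samples:
        bit = 1 if integ >= 0 else 0
        feedback = 1.0 if bit else -1.0
        integ += x - feedback        # integrate the error between input and output
        bits.append(bit)
    return bits


def measure(bits, frq):
    """Correlate a 1-bit stream with sine and cosine at the NCO frequency.

    Returns (amplitude, phase_radians) of the component at that frequency.
    A 32-bit PHS register is assumed here.
    """
    phs = 0
    acc_sin = 0.0
    acc_cos = 0.0
    for b in bits:
        idx = phs >> (32 - PHASE_BITS)
        x = 1.0 if b else -1.0       # map the bit to +/-1
        acc_sin += x * SINE[idx]
        acc_cos += x * SINE[(idx + TABLE_SIZE // 4) % TABLE_SIZE]  # cosine = sine + 90 deg
        phs = (phs + frq) & 0xFFFFFFFF                             # PHS += FRQ
    amplitude = 2.0 * math.hypot(acc_sin, acc_cos) / len(bits)
    phase = math.atan2(acc_cos, acc_sin)
    return amplitude, phase
```

Feeding it a modulated sine whose period divides the measurement window evenly (for example FRQ = $0100_0000, i.e. 256 samples per cycle) should return roughly the amplitude and phase that went in; energy is just the square of the amplitude.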
Thanks for the update. I'm also interested in more detail on the instruction set. Maybe Roy can provide an update on that? He said he had to run it by you first..
Dave,
Sorry for the delay. I shouldn't have promised as fast a delivery as I had. Anyway, I have a few posts in the works, and I am waiting on Chip to review one, since my notes were lacking in a couple of spots. In the meantime Chip has been busy and produced a new instruction list that includes more changes, so I have to go back over things and make sure my stuff is still right. I don't want to release information that is wrong.
The information will come out. Please be patient though.
Among the many reasons to do this are to ensure that it is maintained from a single company source.
I'll guarantee that the page will be posted within a day of obtaining the final draft from Roy (with Chip's review). Thank you, Roy. We hope not to squish your enthusiasm at the expense of a bit of company formality.
Sorry.
...Also, each CTR can operate as a biphase function generator, outputting sine, triangle, sawtooth, or square wave with 9-bit amplitude control, 9-bit offset control, and 9-bit relative phase control, with 31-bit frequency control (or 32-bit..).
Can these 32b counters also do more mundane things like
* Capture on either/both edges, and clear-on-capture ( seen in newest NXP parts )
That HW detail matters, for measuring fast pulses, with clear-on-capture, you know you always have the pulse width, even if you have skipped some captures. With capture alone, you are never sure.
* Clock from an external pin, and toggle a pin, and ideally the Toggle Out, should optionally be the 'other counters' Capture pin. (or, build a /2^N Pin-Divider, which then drives capture, saves a complex counter, but costs a little silicon)
Chips with 32 bit counters AND 32 bit prescalers are appearing, and if they just connected the Prescaler OP to CtrClk OR Capture, that would work nicely.
The /2^N => Capture does not have to be a physical pin, it can be a buried connection.
With this simple silicon support, you can build Reciprocal Frequency Counters that have very high precision and dynamic range.
* Quadrature count, with noise filtering ?
* Usual PWM generate
If the counter is able to go 2x or 4x the core, allow it to do so. That extra time precision can be invaluable.
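To make the reciprocal-counter point concrete, here is a minimal Python sketch of the arithmetic. It assumes the arrangement suggested above: the input is divided by 2^N, that divided signal triggers a capture with clear-on-capture, and the captured value is the number of reference clock ticks in exactly 2^N input cycles. The function names and the `simulate_capture` model are hypothetical, not anything from the Prop 2 design.

```python
def simulate_capture(f_in_hz, ref_hz, divider_bits):
    """Hypothetical model of the hardware: return (input_cycles, ref_counts).

    The gate spans exactly 2**divider_bits input cycles; the counter only
    sees whole reference ticks, hence the truncation.
    """
    input_cycles = 1 << divider_bits
    ref_counts = int(input_cycles * ref_hz / f_in_hz)
    return input_cycles, ref_counts


def reciprocal_frequency(ref_hz, ref_counts, input_cycles):
    """Frequency estimate from a gate that covers a whole number of input cycles."""
    if ref_counts <= 0:
        raise ValueError("no reference ticks captured")
    return input_cycles * ref_hz / ref_counts


# Example: 100 MHz reference, input divided by 2**10 before capture.
cycles, counts = simulate_capture(12345.678, 100e6, 10)
estimate = reciprocal_frequency(100e6, counts, cycles)
```

The resolution is set by the number of reference ticks in the gate, not by the input frequency, which is where the precision and dynamic range come from: a slow input with a small divider and a fast input with a large divider both give a large tick count.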
I haven't seen any information on Spin 2. My hope is that Spin 2 will use the same bytecodes as Spin, and treat the unused $3C code as a prefix to another set of 256 opcodes. The extended opcodes could take advantage of the new features in Prop 2. The current Spin binary files can handle image sizes up to 64KB. A 128KB binary could be accommodated by using some of the bits in the header that are normally set to zero.
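As a rough illustration of the prefix idea (purely hypothetical, since nothing about Spin 2's encoding has been published), a decoder could treat $3C as an escape into a second page of 256 opcodes. The sketch below only splits a byte stream into (page, opcode) pairs; it ignores the operand bytes that real Spin bytecodes carry.

```python
PREFIX = 0x3C  # the unused Spin bytecode proposed as an escape prefix


def decode(stream):
    """Yield (page, opcode) pairs: page 0 = original set, page 1 = extended set."""
    it = iter(stream)
    for byte in it:
        if byte == PREFIX:
            ext = next(it, None)         # the byte after the prefix is the extended opcode
            if ext is None:
                raise ValueError("truncated extended opcode")
            yield (1, ext)
        else:
            yield (0, byte)


# Example: two ordinary opcodes around two extended ones.
ops = list(decode(bytes([0x35, 0x3C, 0x01, 0x3C, 0x3C, 0x32])))
```

Existing binaries never contain $3C as an opcode, so under this scheme they would decode exactly as before.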
My understanding is that Prop 2 will not contain a Spin interpreter in ROM. The interpreter will need to be loaded from RAM. Maybe it can be linked into the Spin 2 binary file for each Spin program.
Dave,
I don't think that would be a good plan really. Since the spin interpreter(VM) is loaded from external memory, it's possible to have a Spin 1 version if you really need it, otherwise run Spin 2 and let it be tuned to what's best for the Prop 2. I don't think there is a really compelling reason for them to be binary compatible. Source compatible is enough, in my opinion.
I suspect Spin1 programs will run faster than Spin2 programs, if its VM still fits in one cog and doesn't have to resort to the LMM to execute. OTOH, the LMM on the Prop2 might be fast enough to overcome any of the time inefficiencies extant in Spin1 that were required to shoehorn its VM into 496 longs. It'll be interesting to see what the trade-offs are.
A long time ago (years) I sent my faster spin interpreter to Chip. By utilising the hub bytecode decoder and subroutine executer, spin was about 25% faster, but the maths flew. With the new spin interpreter being soft on PropII, and LMM being utilised for the relatively unused bytecodes, some real speed improvements should be possible. And all this is aside from all the new instructions and speed of the propII.
Even if the propII were only clocked at 100MHz and comparing that to a Prop 1 at 100MHz, spin should be, at a guess, 6x-8x faster or more!!! Remember, push & pop instruction sequences are used for almost all bytecodes and these alone are down to single instructions, as will be the bytecode fetch sequence. Hub is 2x faster without any tricks. Cogs are 4x faster. IIRC the save/restore flags are also single instruction too.
Once I get the info on the instruction set I will revisit the spin interpreter for the propII. At that time I will also revisit it for prop1. Now, what is going to suffer??
I was wondering if you could give us an update on the prop 2 chip. You guys have been very quiet lately
and I hope you didn't run into any problems with the design.
The status right now is about the same as what we have released at the most recent EXPOs. We are working hard to finalize some of the synthesized logic, and there is a memory block consisting of RAM and ROM that needs to be laid out. When I get schematics from Chip I will begin work on that... for now I am busy with the buss lines that travel around the entire chip, bringing them to a point where the synthesized logic will be able to connect to them.
Not terribly exciting... I mean, what I meant to say was that it was 'tedious' and that it would cause most humans to 'tear up' for no apparent reason... :-)
This video only shows wiring 3 of the 92 inputs from the I/O.
Each I/O has 4 wires to the buss, so for 92 I/O's total that's 368 wires.
Another buss is for the DACs located on each I/O that require 296 wires.
Keeping track? That's a total of 664 wires on the buss that must connect to the chip's core. This is divided so that 332 wires are on the East side and 332 wires on the West side.
Comments
Particularly, an internal DAC per pin also. And I remember Chip discussing with someone on strengthening the drive currents for direct VGA drive. This can benefit digital bus performance also. The I/O blocks are a beast.
I wouldn't call it the smartest idea to run the logic directly out to electrically harsh environments though. E.g., around your average polar fleece.
Sorry.
Right now, I am recoding portions of the Verilog source so that datapath elements will be properly inferred by the synthesizer, rather than pedestrian adder cells, and such. Without fast adders and multipliers, the critical paths become very long and the chip, if fabricated, would only operate at maybe 60MHz. We think 100MHz should be achievable (worst case process, voltage, and temp), but not the originally-hoped-for 160MHz.
There are two sine lookup ROMs per counter so that a Goertzel algorithm can be performed on a delta-sigma bitstream coming out of an I/O pin in ADC mode. This is for measuring energy and phase at the CTR's frequency (PHS += FRQ, using the 9 MSBs of PHS for phase). The point of this is to be able to output a sine wave on one pin (via a pin DAC), then measure the returning phase and energy on another pin (in ADC mode). Also, each CTR can operate as a biphase function generator, outputting sine, triangle, sawtooth, or square wave with 9-bit amplitude control, 9-bit offset control, and 9-bit relative phase control, with 31-bit frequency control (or 32-bit, if you count backwards, where FRQ => $8000_0000).
Chip
Thanks for the update. I'm also interested in more detail on the instruction set. Maybe Roy can provide an update on that? He said he had to run it by you first..
Dave
Sorry for the delay. I shouldn't have promised as fast a delivery as I had. Anyway, I have a few posts in the works, and I am waiting on Chip to review one, since my notes were lacking in a couple of spots. In the meantime Chip has been busy and produced a new instruction list that includes more changes, so I have to go back over things and make sure my stuff is still right. I don't want to release information that is wrong.
The information will come out. Please be patient though.
Ouch, that's a big drop from expected.
C.W.
Hey all - I've asked that Roy provide the draft instruction set to Parallax so we can post it in the official place where it belongs: http://www.parallaxsemiconductor.com/Products/propeller2specs
Among the many reasons to do this are to ensure that it is maintained from a single company source.
I'll guarantee that the page will be posted within a day of obtaining the final draft from Roy (with Chip's review). Thank you, Roy. We hope not to squish your enthusiasm at the expense of a bit of company formality.
Ken Gracey
Can these 32b counters also do more mundane things like
* Capture on either/both edges, and clear-on-capture ( seen in newest NXP parts )
That HW detail matters, for measuring fast pulses, with clear-on-capture, you know you always have the pulse width, even if you have skipped some captures. With capture alone, you are never sure.
* Clock from an external pin, and toggle a pin, and ideally the Toggle Out, should optionally be the 'other counters' Capture pin. (or, build a /2^N Pin-Divider, which then drives capture, saves a complex counter, but costs a little silicon)
Chips with 32 bit counters AND 32 bit prescalers are appearing, and if they just connected the Prescaler OP to CtrClk OR Capture, that would work nicely.
The /2^N => Capture does not have to be a physical pin, it can be a buried connection.
With this simple silicon support, you can build Reciprocal Frequency Counters that have very high precision and dynamic range.
* Quadrature count, with noise filtering ?
* Usual PWM generate
If the counter is able to go 2x or 4x the core, allow it to do so. That extra time precision can be invaluable.
My understanding is that Prop 2 will not contain a Spin interpreter in ROM. The interpreter will need to be loaded from RAM. Maybe it can be linked into the Spin 2 binary file for each Spin program.
I don't think that would be a good plan really. Since the spin interpreter(VM) is loaded from external memory, it's possible to have a Spin 1 version if you really need it, otherwise run Spin 2 and let it be tuned to what's best for the Prop 2. I don't think there is a really compelling reason for them to be binary compatible. Source compatible is enough, in my opinion.
Good point.
That gives anyone the possibility to write their own interpreter too, and load it instead of Spin.
-Phil
Even if the propII were only clocked at 100MHz and comparing that to a Prop 1 at 100MHz, spin should be, at a guess, 6x-8x faster or more!!! Remember, push & pop instruction sequences are used for almost all bytecodes and these alone are down to single instructions, as will be the bytecode fetch sequence. Hub is 2x faster without any tricks. Cogs are 4x faster. IIRC the save/restore flags are also single instruction too.
Don't forget the possibility of having the stack in CLUT memory, with a single instruction to save/restore.
Can you give us an update on the Propeller 2 chip? My company needs to make some critical decisions
and a status report from you might help.
Russ
Hi Ken, Roy
Any chance of the draft instruction set document for the Propeller 2 appearing sometime soon?
Ross,
I was wondering if you could give us an update on the prop 2 chip. You guys have been very quiet lately
and I hope you didn't run into any problems with the design.
The status right now is about the same as what we have released at the most recent EXPOs. We are working hard to finalize some of the synthesized logic, and there is a memory block consisting of RAM and ROM that needs to be laid out. When I get schematics from Chip I will begin work on that... for now I am busy with the buss lines that travel around the entire chip, bringing them to a point where the synthesized logic will be able to connect to them.
Other than busy, that's my story.
It is generally wiser to under promise and over deliver than to do things the other way round ...
Ross.
Sorry to keep bugging you about this, but my curiosity keeps getting the best of me ...
Russ
This video only shows wiring 3 of the 92 inputs from the I/O.
Each I/O has 4 wires to the buss, so for 92 I/O's total that's 368 wires.
Another buss is for the DACs located on each I/O that require 296 wires.
Keeping track? That's a total of 664 wires on the buss that must connect to the chip's core. This is divided so that 332 wires are on the East side and 332 wires on the West side.
Video: Wiring up the I/O's on the buss