Putting smarts into the I/O pins
cgracey
I was thinking today that CTRs do a lot of things to eventuate a state on an I/O pin, or take some reading. These complicate the cogs, of course.
How about making a simple, flexible state machine that is built onto each pin? It could do PWM, duty modulation, frequency measurement, state timing, ADC accumulation, even A/B encoding between two pins. It could be a UART, as well!
On the Prop2, there is a special signal that goes to each pin, in addition to DIR and OUT, called ALT. It is used to send serial messages to configure the pin for special modes. Prop2 has a CFGPINS instruction which sends a serial bit stream to any number of I/Os on a given port via the ALT signals, using a 32-bit mask. We could get by with using the DIR signals, instead, for this purpose, since software (taking 2 clocks per instruction) could never toggle DIR fast enough to start a %010xxxxx_xxxxxxxx_xxxxxxxx_xxxxxxxx message by accident. The state machine on the pin would configure itself according to the message and operate accordingly. The DIR signal would be held high to keep the pin in the special mode. The OUT signal would then be a live input to the configured pin. Pins could send messages back via the IN signals. A simple shifter in each cog could receive a message from any pin, shifted back at the clock rate. If DIR were to go low, the pin would revert back to normal mode. This way, cogs that configure pins, but then shut down, release the pins they put into special modes.
What stuff can we make the pin state machines do? Once configured, we have OUT going to the pin and IN coming back from the pin.
Comments
Can pin pairs see each other? I am thinking differential input and output.
Love the simple uart idea. I'd need to see docs on what the state machine can do, how many states etc to really get my gears turning... sounds REALLY promising!
Interesting idea, smarter Pins are always good.
Can you expand on what registers are supplied with this ?
32 bit count plus modulus and capture/compare gets you to 96 bits already, 128 if you capture both edges.
UART needs (say) RxDiv, TxDiv, RxREG, TxREG. Are those 'shared' with the counter registers?
There are cases where atomic enable is important. Would each pin be able to map as one bit in 32?
That would allow start/reset/enable on up to 32 in a single time slot.
Addit: If all COGs could see these few per-pin registers, this opens a path for COG-COG communication, using some simple rules.
-Phil
Actually, the CTRs are one of the slowest-to-develop features of any Propeller. To decouple that mess from the cog is fantastic, because it simplifies the heck out of the cog design and allows the pin functions to be developed separately. We will need some CTRs for other things, but they could be a subset of what they are now on the Prop1.
Imagine we have these instructions:
OUTMSG D,S/# - send message in S to all OUT's represented by D bits. D even/odd selects OUTA/OUTB
DIRMSG D,S/# - send message in S to all DIR's represented by D bits. D even/odd selects DIRA/DIRB. Leave the affected DIR's high to maintain mode.
WAITMSG D,S/# - wait for next message from pin S/#. Configured pins keep resending their messages, so you'll get the next one.
That's all the cog interface we need to do this pin scheme. This kicks all kinds of complexity out of the CTRs.
What are you worried about losing in CTR functionality?
-Phil
1) You configure the 16 pins for PWM in parallel using DIRMSG MASK,MODE. MODE includes the PWM type and frame count.
2) For each pin, you do OUTMSG PINMASK,ONCOUNT. This sets up the ONCOUNTs for the next PWM frame.
3) Do WAITPR PINMASK to wait for a rising edge from one of the configured pins (they're all sync'd, so any pin will do)
4) Go to (2) to update all the ONCOUNTs.
In the case of the Prop1 CTRs, perhaps nothing. I just didn't want to get stuck in the morass of complicating those counters to meet some higher expectations, like on the Prop2. There is a lot of interplay between CTRs and cogs, and to standardize what that is is soothing to my mind.
Man, you are good at coming up with some complex scenarios.
Consider how this could work in the messaging arrangement I proposed.
I'm not quite following who sets the cadence here. Is this SW timed, or some HW at the pin, SW fed?
I think ONCOUNT is a pin-local value, single buffered for a toggle action?
Does this example have a minimum pulse width that can be generated?
Here is a good 'killer' case
Can 16 PWMs generate 16 pulses with aligned leading edges, and pulses of 1..16 clocks wide?
Another test case:
The need to both time-capture, and edge-count, on the same pin.
That one is the classic Reciprocal Frequency Counter.
That's nice and all, but it doesn't answer my question.
I understand about the PLLs and the synthesized logic issue. Bummer, but that's the way it goes. Despite the DACs, DUTY mode is still nice to have. The logic modes are vital, as are the special analog modes. The P1 is like a set of very basic Tinkertoys that you can put together in unimagined ways. Any specialized port functions may be wonderful and efficient, but if they kill the Tinkertoys, they kill the Prop's overall elegance.
-Phil
Once configured, a pin does its thing. In the case of PWM, the frame would commence at the terminus of the mode message via DIRMSG. After that, you would have a whole frame period in which to update the register (via OUTMSG) that will get loaded into the PWM counter at the beginning of the next frame (or the mode message might have contained the initial value). The minimum frame time would be related to the OUTMSG time, which would probably be 18 clocks for a 16-bit data payload (%1dddddddddddddddd0 - high start bit, data, low stop bit).
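The 18-clock OUTMSG frame described above (high start bit, 16 data bits, low stop bit) can be sketched as follows. This is only an illustration: the function names are hypothetical, and sending the data MSB-first is an assumption, since the thread does not state the bit order.

```python
def encode_outmsg(value):
    """Return the 18 line levels of one message, first-sent bit first.

    Framing follows the thread: %1dddddddddddddddd0.
    MSB-first data order is an assumption, not stated in the thread.
    """
    if not 0 <= value <= 0xFFFF:
        raise ValueError("payload must fit in 16 bits")
    data = [(value >> i) & 1 for i in range(15, -1, -1)]
    return [1] + data + [0]


def decode_outmsg(bits):
    """Recover the 16-bit payload from 18 received line levels."""
    if len(bits) != 18 or bits[0] != 1 or bits[-1] != 0:
        raise ValueError("bad framing: expected high start bit and low stop bit")
    value = 0
    for b in bits[1:-1]:
        value = (value << 1) | b
    return value
```

One bit goes out per clock, so the length of the encoded list is also the message time in clocks.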
But would it be so bad to have some helper stuff in the pins, themselves, to take care of high-bandwidth headaches that would otherwise hold a CTR hostage - and maybe even the cog?
Imagine a pin configured to sum up some number of ADC feedback pulses, and continually send the current value back via the IN pin, where you could use a WAITMSG instruction to grab the next message over the next umpteen clocks? Wouldn't that be awesome? It seems freeing to me, and I love the idea that the details of this don't get mixed up in the cog design.
1..16 clocks wide, yes, but the frame size would need to be big enough to allow all PWM counts to be reloaded in time.
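As a rough budget for that constraint: if each pin gets its own 18-clock OUTMSG and the messages go out one after another, the frame has to last at least as long as all of them together. The helper below is a hypothetical back-of-envelope sketch; it assumes no overlap between messages and, by default, no instruction overhead between them.

```python
OUTMSG_CLOCKS = 18  # start bit + 16 data bits + stop bit, per the thread


def min_frame_clocks(num_pins, overhead_clocks=0):
    """Smallest frame (in clocks) that lets every pin be reloaded once.

    Assumes one OUTMSG per pin, sent back to back. overhead_clocks is a
    hypothetical per-message allowance for instruction time.
    """
    if num_pins < 1:
        raise ValueError("need at least one pin")
    return num_pins * (OUTMSG_CLOCKS + overhead_clocks)
```

Under these assumptions, 16 pins need a frame of at least 16 x 18 = 288 clocks, while the pulse widths themselves can still be as short as one clock.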
What do you mean by time-capture and edge-count on the same pin? It sounds like you are measuring period and frequency at the same time. Sounds intriguing, but I'm missing something.
...This serial thing kind of brings back to mind the CRU bus on the TMS9900 CPU, only much faster of course.
What you call a PWM counter here is a down counter, setting a time-to-next-toggle, right?
The host SW has to poll the pin and then send a next-time value?
That gives a minimum time of [poll_end+message_send], and the value in the counter is not the True Pulse Width, but added clocks?
ie True Pulse Width = Time[poll_end+message_send] + Sent_Value?
I think more registers will be needed 'at the pin', but a serial means to (re) configure is probably ok, at ~18 clocks.
The use cases I can think of, do not update rapidly, but the pin-precision does need to be 1..N clock accurate.
-Phil
Maybe we're talking past each other here, but you'd set a value to be used in the next frame. It would get picked up and used when the time came. You could create a 0-length pulse, a 1-clock pulse, or any other size pulse, up to and including the frame count. Is that not PWM?
Ah, so there is single buffering, and the width you send is what will next be loaded when this one finishes?
That's 2/3 of the register resource needed for True PWM.
If you just add a 3rd local register for PWM compare value it means the COG can step back, and not need to continually send the two alternating 'next_width' values.
Yes, each (rising) edge increments an edge count, and each edge can also, (under SW message control) capture values from a SysCLKd TIME counter and the Edge Counter.
ie one edge can fire 3 events, one INC and 2 CAPT on Two counters.
Capture of both values needs to be atomic, so the T,E values are free of aperture errors and are known to be for the same edge. Those are then read at leisure, and calculations done.
Usually, on a Classic uC, you can set up 2 timers to do this, but often the Atomic is tricky if the capture enables are in two separate SFR registers.
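The calculation done "at leisure" in a reciprocal frequency counter can be sketched like this. Each capture is an atomic (time, edge count) pair taken on the same edge; the frequency is the change in edge count divided by the change in time. The names, the 32-bit counter width, and the clock frequency passed in are all assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Capture:
    time: int   # system-clock count latched at the edge
    edges: int  # edge count latched at that same edge


def reciprocal_frequency(first, second, clock_hz, counter_bits=32):
    """Estimate input frequency in Hz from two atomic captures.

    Differences are taken modulo the counter width, so a single
    wrap-around between the two captures is handled.
    """
    mask = (1 << counter_bits) - 1
    dt = (second.time - first.time) & mask    # elapsed system clocks
    de = (second.edges - first.edges) & mask  # whole input periods elapsed
    if dt == 0:
        raise ValueError("captures must be at least one clock apart")
    return de * clock_hz / dt
```

Because both values belong to the same edge, the result covers a whole number of input periods, which is what makes the atomic capture matter.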
Phil, this simplifies the cog design. I can now focus on getting that thing tuned up without the complex interplay that a counter sophisticated enough to handle the new pins would introduce. This is an awesome thing! And we can develop those pin circuits outside of worrying about how it affects the cog.
I'm in phase now - I now realize there is a buffer register included. Do you have 2 x 32-bit locals, one storing and one counting?
That's why I was asking how many pin-local registers this uses.
Do the basic I/O instructions still work the same way? ie Can I set OUTA = #0 and then toggle DIRA to enable pull-down?
How much delay will be inherent for setting OUTA to it appearing on the pin? And for reading the pin with INA?
Are these little state m/cs (per pin) going to chew much power?
Might these things be better left for P2+ ???
Sounds worth exploring, at least.
The Standard counter operation does need some fixes, but ideally anything should be a super-set approach.
There is some fundamental sense in acting at the pin, so long as the logic is tolerable.
Can this local logic pack under the bonding pads ?
We're talking about the same thing, I think. When you configure the PWM, you set the number of clocks in a frame. You then feed it values via OUTA which get captured and loaded at the beginning of each frame. If you don't send a new value, it just reuses the old one. The cog can do whatever else it wants. Isn't this PWM?
But why is one measurement or the other not adequate? Both values may not even correlate well.
It would be near the pin. Doing this also lets me get rid of all the 1.8V configuration junk that is now in our I/O pin circuit. We would feed probably 20 signals into the actual I/O pin, which would be almost stateless, itself. This gets the digital side into the synthesized logic where it will be closed reliably.
My reading is SW I/O is completely unchanged, this is an optional-enable.
When not used, the power is only in the clock routing, and that was already going to the pins.
If they let the COGs sleep longer these smarter pins can actually slash power.
They only take power if you use them.
Without configuring a pin via DIRMSG, it acts as a normal digital I/O. If a pin has been configured for some special mode, the DIR bit is left high. If that DIR bit then goes low, the mode is cancelled and the pin reverts to being a normal I/O. DIR can be set high or low in software without triggering any special modes, because instructions take two clocks, so the pin will see only this kind of activity on DIR: %0011001100... It will never see the %010 message header as a result of normal instructions. Only the DIRMSG instruction will create that fast pattern to signify that a message is coming. The pin, when it sees that message, will hold its current mode, and then switch to the new mode in one clock, after the message is complete.
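The header argument can be checked with a small model. If software can change DIR at most once per two-clock instruction, every level lasts at least two clocks, so a lone 1 between two 0s never shows up. The function names below are hypothetical, and the model only looks for the %010 header, not the message that follows it.

```python
def find_header(dir_samples):
    """Return the index just past the first 0,1,0 pattern, or None.

    dir_samples is DIR sampled once per clock.
    """
    for i in range(len(dir_samples) - 2):
        if dir_samples[i:i + 3] == [0, 1, 0]:
            return i + 3
    return None


def software_dir_stream(levels, clocks_per_instruction=2):
    """Expand one DIR level per instruction into per-clock samples.

    Models ordinary software writes, where each level is held for a
    whole instruction (two clocks, per the thread).
    """
    stream = []
    for level in levels:
        stream.extend([level] * clocks_per_instruction)
    return stream
```

For example, `software_dir_stream([0, 1, 0, 1])` gives `[0, 0, 1, 1, 0, 0, 1, 1]`, in which `find_header` finds nothing, while a stream containing a single-clock 1 is detected.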
We are going to need something like this, anyway, to set the analog modes. This whole proposal is just about augmenting the concept, I now realize.
Exactly! And good observation about power. I didn't realize that, but it's true. Better to have the cog in a WAITCNT while the measly pin circuit does its job.
Now I am confused.
If you fail to update a Buffered-Toggle Counter, then you get 50% output on the repeated count-value.
Not quite PWM.
If you want true PWM, you need (min) 3 sets of local registers. One counting, and one holding the Reload value and one holding the compare value, and usually that compare is also buffered, so it delay-updates on Counter-reload, making 4 regs.
This PWM can go to 0 on a register value, and a single register sets PWM, the other sets Freq.
An alternative is two Reload registers, that alternate to set TimeH and TimeL, if users want fixed frequency PWM they need to change both values, and this cannot go to 0% in the basic form. (single clock width pulses are the limit)
Like this:
1) You set PWM mode with a frame_count and, for the sake of discussion, an initial on_count.
2) X=0, Y=on_count
3) if X == frame_count then X=0 and Y=on_count, else (X++, if Y<>0 then Y--, pin output = Y<>0)
4) accept new on_count via OUTMSG, while staying in (3)
That's what I've always understood PWM to be, anyway. There's a frame_count and an on_count which fits within the frame_count.
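One way to read steps 1 to 4 is the clocked model below. It is a sketch, not the proposed hardware: the class and method names are hypothetical, the pin level is taken from Y before it is decremented, and the frame is assumed to last exactly frame_count clocks so that on_count can range from 0 up to and including the frame count.

```python
class PinPwm:
    """Toy model of a per-pin PWM state machine with a buffered on_count."""

    def __init__(self, frame_count, on_count):
        if frame_count < 1:
            raise ValueError("frame_count must be at least 1")
        self.frame_count = frame_count
        self.pending = on_count  # last value received via OUTMSG
        self.x = 0               # position within the current frame
        self.y = on_count        # remaining high clocks in this frame

    def outmsg(self, on_count):
        """Buffer a new on_count; it takes effect at the next frame start."""
        self.pending = on_count

    def clock(self):
        """Advance one clock and return the pin level for that clock."""
        level = 1 if self.y != 0 else 0
        if self.y != 0:
            self.y -= 1
        self.x += 1
        if self.x == self.frame_count:  # end of frame: reload from buffer
            self.x = 0
            self.y = self.pending
        return level
```

With `PinPwm(8, 3)` the first frame is three high clocks followed by five low ones. If no new value arrives, the same on_count is reused for every following frame, and an on_count of 0 keeps the pin low for the whole frame.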