A reload in both modes waits for the next base period rollover.
Both screenshots show that. How is that not the same?
From the first trigger, Transition mode waits one dT until the first edge, then toggles every dT.
Pulse mode has a long gap/delay of 2x the pulse width before the pulses start.
The pulse mode is still only waiting 1x base period.
The pulse width is a fraction of the base period as set by the x[31:16] value.
In this case half the base period.
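For anyone following along, here is a rough Python sketch of how the two 16-bit fields could be packed into the single X value. It assumes the base period sits in X[15:0] and the pulse setting in X[31:16]; `pack_x` and `unpack_x` are hypothetical helper names, and exactly how the upper field maps to high time is not settled by this thread.

```python
def pack_x(base_period, compare):
    """Pack two 16-bit fields into one 32-bit X value (hypothetical helper).

    Assumption: base period in X[15:0], pulse setting in X[31:16].
    """
    if not 0 <= base_period <= 0xFFFF:
        raise ValueError("base_period must fit in 16 bits")
    if not 0 <= compare <= 0xFFFF:
        raise ValueError("compare must fit in 16 bits")
    return (compare << 16) | base_period


def unpack_x(x):
    """Split a 32-bit X value back into (base_period, compare)."""
    return x & 0xFFFF, (x >> 16) & 0xFFFF


# Half the base period, as in the test described above.
x = pack_x(base_period=60, compare=30)
print(hex(x))        # 0x1e003c
print(unpack_x(x))   # (60, 30)
```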
@jmg
Here's the same pulse test with the first pulse pin with a shorter pulse width but same base period.
Yes, I still think that long lead-in time is incorrect, as is the hand-over effect.
In most user cases, a more immediate output-from-trigger is expected.
If someone is doing FM modulation / FSK, there is a case for no invalid frequencies, in which case Toggle mode with handover fits.
On MCUs there are PWM modes, where you can vary both Threshold and Reload values separately, on the fly.
i.e. you can change frequency slightly, and PWM still behaves normally.
I'm not sure how you would do that from these traces ?
Consider data from a QSPI interface.
It's easy enough to set up a streamer to burst nibbles at x frequency for n clocks.
The streamer has its own NCO clock and no external clock input/output.
So a smartpin can be used to generate n clocks for the QSPI.
It starts getting messy having to fudge the launch time of the smartpin to sync with the streamer.
It would be nice if these two hardware blocks cooperated with each other.
dirl    #tpin           'hold the Smartpin in reset
wxpin   #5, #tpin       '5-clock period for the shortest setup lag
dirh    #tpin           're-sync the Smartpin to the program execution
wxpin   #60, #tpin      'Setup: 60 clocks per transition period (operate on whole periods)
wypin   #4, #tpin       'Setup: 4 transitions (this starts a fresh countdown of Y)
outh    #marker         'the 5 clocks have expired and the new setup takes effect
nop
outl    #marker
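To show why the throwaway 5-clock period helps, here is a small tick-by-tick model in Python. It is only a sketch under stated assumptions: the counter starts when DIR goes high (clock 0 here), a period written mid-count is only loaded at the next rollover, and instruction execution latency is ignored. `rollover_times` is a made-up name, not anything from the P2 tools.

```python
def rollover_times(initial_period, new_period, write_clock, count):
    """Return the clocks at which the base counter rolls over.

    Assumed model: counting starts at clock 0 (release from reset) and a
    period written mid-count is only loaded at the next rollover.
    """
    pending = initial_period      # period that will be loaded at rollover
    remaining = initial_period    # clocks left in the current period
    clock = 0
    times = []
    while len(times) < count:
        if clock == write_clock:
            pending = new_period  # the write lands; the running period is unaffected
        clock += 1
        remaining -= 1
        if remaining == 0:
            times.append(clock)
            remaining = pending   # reload happens only here
    return times


# Short dummy period first: the 60-clock setup takes over after 5 clocks.
print(rollover_times(5, 60, write_clock=2, count=3))    # [5, 65, 125]
# Without the trick the first boundary is a full 60 clocks away.
print(rollover_times(60, 60, write_clock=2, count=2))   # [60, 120]
```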
I've noticed that, unlike the period (X) setting, the transition/pulse (Y) setting has to be set after the pin is released from reset. It appears to be zero until set. The reason is probably that Y is directly modified by the countdown of transitions/pulses, i.e. its starting value is lost as soon as the pin acts on it.
Yeah, that's mentioned in Chip's documentation. Answering an earlier question (I think), here's what I would like to see:
* The internal counter (whose period is set by WXPIN) is not counting when DIR is low.
* WYPIN implicitly sets DIR high.
* When DIR is transitioned (either explicitly or implicitly) from low to high, the internal counter is reset and starts counting.
So that...
* If you want to have the mode's behavior start on a period boundary (like it does now), you issue a DIRH followed by a WYPIN (within the first period's duration).
* If you want to have the mode's behavior start immediately, you just issue the WYPIN (which implicitly sets DIR high).
For instance, in pulse mode, issuing WYPIN when DIR is low would cause the output to immediately go high. On the other hand, if you first issued a DIRH immediately followed by a WYPIN, then the output wouldn't go high until the end of the first period.
In transition mode, issuing WYPIN when DIR is low would cause an immediate transition. On the other hand, if you first issued a DIRH immediately followed by WYPIN, then the first transition wouldn't occur until the end of the first period.
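The proposed rules can be sketched as a tiny Python model. This describes the behaviour being asked for, not what the hardware currently does. `first_output_clock` is a hypothetical name, and treating a WYPIN that lands exactly on a boundary as waiting for the following boundary is my own assumption.

```python
def first_output_clock(period, wypin_clock, dirh_clock=None):
    """Clock of the first output event under the proposed rules.

    dirh_clock=None means DIR was still low when WYPIN was issued, so WYPIN
    raises DIR itself and the output acts immediately.
    Otherwise the counter has been running since dirh_clock and the output
    waits for the next period boundary after WYPIN.
    """
    if dirh_clock is None:
        return wypin_clock
    if wypin_clock < dirh_clock:
        raise ValueError("WYPIN must not precede DIRH in this model")
    elapsed = wypin_clock - dirh_clock
    periods = elapsed // period + 1   # next boundary strictly after WYPIN
    return dirh_clock + periods * period


# WYPIN alone: immediate start.
print(first_output_clock(60, wypin_clock=10))                 # 10
# DIRH at clock 8, WYPIN at clock 10: start at the end of the first period.
print(first_output_clock(60, wypin_clock=10, dirh_clock=8))   # 68
```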
Consider data from a QSPI interface.
It's easy enough to set up a streamer to burst nibbles at x frequency for n clocks.
The streamer has its own NCO clock and no external clock input/output.
So a smartpin can be used to generate n clocks for the QSPI.
It starts getting messy having to fudge the launch time of the smartpin to sync with the streamer.
It would be nice if these two hardware blocks cooperated with each other.
Agreed, this cooperation detail will be quite important.
* The internal counter (whose period is set by WXPIN) is not counting when DIR is low.
* When DIR is transitioned (either explicitly or implicitly) from low to high, the internal counter is reset and starts counting.
Those two are already true. That's why a reset is predictable as per my above example that Oz just tested.
* WYPIN implicitly sets DIR high.
Good idea. For a moment I was thinking it messed with gang synchronising but then realised the only way to do it at the moment is with my above method and that has to already be ticking before Y is set, so ganging doesn't change, but singling becomes simpler.
It takes another register in the Smartpin, but being able to preset Y before releasing the reset would probably be the best option.
Yes. As well as that, it seems like it could also be useful to have a register/bit to atomically sync the Streamer ?
I think? Chip already has it so you can prime many smart pins, and start all on one clock edge, so that feature needs to be retained.
Ah .... how about an implicit DIRL at the end of the count instead. It should be able to accept a preset Y without a whole other register that way.
The tricky part for Chip is to trigger off both the termination of the transition count at the start of the final period and then the controlling DIR low at the end of the final period. There are probably a couple of extra state bits needed.
He starts the Streamer first and has simply experimented to find out how many words are read/written by the Streamer ahead of the Smartpin clocking actually transferring any data to/from the HyperRAM. Then has subsequently accommodated that in the Streamer buffering.
He appears completely oblivious to the leading and trailing null periods in that Smartpin mode but is not really any the worse off for it.
That's a clear win in my books.
What's a clear win? That he wasn't affected by the issues we are discussing? It's good to know that the primary intent of the mode works, but I still think we need better control over the leading/trailing (depending on how you look at it) delays.
It's a win because it was so easy to use. And the fact that for a databus interface management it's not as important as we first presumed to have full control of the starting point. Just as long as it's predictable. And a Smartpin reset is predictable.
The buffer formatting compensation used by Richard will be difficult to remove altogether, methinks. There would have to be a 1-of-64 gating or something added to the Streamers.
I feel the only real issue, for stepper motion control, that Chip should have a look at, is eliminating the trailing null period on mode %00100.
Yes EvanH, you are right - I was oblivious to the leading and trailing null periods in the Smartpin mode that I used in my HyperRAM code !
Back in early June I did remark that it would be very useful to have some documentation/control over the relative timing of streamer/smart pin operation.
I’ve long wanted a HyperRAM solution up-and-running to use in more demanding scientific applications - and with the fairly limited information I had at the time I just took the practical, experimental approach and used a logic analyzer to work out how to get the streamer and Smartpin to cooperate. It’s great to see that all the subtle details are being discussed/fleshed out here as they’ll help us see how much one can do with/what limitations there are when working with a P2 in real-world applications.
Looking back at my code again, I set the Smartpin for 2 cycles high, with a period of 4 cycles (generating HR CLK), while the streamer data transfer rate was configured with a setxfrq ##$4000_0000. At one stage I wondered about pushing data transfer rates even higher by going to Smartpin 1 cycle high/2 cycle period and changing the setxfrq long to $8000_0000, but maybe those leading/trailing delays you guys mention would be problematic. There would also likely be a major issue getting correct read timing, which really needs to use the RWDS signal (delayed from clock edges by 7 ns). Quite a bit more complicated than what we have at the moment - but I'm sure someone will find a way...
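The pairing of those numbers can be checked with a few lines of Python. This is a sketch that assumes the streamer NCO is a 32-bit phase accumulator producing one transfer per overflow, which is what the 4-cycle/$4000_0000 and 2-cycle/$8000_0000 pairings above imply; the function names are made up for illustration.

```python
NCO_MODULUS = 1 << 32   # assumed 32-bit phase accumulator


def clocks_per_transfer(frq):
    """System clocks between streamer transfers for a given setxfrq value."""
    return NCO_MODULUS / frq


def frq_for_period(clocks):
    """setxfrq value that gives one transfer every `clocks` system clocks."""
    return NCO_MODULUS // clocks


print(clocks_per_transfer(0x4000_0000))   # 4.0
print(clocks_per_transfer(0x8000_0000))   # 2.0
print(hex(frq_for_period(4)))             # 0x40000000
```

Under that assumption the streamer rate lines up with the Smartpin clock period in both cases, so only the relative start time has to be tuned by hand.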
Richard,
Very good work, I'll just say, for taking on actually using the hardware.
As for doubling the Hyperbus clock to 0.5 of sys-clock: Reading of Hyperbus will be easy to move to edge aligned timing transitions and still meet the timings, but writing to Hyperbus won't be so easy. This is because, with the clock line being emulated with a data pin, the internal Prop2's latching times are phase shifted to an earlier time with respect to the Hyperbus clock.
May have to have writes/commands at half speed of reads.
or, maybe the smart pin needs a user choice of a negative-edge half-SysCLK delay option at the pin ?
that's one D-FF and a Config Control bit ?
or a Streamer cell mode that supports DDR/DTR natively, which would allow higher bandwidths ?
Anything like that would have to be a special part of the custom pin cell. A Smartpin can't do it. If Chip can keep getting freebies on the test dies then I guess that could be up for grabs still.
Yikes, that was a bit assertive of me. I'm not that certain. I've not done any HDL myself but it seems to me falling edge clocking will not be an option.
Falling edge is easy enough to code, but it is generally avoided as timing checks are not as simple.
That's why you only use it in very local cases, such as DDR pin cells, or DDR delays.
As to if it needs to be in the custom area - good question, I'm not sure on where Chip has placed the first pin sampling/drive flip flops.
I believe the outer ring is Buffers/ESD/MUX, but no registers, but that may be wrong ?
Seairth,
I've just poked modes %00100 "pulse/cycle output" and %00101 "transition output". Both are pulse streaming output modes.
Neither of them are buffered, so not particularly great for stepper use.
%00101 counts out transitions and is not too bad at chaining successive segments. To prevent any glitch in the pulse timing, when directly waiting, it requires a minimum of 12 clocks setting for the transition period. Longer periods would be required when chaining segments via interrupts.
%00100 appears to be flawed, it has a built-in trailing bias that prohibits seamless chaining of segments. See attached snap, I've intentionally used lower frequency to highlight that the bias is based on the length of the pulse period.
I've been going over this %00100 smart pin mode. It works the way I intended it to. The design goal was to have it generate Y clock cycles, then make IN high AFTER the cycles were complete. In cases where this will get used, it will be important to know that all transitions are complete, so you can do some other signalling afterwards. Yesterday I made it work like mode %00101, but it was not very tidy because of the period reloading, and you couldn't tell when it was actually done, so I put it back the way it was, originally.
So that means the trailing bias is still there ? - this seems to be an effect whereby the previous settings, carry-forward into the next pulse frame.
Or you can also think of this as a lead-in delay effect, where the first edge is not quite where expected, as something 'old' needs to be flushed first.
Similar to if you imagine a BAUD Speed register in a UART, that updated after the Start BIT end.
The trailing bias is still there. The 16-bit timebase you set becomes the granularity of all activity. This has a hidden value, though, in that you can start up a whole bunch of pins at once, then write their pulse counts before the current timebase period, and they will all start in sync. And later, when you update each one, the whole group will remain locked to the same timebase.
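Here is a minimal Python sketch of that group behaviour, assuming every pin in the group shares one base period and was released from reset on the same clock (clock 0). The pin names, clock numbers, and `start_clock` are all hypothetical.

```python
def start_clock(period, wypin_clock):
    """First period boundary after the pulse count is written.

    Assumes the timebase has been running since clock 0.
    """
    return (wypin_clock // period + 1) * period


period = 60
# Pulse counts written at different moments inside the same timebase period.
wypin_clocks = {"pin_a": 4, "pin_b": 19, "pin_c": 41}
starts = {pin: start_clock(period, t) for pin, t in wypin_clocks.items()}
print(starts)                   # {'pin_a': 60, 'pin_b': 60, 'pin_c': 60}

# A write that misses the period still lands on the shared grid.
print(start_clock(period, 63))  # 120
```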
You can always reset a smart pin by dropping its DIR low for a clock or two.
Comments
Here's your code in action.
Yes, that would be better. That way you can still use DIR to synchronize several smart pins.
Did that comment mean a dummy reset can remove that lead-in/carry-forward effect in %00100 smart pin mode ?
You can always reset a smart pin by dropping its DIR low for a clock or two.