Is it OK to change APIN value in CTRx when it's running?

ags · 2011-08-02 22:56

I have a project where I'm shifting 16 bits of data out one pin into a shift register. That requires a "shift clock". Once the 16-bit word is shifted into the (external) shift register, I need to apply a single clock pulse to a latch clock on that external register to make all 16 bits visible on the register's output pins.

I think I have the code figured out (thanks for the help) to shift the serial data out. I think I want to just "on the fly" reconfigure the APIN of CTRx so that the shift clock stops and the latch clock starts - for just one pulse. Is that OK, or will doing so induce glitches on the CTRx APIN during the "reconfiguration" of the counter output?

Here's the PASM code that I've come up with. I have a logic analyzer on the way to check my work, but it's a week away from arriving. Thanks!

        'setup FRQA value resulting in generated clock being 1/4 the system clock frequency (20MHz w/80MHz system clock)
        movi  frqa, #%0100_0000_0
        'setup the I/O pin to output the serial clock (in this case, hard-coded to I/O pin 0)
        or    dira, #1
        movs  ctra, #0
        'setup the I/O pin to output serial data (in this case, hard-coded to I/O pin 1) 
        or    dira, #2
        movs  ctrb, #1
        'setup the I/O pin to output the latch clock (in this case, hard-coded to I/O pin 2)
        or    dira, #4                
        'configure CTRB for NCO single-ended operation - but with FRQB cleared there will be no accumulation into PHSB
        'this means that PHSB[31] will be forced onto CTRB:PINA each system clock
        movi  ctrb, #%0_00100_000     
        'all ready now - just need to load serial value into PHSB and start CTRA
shiftWord
        mov   phsb, serialOutputWord
        'configure CTRA for NCO single-ended operation (accumulate FRQA into PHSA each clock)
        'PLLDIV field is not used for this CTRx mode
        movi  ctra, #%0_00100_000               'bit 0 clocked
        shl   phsb, #1                          'bit 1 clocked
        shl   phsb, #1                          'bit 2 clocked
        shl   phsb, #1                          'bit 3 clocked
        shl   phsb, #1                          'bit 4 clocked
        shl   phsb, #1                          'bit 5 clocked
        shl   phsb, #1                          'bit 6 clocked
        shl   phsb, #1                          'bit 7 clocked
        shl   phsb, #1                          'bit 8 clocked
        shl   phsb, #1                          'bit 9 clocked
        shl   phsb, #1                          'bit 10 clocked
        shl   phsb, #1                          'bit 11 clocked
        shl   phsb, #1                          'bit 12 clocked
        shl   phsb, #1                          'bit 13 clocked
        shl   phsb, #1                          'bit 14 clocked
        shl   phsb, #1                          'bit 15 clocked
        'stop serial clock, start latch clock (in this case, hard-coded to I/O pin 2)
        movs  ctra, #2
        'stop latch clock
        mov   ctra, #0

kuroneko · 2011-08-02 23:31

ags wrote: »

I think I have the code figured out (thanks for the help) to shift the serial data out. I think I want to just "on the fly" reconfigure the APIN of CTRx so that the shift clock stops and the latch clock starts - for just one pulse. Is that OK, or will doing so induce glitches on the CTRx APIN during the "reconfiguration" of the counter output?

Well, if the counter output changes state the moment you switch pins you may get something you don't want. ATM the falling edge of your clock coincides with the pin change. While this isn't a problem for the old pin (its outa bit will keep it at 0 anyway) the new pin may see the back of the falling edge depending on which is faster, the pin mux or the counter output. If you want to play it safe you can simply move the clock forward so its falling edge happens one cycle earlier. This would require phsa to be preset with frqa.

However, this may violate the setup time for your registers (now 1 system clock cycle instead of 2). Another option would be - if the registers can deal with asymmetric clock(s) - to use a DUTY based clock (1:4 ratio).

For now I'd just see if the registers can cope with the current setup and go from there. It may well just work.
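
The first option above (presetting phsa with frqa) might look like the sketch below. It reuses the labels, pin assignments and FRQA value from the original listing. The only new line is the `mov phsa, frqa`, and its placement before the counter is started is an assumption. It is untested, so treat it as an illustration rather than a verified fix.

```pasm
shiftWord
        mov   phsb, serialOutputWord    'load the word; PHSB[31] drives the data pin as before
        mov   phsa, frqa                'preset: PHSA starts one FRQA step ahead ($4000_0000)
        movi  ctra, #%0_00100_000       'start NCO; clock edges should now land one system clock earlier
        shl   phsb, #1                  'remaining shifts unchanged from the original listing
```

Because CTRA is still disabled when phsa is written, the preset value is held until the `movi` starts the counter. Reloading phsa at the top of every word also puts the clock phase back in a known state for the next word.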

Peter Jakacki · 2011-08-02 23:32

I can't see a problem with changing the pin but on another note you mentioned you were waiting on a logic analyzer. In the old days when men were men (and sheep were scared) I would just slow the clock down and even pulse it manually so that I could observe the bus activity directly with a probe. Or else I programmed a micro to act as a clock for the slave micro and then sampled the pins in question through the host I/O and applied some fancy formatting to display the results.

There is no reason why you can't do this right now yourself, either the slow clock or manual pulsing or host controller clocking. Of course you will have to disable the PLL by setting the clock mode for external clock XINPUT.

ags · 2011-08-03 05:46

@kuroneko: I follow your comments about moving the clock forward, and the potential impact on violating setup for shifting data into the register, and also the idea of using a 1:4 duty cycle. I believe the latter will still meet the shift register timing requirements.

I asked the question because I thought - and your answer confirms - that I'm missing something about how changing the APIN value happens. You provided some nice ASCII art recently in another thread, where you broke each Prop instruction down into the 4 system clock time periods to demonstrate the timing of what I think I've coded. You used "SDeR" which I presumed coincided with Figure 4 on page 9 of the Propeller Datasheet v1.2. I took that to mean "Source fetch" "Destination fetch" "execute" and "write Results". (I understand the point of the overlapping stages for two adjacent instructions, but still am confused why there are 5 labeled stages (1-5) instead of 4.) Anyway, what I think I've coded results in the shift clock going high on the "D" stage and low on the "e" stage (that is, clock is high for "D+e" and low for "R+S"). If the new CTRx APIN value is stored in the "R" stage, doesn't that give the following "S" stage where the clock remains low before it (now on the new APIN instead of the old) would go high? In other words, for an 80MHz system clock, isn't there 12.5nSec between writing the new APIN value and the CTRx clock going high? Since data is shifted into the register on the rising clock edge, that would mean a full 50nSec between the last bit shift into the register and the latch clock, which would meet that part's timing.

I agree that I should just fire the code up and see if it "just works" - but if I'm missing something in the details of exactly how the Prop timing works during instruction execution, I would like to understand so that I may have a more complete grasp of what is going on and hopefully be able to write more robust code.

Thanks.

ags · 2011-08-03 05:55

Peter Jakacki wrote: »

I can't see a problem with changing the pin but on another note you mentioned you were waiting on a logic analyzer. In the old days when men were men (and sheep were scared) I would just slow the clock down and even pulse it manually so that I could observe the bus activity directly with a probe...

Those are some creative ways to avoid using/needing a LA. However, isn't it likely that some of the fine details of the internal Propeller instruction execution timing may be masked when changing the system clock from a 12.5nSec period to something much slower, 1K or even 1M times slower?

kuroneko · 2011-08-03 06:56

You'll notice that there is one unlabeled cycle (which brings it up to 2+4=6). The instruction fetch and decode cycles are overlapping with the previous instruction's execute and result cycles, e.g.

IdSDeR
    IdSDeR
        IdSDeR

So for normal timing consideration you can ignore the Id part.

ags wrote: »

Anyway, what I think I've coded results in the shift clock going high on the "D" stage and low on the "e" stage (that is, clock is high for "D+e" and low for "R+S"). If the new CTRx APIN value is stored in the "R" stage, doesn't that give the following "S" stage where the clock remains low before it (now on the new APIN instead of the old) would go high? In other words, for an 80MHz system clock, isn't there 12.5nSec between writing the new APIN value and the CTRx clock going high?

The way I look at instruction cycles (black box observation) is that there is an active clock edge in there somewhere which samples/latches inputs and generates outputs. So as a result of D-stage your clock goes high and two cycles later (R-stage) goes low (active clock edge forces counter update):

'         S D e R S D e R S D e R
'   phsa  0 0 0 0 1 2 3 0 1 2 3 0
'   pinA  __________/###\___/###\_
'   swap  ----------------------|-

The pin change itself also happens in R-stage. So you have effectively 2 cycles between pin change and clock going high (again). What I'm worried about is that the pin change may be faster than the counter output can actually change from high to low during R-stage, i.e. old pin is cut off but its outa setup keeps it low which is fine. The new pin gets the counter output (high) which is about to go low so you may see a spike in there (unlikely but possible). OTOH, the pin mux may be slow enough for the counter to be already low, in which case that's a non-issue. Hence my DUTY suggestion:

'         S D e R S D e R S D e R
'   phsa  0 0 0 0 1 2 3 0 1 2 3 0
'   pinA  __________/#\_____/#\___
'   swap  ----------------------|-

Here the counter has its high/low transition during e-stage (pin change - like all register writes - is still during R) which keeps them well apart from each other. Are we still on the same page?
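
As a rough sketch of the DUTY variant: only the CTRMODE field differs from the original listing (%00110, DUTY single-ended, in place of %00100). Pin numbers and the FRQA value of $4000_0000 are taken from that listing. It is untested. The cycle in which the pulse appears depends on the starting value of PHSA, so the alignment drawn above would still need checking on hardware. The clock is also high for only one system clock (12.5 ns at 80 MHz), so the register's minimum clock pulse width matters.

```pasm
        movi  ctra, #%0_00110_000       'DUTY single-ended: APIN outputs the carry of PHSA + FRQA
        shl   phsb, #1                  'data shifts exactly as in the original listing
        ' ... remaining "shl phsb, #1" instructions ...
        movs  ctra, #2                  'APIN field is written in this instruction's R stage
        mov   ctra, #0                  'counter off again after one pulse on the latch pin
```

With FRQA = $4000_0000 the accumulator carries once every four system clocks, which gives the 1:4 ratio mentioned earlier in the thread.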

ags · 2011-08-03 11:01

kuroneko wrote: »
You'll notice that there is one unlabeled cycle (which brings it up to 2+4=6). The instruction fetch and decode cycles are overlapping with the previous instruction's execute and result cycles, e.g.

IdSDeR
    IdSDeR
        IdSDeR

So for normal timing consideration you can ignore the Id part.

Yes, that's clearer. I understand that there's a pipelining that overlaps the 6 stages of execution for efficiency: instruction (pre)fetch; instruction decode; source fetch; destination fetch; instruction execution; write results. The first two (for the next instruction) overlap the last two (for the current instruction). I just didn't understand why it would be called (and labelled) a "5-stage" sequence - there are six distinct clock cycles, and for each cycle there is something happening with a register (that is, when executing the current instruction (no register access required) the next instruction is prefetched (register access required) and when decoding the (next) instruction (no register access required) the current instruction results are written (register access required)). I guess it's just terminology, but I found it confusing that the 2nd clock cycle (labelled "M+1" underneath the diagram) wasn't a numbered stage (on the top of the diagram). Why doesn't decode of the next instruction and writing the result of the current instruction count for something? Well, never mind that, it's just the way my mind works, I guess.

'         S D e R S D e R S D e R
'   phsa  0 0 0 0 1 2 3 0 1 2 3 0
'   pinA  __________/#\_____/#\___
'   swap  ----------------------|-

Here the counter has its high/low transition during e-stage (pin change - like all register writes - is still during R) which keeps them well apart from each other. Are we still on the same page?

So I do see what you are saying. I was thinking that whatever the delay from the Prop system clock edge (rising or falling, doesn't matter as long as it's consistent) to something happening, it would be consistent. If it were true that the output value on the I/O pin specified as CTRA's APIN changed very late (relative to the system clock - in the "R" stage of your diagram) and the contents of the CTRA register itself (changing the APIN field) were written very early (relative to the system clock - in the "S" stage of your diagram) then there could potentially be a glitch (on the new I/O pin being configured as the counter APIN, but not the old I/O pin previously configured as the counter APIN) as the transition of the counter-generated clock "interfered" with changing the APIN value of the CTRA register.

For this, I guess I'd either need to get a ruling from folks at Parallax themselves, or put a relatively fast logic analyzer on the pins and see for myself. Which brings me back to waiting for that LA to show up!

Thanks again for your thoughtful and thorough responses.

ags · 2011-08-05 08:23

@kuroneko: my previous post was pretty long. There are two hidden questions (which I should have made clearer) which may make sense if that longer post was on track:

1) Any idea why the "instruction pipeline stages" diagram (Propeller Datasheet v1.2, page 9, Figure 4) labels & discusses 5 stages? I can understand 6 (2 (execute and write-result) for instruction N-1, those same 2 (prefetch and decode) and 4 for instruction N) or 4 (ignore prefetch and decode) - but not 5. Am I missing anything important? That just doesn't make sense to me.

2) Is there a specific reason why you would suspect that the CTRx register (or perhaps all registers) will have different timing (when the new value is actually stored in the "R" stage of the instruction) than when the counter (or actually, the CTRx:APIN or BPIN) value would update? I assume that is a totally different circuit path than "normal" register updates, so it could be possible. Is that the basis of your concern? (Not that you have said it *will* be different, just that it could be different).

Thanks.

cgracey · 2011-08-05 10:38

The pin-select fields in the CTRx registers go straight to muxes which select the pins of interest. If you change these pin-select fields, you will just be looking at a different pin on the next clock cycle. There's no crack for anything to fall into.

kuroneko · 2011-08-05 16:37

ags wrote: »

1) Any idea why the "instruction pipeline stages" diagram (Propeller Datasheet v1.2, page 9, Figure 4) labels & discusses 5 stages? I can understand 6 (2 (execute and write-result) for instruction N-1, those same 2 (prefetch and decode) and 4 for instruction N) or 4 (ignore prefetch and decode) - but not 5. Am I missing anything important? That just doesn't make sense to me.

I wouldn't read too much into it. See them as references so the text can refer to them. The second cycle is IIRC the decoding stage which doesn't really add anything to the discussion. As I said before, for normal timing considerations you have 4 visible cycles.

ags wrote: »

2) Is there a specific reason why you would suspect ...

There is a different delay (discussed in forum thread 127653) from the counter outputs to the actual pins depending on which cog drives which pin (phsx more so than outa). Monitoring that with a waitpxx can give you differences of up to one clock cycle (@80MHz). While not directly related to the issue at hand it raised a flag. It may well be (and most likely is) that the pin mux voids any delay as it opens a different path (starting closer to the driving output) but it's a bit hard to say from looking at a black box.

So as Chip noted, nothing to worry about.
