Waitse1 problem with Cog0

macca · 2022-07-10 12:13

Hello,

Time for another issue that is probably something I don't understand...
The title is a bit of a spoiler, but can someone run the following code and tell me what the debug output is?

CON

    _CLKFREQ = 160_000_000

    REPO_PIN = 7
    SP_REPO1_MODE = %01_00001_0 | 1 << 16    ' %TT_MMMMM_0, P[12:10] != %101

DAT
                org     $000
main
                asmclk

                dirl    #REPO_PIN
                wrpin   ##SP_REPO1_MODE, #REPO_PIN
                dirh    #REPO_PIN

                setse1  #%001_000000 | REPO_PIN

                debug("-- wait --")
                waitse1
                debug("-- posted --")

                jmp     #$

The code may look like nonsense, but I'm experimenting with long repository mode and event notifications. If the setup is correct, it should wait for a long to be written to the smartpin.

evanh · 2022-07-10 12:41

You'll find that events are nominally tripped. They have to be cleared upon first config ... and you can't follow a SETSEx immediately with a POLLSEx ... so init code should be like this:

        dirl    #REPO_PIN
        wrpin   ##SP_REPO1_MODE, #REPO_PIN
        setse1  #%001_000000 | REPO_PIN
        dirh    #REPO_PIN
        pollse1
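Folding that ordering back into the original test gives something like the listing below. It is only a sketch and has not been run on hardware: the constants are copied unchanged from the first post, and the comments describe the intent of each step rather than verified behaviour.

```spin2
CON

    _CLKFREQ = 160_000_000

    REPO_PIN = 7
    SP_REPO1_MODE = %01_00001_0 | 1 << 16    ' same mode value as the original test

DAT
                org     $000
main
                asmclk

                dirl    #REPO_PIN                       ' hold the smart pin in reset while configuring
                wrpin   ##SP_REPO1_MODE, #REPO_PIN      ' select long repository mode
                setse1  #%001_000000 | REPO_PIN         ' SE1 = IN of REPO_PIN rises
                dirh    #REPO_PIN                       ' release the smart pin; also separates SETSE1 from POLLSE1
                pollse1                                 ' discard any stale SE1 flag left by earlier use

                debug("-- wait --")
                waitse1                                 ' intended to block until a long is written to the pin
                debug("-- posted --")

                jmp     #$
```

The only change from the first post is the position of SETSE1 and the added POLLSE1, so any difference in the debug output should come from the stale flag being discarded.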

macca · 2022-07-10 12:54

@evanh said:
You'll find that events are nominally tripped. They have to be cleared upon first config ... and you can't follow a SETSEx immediately with a POLLSEx ... so init code should be like this:

Oh c'mon... is that written somewhere ?

Some additions, which I would have posted after the answers: if you use event 2 (setse2 / waitse2) it works the first time, and if you run the same code in another cog it also works the first time. To me it looks like an issue with Cog0 and event 1! Maybe something to do with the loader firmware?

I'm starting to get tired of losing hours banging my head against the wall because of some note buried deep somewhere...

evanh · 2022-07-10 13:23

The docs do tell a lot but for this type of behaviour I just try the combinations to see what works. Your test case is exactly how I do that. Make it simple on the one question, then shuffle and insert as different ideas come to mind.

You'll find similar gotcha details with smartpins and streamers too.

evanh · 2022-07-10 13:26

@macca said:
Some additions, which I would have posted after the answers: if you use event 2 (setse2 / waitse2) it works the first time, and if you run the same code in another cog it also works the first time. To me it looks like an issue with Cog0 and event 1! Maybe something to do with the loader firmware?

Interesting. Then it'll be more likely prior uses of Cog0. Cog0 gets heavily used, including using events, during boot-up.

evanh · 2022-07-10 13:35

There is a note in the docs about what instructions will clear the event. It lists SETSEx separately in a somewhat cryptic "SEn is cleared when matched 'SETSEn D/#' is called.". I believe that is saying it only clears when the same mode bits are reused. Which of course means the same pin number too.

EDIT: Actually, the SEx events might be unique in this regard. The other event types, e.g. those set up with ADDCTx or SETPAT, are always cleared by their setup instruction. EDIT2: Or not. XRL, ATN and QMT also need to be cleared before use because there is no event setup instruction for those ones.
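For the events without a setup instruction, the clearing step could simply be a POLLxxx placed ahead of the first wait. A minimal sketch, assuming it sits in a cog's startup code before XRL, ATN or QMT are relied upon; it has not been tested, and whether each flag can really be stale at that point is exactly the open question in this thread.

```spin2
                pollxrl                 ' discard a stale XRL flag (streamer/LUT related event)
                pollatn                 ' discard a stale ATN flag (attention request from another cog)
                pollqmt                 ' discard a stale QMT flag (CORDIC-empty event)
```

The C/Z results are ignored here because no WC/WZ effect is specified; the instructions are used only for their flag-clearing side effect.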

macca · 2022-07-10 13:58

@evanh said:
The docs do tell a lot but for this type of behaviour I just try the combinations to see what works. Your test case is exactly how I do that. Make it simple on the one question, then shuffle and insert as different ideas come to mind.

After how much time? Three years after the P2 release, we still need to rely on trial and error to find out how things work? No, sorry, that's unacceptable. This was only a test, so it was in some way easy to find and fix (or, more correctly, work around) the behaviour, but what if I had tested it on other cogs and then, when put into the real code, it stopped working because the waitse1 was running on Cog0? Hours or days trying to fix a nonexistent bug! And I can't imagine this happening in a production environment!

You'll find similar gotcha details with smartpins and streamers too.

Yes, that's another point... I always spend hours trying to understand those cumbersome bit maps, and they never work the way I expect them to.

There is a note in the docs about what instructions will clear the event. It lists SETSEx separately in a somewhat cryptic "SEn is cleared when matched 'SETSEn D/#' is called.". I believe that is saying it only clears when the same mode bits are reused. Which of course means the same pin number too.

That's not cryptic, that's insane if true! I expect it to clear whenever I use setse, not after some lucky event. Why would someone call setse with the same parameters? If you call it multiple times, it is obvious that the parameters will change!
Now that I know, I'll always do a pollse. I still don't understand why I can't poll right after a setse... is there a setup time? Again, is it documented somewhere?

Every time I see unexpected behaviour I don't know if it is my code, a wrong setup, undocumented behaviour that needs a workaround, a chip failure (we still don't know the cause of that famous ptra behaviour in the other thread, right?), or who knows what!

Sorry, but I'm very frustrated about these things, this is not the first time and I fear it will not be the last.

Yanomani · 2022-07-10 13:59

It's a case of "ideal/plain" SR flip-flop logic, where the inputs are pre-conditioned in a way that avoids the pesky race conditions such logic can be prone to.

From the docs (pages 43/44):

"Events are tracked and can be polled, waited for, and used as interrupt sources.

Before explaining the details, consider the event-related instructions.

First are the POLLxxx instructions which simultaneously return their event-occurred flag into C and/or Z, and clear their event-occurred flag (unless it's being set again by the event sensor):"

and also:

"Next are the WAITxxx instructions, which will wait for their event-occurred flag to be set (in case it's not, already) and then clear their event-occurred flag (unless it's being set again by the event sensor), before resuming.".

The whole objective is to never "lose" any event that has occurred or is occurring, so one must ensure the flag is not already set, or still being held set, by the activating circuit.

Everything works fine, provided the event-triggering actions ("pulses") are never allowed to coincide with the event-clearing commands.

evanh · 2022-07-10 14:12

Don't spit the dummy guys. That's just the nature of concurrency. But the performance gains can be great. I just seem to like this stuff.

As soon as you break from serial sequencing you are deep in intermediate states that need focused understanding. And placing wrappers around it just bloats the code ... which is what tends to happen in most high-level handler code. Everything becomes libraries.

evanh · 2022-07-10 14:28

@evanh said:
There is a note in the docs about what instructions will clear the event. It lists SETSEx separately in a somewhat cryptic "SEn is cleared when matched 'SETSEn D/#' is called.". I believe that is saying it only clears when the same mode bits are reused. Which of course means the same pin number too.

Funnily, testing that statement proves to be correct. The following also works:

        dirl    #REPO_PIN
        wrpin   ##SP_REPO1_MODE, #REPO_PIN
        dirh    #REPO_PIN
        setse1  #%001_000000 | REPO_PIN
        setse1  #%001_000000 | REPO_PIN

macca · 2022-07-10 14:29

@Yanomani said:
It's a case of "ideal/plain" SR flip-flop logic, where the inputs are pre-conditioned in a way that avoids the pesky race conditions such logic can be prone to.

I'm not arguing about why it works the way it does. I don't know how things are implemented in the chip and I don't need (or rather want) to know. My complaint is about not documenting these things, so every time one must rely on forum wisdom, if one has enough patience to get past the unexpected behaviour (I'm sure that several people will just flag the P2 as unreliable and discard it). Is it advisable to call pollseX after each setseX? Fine, write it! setseX needs some clocks to set up? Fine, write it!

And still, my other comment shows that the behaviour is different for event 2 (and I suppose 3 and 4 as well) and if the cog is not 0, so it was just luck to have found this issue so early. Someone else may experience it in production code and never know what is happening! All those "magic nops" that sometimes pop up to fix things may just be something like that, who knows?

evanh · 2022-07-10 14:35

All Cogs will be the same hardware. The differences found will be in stale state. ie: Just like trying to use uninitialised RAM and assuming it always holds zeros. That one tends to land coders in trouble way too often.

macca · 2022-07-10 14:35

@evanh said:
Don't spit the dummy guys. That's just the nature of concurrency. But the performance gains can be great. I just seem to like this stuff.

Concurrency? Where is the concurrency in the code above? How do I know that an instruction needs time to set up or results in an indeterminate state if the documentation doesn't say so?
The fun is starting to fade away when these things occur regularly...

As soon as you break from serial sequencing you are deep in intermediate states that need focused understanding. And placing wrappers around it just bloats the code ... which is what tends to happen in most high-level handler code. Everything becomes libraries.

Not sure I understand, so... a-ha...

Yanomani · 2022-07-10 14:39

@evanh said:
Don't spit the dummy guys. That's just the nature of concurrency. But the performance gains can be great. I just seem to like this stuff.

As soon as you break from serial sequencing you are deep in intermediate states that need focused understanding. And placing wrappers around it just bloats the code ... which is what tends to happen in most high-level handler code. Everything becomes libraries.

You are among the few able to craft a bit/byte-banged HyperRAM driver that can use events while "reading" those beasts, based solely on the way RWDS acknowledges/reacts to HRCLK.

I know, it's kind of a cheat challenge, since transitions on the former depend on the edges of the clock (so they are always sequential, in essence), but just try to figure out how...

Wanna complicate it? Add a pattern identifier/counter, either for specific bytes or words in the stream, on the fly, but always ending the command at the right place: after the second byte of the specific word where the pattern was found, and/or after a certain number of "events", just "ticks".

All of it while respecting self-refresh limits. It'll be enlightening...

evanh · 2022-07-10 14:43

As for the needed separation between SETSEx and POLLSEx, yeah, looks like that one ain't documented.

Also, the full docs, with examples, aren't yet written. I think there must be other priorities putting documentation on hold. It's notable that Chip has gone back to coding more debugger stuff for the past few months.

evanh · 2022-07-10 14:53

@macca said:

@evanh said:
Don't spit the dummy guys. That's just the nature of concurrency. But the performance gains can be great. I just seem to like this stuff.

Concurrency? Where is the concurrency in the code above? How do I know that an instruction needs time to set up or results in an indeterminate state if the documentation doesn't say so?

Yes, docs still need to be better. We're still on Chip's terse version. The rewrite hasn't happened yet.

Knowing that the concurrency is there is why I test out such things as instruction spacings. The execution pipeline is a good example of the hardware concurrency. Some circumstances, like self-modified code, require space between certain instructions.

PS: I can't be specific with why the spacer is needed with SETSEx + POLLSEx. I don't know exactly how the two interact.

Yanomani · 2022-07-10 15:00

The other possible way would be a FIFO driven at up to the Sysclk rate, where events would be "tagged" (and time-stamped), so one can be sure not to "lose" any of them, up to the limits imposed by the Sysclk rate itself.

But given that the fastest instructions can be paced at up to Sysclk/2, and one always needs more than one of them just to discriminate the events and take the corresponding actions, that would be kind of a conundrum.

Any available FIFO-full flag could trip before any (or many) proper actions could be taken.

macca · 2022-07-10 15:01

@evanh said:
All Cogs will be the same hardware. The differences found will be in stale state.

Yes, and that adds another point of uncertainty to Cog0, because if the cause of the issue really is the loader firmware, Cog0 is left in a state that is different from any other cog, despite being the same hardware and despite the claim that when a cog is started several things are reset or cleared or whatever... Now how do I know whether a weird behaviour is caused by Cog0? Is a simple restart enough? Avoiding it as much as possible may be a good solution!

Also, the full docs, with examples, aren't yet written. I think there must be other priorities putting documentation on hold. It's notable that Chip has gone back to coding more debugger stuff for the past few months.

I don't usually criticize a company's priorities, because in the end it is not my money at stake. However, that nice debugger I saw the other day in the streaming event could have waited a while longer in favor of real, well-written documentation, which should be the priority of any company producing a microcontroller chip, especially three years after the release! And don't get me started on development tool support! Look at the RP2040, Parallax has a lot to learn from them!

ie: Just like trying to use uninitialised RAM and assuming it always holds zeros. That one tends to land coders in trouble way too often.
The execution pipeline is a good example of the hardware concurrency. Some circumstances, like self-modified code, require space between certain instructions.

Except that these are documented, not guessed by trial and error!
Several times I have hit weird behaviour caused by a res variable in the middle of long-defined variables. A stupid mistake caused by too much copy/paste, yes, but that is documented.

evanh · 2022-07-10 15:04

Actually, the main point I was making was more about how events work generally. They are by nature a concurrently running mechanism. That's why a prior stale state can mess up the expected sequence.

evanh · 2022-07-10 15:12

@macca said:

@evanh said:
All Cogs will be the same hardware. The differences found will be in stale state.

Yes and that adds another point of uncertainty to Cog0 because if the cause of the issue is really the loader firmware, ...

Nooo .... we know it can't be assumed to be cleared. SETSEx doesn't clear the event unless it's the same mode twice.

Also, the full docs, with examples, aren't yet written. I think there must be other priorities putting documentation on hold. It's notable that Chip has gone back to coding more debugger stuff for the past few months.

I don't usually criticize a company's priorities ...

I mentioned Chip's focus only as an indicator, not as the priority. Someone else is doing the finished docs, not Chip. Chip will assist only.

PS: Lindsay I think it might be.

macca · 2022-07-10 15:25

@evanh said:

@macca said:

@evanh said:
All Cogs will be the same hardware. The differences found will be in stale state.

Yes and that adds another point of uncertainty to Cog0 because if the cause of the issue is really the loader firmware, ...

Nooo .... we know it can't be assumed to be cleared. SETSEx doesn't clear the event unless it's the same mode twice.

No, you suppose it can't be assumed to be clear, and actually, your supposition is wrong for events 2, 3, and 4 and for any cog other than 0!
Since the documentation is not clear about that, my supposition, based on observations, is that all events are cleared at cog startup (how could it be otherwise?) and Cog0 is bugged because the loader firmware leaves it in an uncertain state (again, a supposition) rather than restarting it as one supposes (again) it should.

I mentioned Chip's focus only as an indicator, not as the priority. Someone else is doing the finished docs, not Chip. Chip will assist only.

PS: Lindsay I think it might be.

Whoever... the point is that all company resources should be focused on producing well-written documentation (and not starting now, but at least two years ago...)!

evanh · 2022-07-10 15:33

@macca said:
No, you suppose it can't be assumed to be clear, and actually, your supposition is wrong for events 2, 3, and 4 and for any cog other than 0!

It is clear in the docs that SETSEx doesn't initially clear the event flag. So it's just incidental if the flag is set or not. Depends completely on any prior use. Just like uninitialised RAM.

Whoever... the point is that all company resources should be focused on producing well-written documentation (and not starting now, but at least two years ago...)!

They're limited. As you first pointed out, they'll be putting the money where it's most needed.

evanh · 2022-07-10 15:39

A similar effect occurs with Cordic ops too. They can become out of sync if one result is not collected in order. Then you start collecting prior stale results when you are expecting new results.
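To make the Cordic comparison concrete, here is a small sketch of the in-order pattern, where every command is followed by collecting its own result before anything else is queued. The register names (fac_a, fac_b, prod_lo, prod_hi) are made up for illustration, and the listing is untested; it only shows the paired case, not the out-of-sync failure.

```spin2
DAT
                org     $000
main
                qmul    fac_a, fac_b    ' start an unsigned 32x32 multiply in the CORDIC
                getqx   prod_lo         ' collect the low long of this result
                getqy   prod_hi         ' collect the high long of this result
                jmp     #$

fac_a           long    1_000           ' hypothetical operands
fac_b           long    2_000
prod_lo         res     1               ' RES kept after all LONG data
prod_hi         res     1
```

Keeping each GETQX/GETQY next to the command that produced it is what stops a later read from picking up an earlier, uncollected result.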

macca · 2022-07-10 15:40

@evanh said:

@macca said:
No, you suppose it can't be assumed to be clear, and actually, your supposition is wrong for events 2, 3, and 4 and for any cog other than 0!

It is clear in the docs that SETSEx doesn't initially clear the event flag. So it's just incidental if the flag is set or not. Depends completely on any prior use. Just like uninitialised RAM.

I'm not talking about setse, I'm talking about cog startup! In the example above there is no prior use (not documented at least). Cog startup state should be certain, it is not, apparently, for Cog0!

Whoever... the point is that all company resources should be focused on producing well-written documentation (and not starting now, but at least two years ago...)!

They're limited. As you first pointed out, they'll be putting the money where it's most needed.

No documentation = no company uses it in products = no money from selling chips.
But yes, it is not my money at stake.

evanh · 2022-07-10 15:44

@macca said:
I'm not talking about setse, I'm talking about cog startup! In the example above there is no prior use (not documented at least). Cog startup state should be certain, it is not, apparently, for Cog0!

Really?! That's like saying I want a data dump of uninitialised RAM before I start running my program.

macca · 2022-07-10 15:56

@evanh said:

@macca said:
I'm not talking about setse, I'm talking about cog startup! In the example above there is no prior use (not documented at least). Cog startup state should be certain, it is not, apparently, for Cog0!

Really?! That's like saying I want a data dump of uninitialised RAM before I start running my program.

So explain: why does it work as expected on any other cog (or with any event other than 1, for that matter)?
If the startup state is really random, the result should be random as well... do you see that? I don't... the results for me are consistent.

Anyway, this is now a pointless discussion, nothing can be assumed, lesson learned, more code that needs to be added.

evanh · 2022-07-10 16:10

Some things are in a defined start state, others are not. That's one that isn't.

PS: Concurrent mechanisms lead to this scenario because the input of a particular step/instruction can often be prior state defined. Like what I said about the Cordic.

Alexander (Sandy) Hapgood · 2022-07-10 16:46

Is it possible that Parallax designed the P2 and now can’t figure out how to document it?

evanh · 2022-07-10 17:14

For sure it won't be easy, but there is just a lot lot more in there compared to the Prop1. The Prop2 terse docs are bigger than all the Prop1 docs combined.

Chip will likely end up making more diagrams of the hardware as well. The low-level I/O's mode-separated functional diagrams are a good example. Some parts that currently exist only as Verilog will be easier to understand as a drawing: a more complete version of the pipeline, ALU ports and hidden registers; details on the FIFO and egg-beater; a layout of the staging registers for those and for the I/O and streamers.

macca · 2022-07-10 17:53

@evanh said:
Some things are in a defined start state, others are not. That's one that isn't.

You see it as not defined, I see it as defined except for one case... point of view, I guess.

PS: Concurrent mechanisms lead to this scenario because the input of a particular step/instruction can often be prior state defined. Like what I said about the Cordic.

Again, no concurrency here. Initial state here!

evanh · 2022-07-10 18:07

@macca said:

@evanh said:
Some things are in a defined start state, others are not. That's one that isn't.

You see it as not defined, I see it as defined except for one case... point of view, I guess.

The SEx event flags' initial state is explicitly not defined, by the fact that a SETSEx used with a different mode is guaranteed not to clear the event flag. That means the flag is not defined, just like uninitialised RAM.

PS: Concurrent mechanisms lead to this scenario because the input of a particular step/instruction can often be prior state defined. Like what I said about the Cordic.

Again, no concurrency here. Initial state here!

I'm talking generally. The Cordic is a clear easy to understand example of concurrency that can still get into a jumbled sequence. The event system is more open and therefore more prone to such issues, whether by in sequence bugs or just not fully initialising it.
