Funky COGNEW behavior...

potatohead · 2014-02-23 22:32

Anyone else notice this? Ideally, I'm doing some basic thing wrong.

I was just going to take the easy path and have my graphics COG draw the fractal, then end. To do this, I capture the COGID with that instruction, and the COG stops itself once the rendering is done. Takes a coupla seconds.

So far so good.

The next step was to just fire it off on demand, using one of the buttons. The most obvious thing was to just cognew again, because there are COGS free. This simple program only uses COG 0 to draw the frame buffer, and isn't using tasks, HUBEX, etc...

When I do this, I find COG 0 gets killed off!

I've attached the program. Use F10 to run it, and press key1 to repeatedly draw the image. I've included a clear to white so it can be easily seen. This is robust. You can hold the button, click it a lot, whatever. Just works as it should, but it's using coginit to request a specific COG.

Run the other one, and COG 0 gets killed off the moment you press the button.

I've attached a cognew and a coginit version, identical, but for two changes:

'spawn fractal on cog 2
                'cognew  fstart_addr, bitmap
                 coginit fstart_addr, bitmap, #2

checkbut        getp  #29  wz
          'if_z  cognew  fstart_addr, bitmap    
          if_z  coginit fstart_addr, bitmap, #2 

                ret

The button is checked once per video frame, so a slow press would request a lot of COGS, but those requests should fail, not take out a running one, or am I doing something basic, and wrong?

ozpropdev · 2014-02-23 23:14

Hi Potatohead
I just verified your 2 programs here with the same result. COGNEW fails.
I modified my 4 cog Invaders to see what happens, and it failed.
COGNEW doesn't fail if the free cog is #3 though?
Brian

Edit: Just modified Cordic Bugz to COGNEW instead of COGINIT and it all works....weird....funky even?

potatohead · 2014-02-23 23:58

I didn't notice COG 3. I'll have to start up #2, then just ask for COG 3 doing something else, or the monitor to test here. And yeah, weird.

Given you saw it too, it's worth Chip looking over IMHO. Perhaps others have had similar behaviors and missed them due to all the new features / variables?

Still failed with COG 3 free. What I did was uncomment the "end" label in the fractal COG, causing it to jump to itself over and over and I did a coginit to have it be either cog 1 or cog 2, and I uncommented the cognew in "checkbut", meaning a button press would request one of the other COGS. It killed the video COG off right away.

Perhaps this is a leftover from when there were 5? Or just a bug?

We should simply see a fail to allocate a COG in this case, and that does not appear to be happening.

BTW: Cordic bugs is cool.

cgracey · 2014-02-24 00:05

Potatohead,

My current version is in a state of flux, so I can't do a test of my own at the moment, but it sounds like maybe COGID is always returning 0. Can you start up cog2 and have it output its COGID result to some I/O pins?

potatohead · 2014-02-24 00:11

Ahh, good thought! Yes. Will do.

ozpropdev · 2014-02-24 00:12

cgracey wrote: »

Potatohead,

My current version is in a state of flux, so I can't do a test of my own at the moment, but it sounds like maybe COGID is always returning 0. Can you start up cog2 and have it output its COGID result to some I/O pins?

Chip
I sent the COGID value to the leds and it looks OK.

cgracey · 2014-02-24 00:15

ozpropdev wrote: »

Chip
I sent the COGID value to the leds and it looks OK.

Thanks for trying that. Could you verify that 0..3 all report back properly?

It seems that COGNEW may have the problem, then?

potatohead · 2014-02-24 00:20

Looks good here too. 0, 1, 2, 3, just as it should be.

cgracey · 2014-02-24 00:36

Because COGID is used for some fuse configuration and CNT testing, it doesn't return a D value if WC is used.

So, does the problem seem to be centered around COGNEW?

ozpropdev · 2014-02-24 00:42

cgracey wrote: »

Because COGID is used for some fuse configuration and CNT testing, it doesn't return a D value if WC is used.

So, does the problem seem to be centered around COGNEW?

That appears to be the case.

cgracey · 2014-02-24 01:04

ozpropdev wrote: »

That appears to be the case.

Does COGNEW always relaunch cog0?

ozpropdev · 2014-02-24 01:06

cgracey wrote: »

Does COGNEW always relaunch cog0?

In the scenarios that fail it stops cog0.

cgracey · 2014-02-24 01:42

ozpropdev wrote: »

In the scenarios that fail it stops cog0.

Once I rehabilitate the current design, I'll do this test and find out what the problem is. Thanks for doing those experiments, Guys.

ozpropdev · 2014-02-24 02:06

Chip
Here's one of the weird scenarios I found in my Invaders test.
3 instances of the game are launched and run fine. 4 Dummy instances load ok and all 8 Cog leds are illuminated.
Notice the only cognew that works is the 3rd game?
The only other scenario that works is if I launch just 1 game with cognew.
Is a clue buried in this scenario?

			coginit	_coglet,_game1,#1
			coginit	_coglet,_game2,#2
			cognew	_coglet,_game3
			cognew	_coglet,_game4
			cognew	_coglet,_game4
			cognew	_coglet,_game4
			cognew	_coglet,_game4

Cheers
Brian

Edit: By the way I haven't been able to get Cordic Bugz to fail at all yet....Weird!

cgracey · 2014-02-24 02:48

ozpropdev wrote: »
Chip
Here's one of the weird scenarios I found in my Invaders test.
3 instances of the game are launched and run fine. 4 Dummy instances load ok and all 8 Cog leds are illuminated.
Notice the only cognew that works is the 3rd game?
The only other scenario that works is if I launch just 1 game with cognew.
Is a clue buried in this scenario?
			coginit	_coglet,_game1,#1
			coginit	_coglet,_game2,#2
			cognew	_coglet,_game3
			cognew	_coglet,_game4
			cognew	_coglet,_game4
			cognew	_coglet,_game4
			cognew	_coglet,_game4
Cheers
Brian

Edit: By the way I haven't been able to get Cordic Bugz to fail at all yet....Weird!

This could have something to do with the hub logic. I'm not sure what it is, but I suspect it will be a simple fix.

potatohead · 2014-03-25 00:00

Just tried COGNEW on the 3/20/2014 image. It's still broken as described in this thread.

Run the attached program, which works fine with the new COGRUN instruction. I tried a lot of combinations with that, rapid fire, cog numbers all over the place, and it works.

A COGNEW will kill off the main program in COG 0, just like before. Press Key 1 and it successfully calls the fractal COG once, twice, then kills off the COG 0.

Various other keypresses that result in the graphics COG being called for a screen draw fail. The expected behavior is to simply launch COGS on each button press. Multiple COGS may get called, but they just draw to the screen, stopping when they are done. Shouldn't ever impact COG 0.

I'll update to the next FPGA image tomorrow. Sorry, if this has already been fixed. Please confirm.

Sapieha · 2014-03-25 00:26

Hi potatohead.

Have you read this?

Those docs are out-of-date, per the warning at the top of the file. COGINIT is gone,
replaced by COGRUN/COGRUNX. There's also COGNEW/COGNEWX. The -X variants start the cog in hub exec mode.
There is only one parameter for code/pointer now, and that is D:

D = %aaaaaaaaaaaaaaaa_bbbbbbbbbbbbbbbb

D[15:0] points to the starting long in hub (minus the two %00 LSBs),
and becomes PTRB of the cog getting started (again, minus the two %00 LSBs).
D[31:16] becomes PTRA of the cog getting started (minus the two %00 LSBs).

So, a single 32-bit value now serves both purposes. This frees up S/# in COGRUN/COGRUNX to provide the cog number,
keeping the entire operation atomic, without needing to make self-modifying code to convey the cog#.

COGRUN D/#,S/#
COGRUNX D/#,S/#
COGNEW D/#
COGNEWX D/#

----------------- 
 Usage:    cognew($52C, rx_pin << 10 + tx_pin << 2)    'start monitor in new cog
----------------- 
        cognew    monitor_pgm
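
To make the packing above concrete, here is a minimal Python sketch of how a single 32-bit D value could be built from two long-aligned hub byte addresses. It is only a model of the description quoted above, not assembler output: the function names and the example addresses ($1000 and $2000) are made up for illustration, and reading "minus the two %00 LSBs" as a right shift by two is an assumption.

```python
def pack_cog_start_d(code_addr: int, ptra_addr: int) -> int:
    """Pack two long-aligned hub byte addresses into one 32-bit D value.

    Assumed layout, per the description in this post:
      D[15:0]  = code_addr >> 2  (start of code, becomes PTRB)
      D[31:16] = ptra_addr >> 2  (becomes PTRA)
    """
    for name, addr in (("code_addr", code_addr), ("ptra_addr", ptra_addr)):
        if addr % 4 != 0:
            raise ValueError(f"{name} must be long-aligned (multiple of 4)")
        if not 0 <= addr < (1 << 18):
            raise ValueError(f"{name} does not fit in 16 bits after dropping 2 LSBs")
    return ((ptra_addr >> 2) << 16) | (code_addr >> 2)


def unpack_cog_start_d(d: int) -> tuple[int, int]:
    """Recover (PTRB, PTRA) byte addresses by restoring the two zero LSBs."""
    ptrb = (d & 0xFFFF) << 2
    ptra = ((d >> 16) & 0xFFFF) << 2
    return ptrb, ptra


# Hypothetical addresses, chosen only for illustration.
d = pack_cog_start_d(code_addr=0x1000, ptra_addr=0x2000)
ptrb, ptra = unpack_cog_start_d(d)
```

Since both halves live in one register, anything that overwrites D after the launch loses both the code pointer and the parameter pointer at once.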

potatohead wrote: »

Just tried COGNEW on the 3/20/2014 image. It's still broken as described in this thread.

Run the attached program, which works fine with the new COGRUN instruction. I tried a lot of combinations with that, rapid fire, cog numbers all over the place, and it works.

A COGNEW will kill off the main program in COG 0, just like before. Press Key 1 and it successfully calls the fractal COG once, twice, then kills off the COG 0.

Various other keypresses that result in the graphics COG being called for a screen draw fail. The expected behavior is to simply launch COGS on each button press. Multiple COGS may get called, but they just draw to the screen, stopping when they are done. Shouldn't ever impact COG 0.

I'll update to the next FPGA image tomorrow. Sorry, if this has already been fixed. Please confirm.

ozpropdev · 2014-03-25 02:13

Hi Potatohead
The problems COGNEW had in my previous tests seem to be gone now. Where Invaders used to fail, it works fine now.
I have changed to the new format of 2 x16 bit pointers as well.

One theory that popped into my head: maybe we got caught in between releases. Maybe COGNEW was already in the new format?
I had an issue with Pnut wanting a 2nd operand on SETXFR before the 2nd operand was implemented.

cgracey · 2014-03-25 06:11

potatohead wrote: »

Just tried COGNEW on the 3/20/2014 image. It's still broken as described in this thread.

Run the attached program, which works fine with the new COGRUN instruction. I tried a lot of combinations with that, rapid fire, cog numbers all over the place, and it works.

A COGNEW will kill off the main program in COG 0, just like before. Press Key 1 and it successfully calls the fractal COG once, twice, then kills off the COG 0.

Various other keypresses that result in the graphics COG being called for a screen draw fail. The expected behavior is to simply launch COGS on each button press. Multiple COGS may get called, but they just draw to the screen, stopping when they are done. Shouldn't ever impact COG 0.

I'll update to the next FPGA image tomorrow. Sorry, if this has already been fixed. Please confirm.

I'll be out in the shop soon to look into this, but something just occurred to me... Pressing a button can result in tens of transitions. Could there be some issue about this?

potatohead · 2014-03-25 07:38

I think it might be worth looking at that.

I do have it debounced. But one can hold it down and fire off a lot of COG requests. Frankly, they all should just work, rejected or not, based on a COG being there to use.

And a running COG should not stop for sure, regardless of the incoming requests. If it is somehow glitch prone on many rapid requests, it could be trouble for some use cases.

And the first one does work, starting COG 1. The second fails, killing COG 0 every time, regardless of whether COG 1 has finished and become available. Holding the button down sees COGs 2 and 3 started, and they don't stop. That could be some artifact of this quick program test.

I may try some other tests tonight. More are needed IMHO. Maybe a good one is to have the COGs stay busy for a second or two, while various kinds of requests are made, fast, slow, etc... Maybe also have them stay busy for random times and/or wait on various things too.

P1 handled all those cases just fine. I've abused COGNEW on it in the past.

potatohead · 2014-03-25 07:54

Actually, on P1, there was no failure case for COGNEW. A COG always got loaded, code runs consistently.

We can now start a COG without a full load, and/or at some address. I have not yet checked whether this is happening, but my program could be triggering one of those cases, which would end up running the COGSTOP with the wrong COGID, and/or causing a COG to run a long time due to an uninitialized loop, or a math result instruction being run before a math operation happens, etc...

And we need to test that math result case for sure. I haven't yet, and this is a reminder for me to do that.

In short, we can't rule out program code causes on P2, like we mostly could on P1.

cgracey · 2014-03-25 12:17

I just added COGNEW/COGNEWX to the monitor, so that when the cog # is 8 on a cog start command, it uses COGNEW/COGNEWX instead of COGRUN/COGRUNX.

It seems to work fine in that context. Next, I will try potatohead's code.

cgracey · 2014-03-25 17:09

I'm using your code, potatohead, and I'm experiencing this problem where COGRUNs work, but COGNEWs kill cog0 on the second instance....

potatohead · 2014-03-25 17:12

Yes, that is exactly what I saw last night.

I won't be where the FPGA is until later tonight.

cgracey · 2014-03-25 17:40

potatohead wrote: »

Yes, that is exactly what I saw last night.

I won't be where the FPGA is until later tonight.

It just occurred to me what the problem is:

COGNEW D writes to D. It's overwriting fstart_addr with the cog number. Then, next COGNEW jumps to some low location in ROM which ends in a COGSTOP #0.

So, this should work, instead:

MOV temp,fstart_addr
COGNEW temp

MOV temp,fstart_addr
COGNEW temp

MOV temp,fstart_addr
COGNEW temp
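
The write-back described above can be illustrated with a small Python toy model. It is a sketch of the failure mode only, not a simulation of the real hub logic: the `ToyHub` class, the register names, and the $0E80 start value are invented for illustration, and the no-free-cog case is left as a plain error because this thread does not establish what the hardware returns there.

```python
class ToyHub:
    """Toy model: COGNEW D starts the lowest free cog, then writes the cog number into D."""

    def __init__(self, n_cogs: int = 8):
        self.running = [False] * n_cogs
        self.running[0] = True          # cog 0 runs the main program
        self.started = []               # list of (cog number, start value used)

    def cognew(self, regs: dict, d: str) -> int:
        start_value = regs[d]           # value the new cog is launched with
        free = [i for i, busy in enumerate(self.running) if not busy]
        if not free:
            # Placeholder only: real no-free-cog behavior is not modeled here.
            raise RuntimeError("no free cog")
        cog = free[0]
        self.running[cog] = True
        self.started.append((cog, start_value))
        regs[d] = cog                   # the write-back that clobbers D
        return cog


# Failing pattern: launch straight from fstart_addr twice.
bad_regs = {"fstart_addr": 0x0E80}      # hypothetical start value
bad_hub = ToyHub()
bad_hub.cognew(bad_regs, "fstart_addr")
bad_hub.cognew(bad_regs, "fstart_addr")  # now launches from the value 1

# Fixed pattern: copy to a scratch register before every launch.
ok_regs = {"fstart_addr": 0x0E80, "temp": 0}
ok_hub = ToyHub()
for _ in range(2):
    ok_regs["temp"] = ok_regs["fstart_addr"]
    ok_hub.cognew(ok_regs, "temp")
```

In the failing pattern the second launch uses the cog number left behind by the first, which matches the "some low location" symptom. In the fixed pattern `fstart_addr` is never the destination, so every launch sees the original value.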

potatohead · 2014-03-25 19:32

Excellent!! Happy to have it be a code problem. I was unable to run anything today.

This new behavior totally got past me.
