Strange behavior in hubexec mode

cgracey · 2015-10-29 06:02

78rpm, would you mind doing one more test on the permanent changes I made? I actually got to throw a lot of junk away and I just want to be sure I didn't break something else.

78rpm · 2015-10-29 06:04

cgracey wrote: »
78rpm wrote: »
cgracey wrote: »

I posted it above your last post, in case you missed it.

I have had a fairly good test; it appears that not only have you solved the problem, but you also get to keep Gracey Island. Well done, Chip!!!

Thank you, I enjoyed that interaction. It looks like I won't be needing that FINDBUG instruction now. Perhaps you could reallocate for Peter's jmplnk!

I have a large supply of commas and zeroes should you need some.
		alignl

' need to execute these with Parallax serial terminal, all special characters checked.
' make its window wide and about 30 lines high - easy to see when it goes wrong as
' multiline space around message:
'  "Decimal error, digit not 0 - 9 ="
' and then the individual digits are printed out on the next line with some rather
' unusual ones. It always happens in this function, which uses a calla and reta.
' very nearly 11 good ones and 5 bad

xyzzy		long	0, 0, 0, 0, 0  ' perfect with just these
     		long 	0, 0, 0, 0, 0, 0, 0, 0, 0, 0
		long	0		' error on decimal conversion
		long 	0, 0, 0, 0	' ditto on these
		long	0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
		long	0		' error again
		long    0, 0, 0 	' awful errors
		long	0		' small error
		long	0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0  ' these are good
		long	0		'	 error
'any longs after this line were added after I posted this file to the forum
		long       0, 0, 0, 0   ' errors
		long	0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
		long    0, 0, 0, 0, 0   ' errors
		long    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
		long	0, 0, 0, 0, 0	' errors but better
		long	0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0  ' ok
		long	0, 0, 0, 0
		long    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
' start back here v2
		long	0 'ok
		long	0 ' was error now ok
		long    0, 0 ' was big error
		long    0
		long    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
		long	0, 0, 0, 0, 0
		long	0, 0
' end of debugging the fpga image
Whew! What a relief. Thank you for testing that. I had to go play a game of solitaire to get past the waiting.

I will make this change permanent now and may make an interim new set of files for the four target boards.

It was my pleasure to be of assistance, and of course to finally find the cause of those annoying "Decimal error" messages! Just as well I put them in; it was purely a debug test to help me. I did find a couple of small errors walking through the chain of calls though, so some good came of it at my end, too. I bet you're glad that one is removed from the list of questionable errors?

I hope David Hein finds his mysterious errors resolved too.

Do you realise without those error messages I will be paranoid there are lurking errors in the code!

So it was the hub-exec fifo full sensor? When / if you take your car in for service and they tell you you have a problem with the hub, you will be able to put them straight.

cgracey · 2015-10-29 06:11

78rpm, here is the file with the permanent changes. Could you please give this a quick test to verify the old problems are (still) gone, and no new problem popped up?

DE0_Nano_Bare_Prop2_v2x3.zip

Yanomani · 2015-10-29 06:18

Hi Chip

If I understood it right, during hub exec, each available hub slot will be used, trying to maintain the hub fifo full?

Even if it'll not have at least 8 empty (consumed) places?

In other words, during hub exec, there will be times that a hub slot will be used, to fill less than eight positions at the fifo?

78rpm · 2015-10-29 06:22

cgracey wrote: »

78rpm, here is the file with the permanent changes. Could you please give this a quick test to verify the old problems are (still) gone, and no new problem popped up?

Just got the file, I'll be about ten minutes. If you click on the file name above the attachment it opens the file in the browser, most interesting.

cgracey · 2015-10-29 06:31

Yanomani wrote: »

Hi Chip

If I understood it right, during hub exec, each available hub slot will be used, trying to maintain the hub fifo full?

Even if it'll not have at least 8 empty (consumed) places?

In other words, during hub exec, there will be times that a hub slot will be used, to fill less than eight positions at the fifo?

It will try to maintain 16 pre-loaded longs. This will preempt RDxxxx/WRxxxx instructions, until the FIFO is full, assuming there is competition for the same slots between the FIFO and the RDxxxx/WRxxxx instructions, before the FIFO is full.

It seems like this could be optimized, but my attempt to do so, for reasons I don't understand, didn't work. It looked like if, maybe, the hub exec 'full' level was set to 11, there would have been no problem, but that still seems excessive to me, over the 8 or 9 that my brain figures is necessary.

There's something I'm not visualizing properly. However, I did thoroughly test the FIFO under every load condition I could think of, using 16 as the "full" threshold, and I determined that it was working fine. I KNOW that works, but I'm reticent to tweak it downward for hub exec, anymore. It works now, like it's supposed to.

Yanomani · 2015-10-29 06:43

Perhaps a little change in logic approach will help here.

Some form of starving algorithm.

If the hub exec mode flag is set and there are fewer than nine instructions in the FIFO, then the next (or just arriving) hub slot will be the one that reloads 8 positions of it.

This way, you could ensure a full round of available ammo, till the next check point.

Would also help freeing some more hub slots to serve the Cordic pipeline just in time.

Only a thought; I haven't yet had time to mentally iterate through all the possibilities.

78rpm · 2015-10-29 06:52

cgracey wrote: »

78rpm, here is the file with the permanent changes. Could you please give this a quick test to verify the old problems are (still) gone, and no new problem popped up?

All looking good. The spinners at the top of screen and clock were trundling away merrily. Not a single "Decimal error" message.

I have had 33 iterations of inserting one extra long - all ok.
I then went back to the state where the worst of the errors were occurring - all ok
I then inserted 1, 2 and 3 bytes, and they were all okay too. What I really need to do to verify this is to iterate on extra bytes and then repeat the exercise for words! ARG!!! I think that would be a pretty good test unless you can think of a way of implementing it.

Ah, I suppose I could make the decimal conversion function use relative addressing for the calls and then shift everything up in memory 1 byte, then 1 word, at a time? That would have the same effect. I would just need to provide stubs where the routines currently are which increase the relative addressing to the real addresses. Wandering code, thou art wondrous! And it will be another test for rdlong / wrlong!

I would also need to add a clear screen and a delay of a couple of seconds before the next test, or just press Enter, that would be better

cgracey · 2015-10-29 06:54

Yanomani wrote: »

Perhaps a little change in logic approach will help here.

Some form of starving algorithm.

If the hub exec mode flag is set and there are fewer than nine instructions in the FIFO, then the next (or just arriving) hub slot will be the one that reloads 8 positions of it.

This way, you could ensure a full round of available ammo, till the next check point.

Would also help freeing some more hub slots to serve the Cordic pipeline just in time.

Only a thought; I haven't yet had time to mentally iterate through all the possibilities.

That's actually how it works, but the full level is 16. When it has fewer than 16 longs in it, the hub FIFO will stream in enough longs, starting at the needed time slot, to get back to 16 longs. When it "stops" reading more longs, there are 5 still coming down the pipe that pile into the FIFO.
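The refill rule described above can be sketched as a toy clock-by-clock model. This is only an illustration under stated assumptions, not the Verilog: it assumes the hub can deliver one long per clock while streaming, that the cog consumes one long every other clock, and that a read request is never kept waiting for its slot. The function name `simulate_fifo` and its parameters are hypothetical.

```python
from collections import deque


def simulate_fifo(full_level=16, in_flight=5, consume_every=2, cycles=400):
    """Toy clock-by-clock model of the refill rule; not the actual Verilog."""
    level = 0                       # longs currently held in the FIFO
    pipe = deque([0] * in_flight)   # requests already issued, one entry per clock
    started = False                 # the cog starts fetching once a long has arrived
    max_level = 0
    underruns = 0

    for clock in range(cycles):
        # Keep requesting while the FIFO holds fewer than full_level longs.
        pipe.append(1 if level < full_level else 0)
        # A request issued in_flight clocks ago delivers its long now.
        level += pipe.popleft()
        max_level = max(max_level, level)

        if level > 0:
            started = True
        # Assumption: the cog takes one long every consume_every clocks.
        if started and clock % consume_every == 0:
            if level == 0:
                underruns += 1      # the cog wanted a long and none was there
            else:
                level -= 1

    return {"max_level": max_level, "underruns": underruns}


print(simulate_fifo())
```

In this model the longs still in the pipe when requests stop push the peak occupancy above `full_level`, but never beyond `full_level + in_flight`, and the FIFO never runs dry. The same holds here with `full_level=9`, so the model says nothing about why a lower threshold misbehaved on the FPGA; slot waits and competing RDxxxx/WRxxxx traffic are left out.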

The CORDIC pipeline shares a set of slots with CLKSET/COGID/COGSTOP/etc. hub operations. The hub RAM slots are a different set.

I'm falling asleep now, so I'll get some rest.

Thanks, Guys, for all your help and ideas. Things are shaping up quite well, I think. There are no more bugs that I'm aware of, at this point.

78rpm · 2015-10-29 06:55

cgracey wrote: »

It will try to maintain 16 pre-loaded longs. This will preempt RDxxxx/WRxxxx instructions, until the FIFO is full, assuming there is competition for the same slots between the FIFO and the RDxxxx/WRxxxx instructions, before the FIFO is full.

It seems like this could be optimized, but my attempt to do so, for reasons I don't understand, didn't work. It looked like if, maybe, the hub exec 'full' level was set to 11, there would have been no problem, but that still seems excessive to me, over the 8 or 9 that my brain figures is necessary.

There's something I'm not visualizing properly. However, I did thoroughly test the FIFO under every load condition I could think of, using 16 as the "full" threshold, and I determined that it was working fine. I KNOW that works, but I'm reticent to tweak it downward for hub exec, anymore. It works now, like it's supposed to.

Perhaps David Hein's simulator may throw some more light on the matter.

Yanomani · 2015-10-29 06:59

Me too, it is five o'clock here!

Good night!

cgracey · 2015-10-29 07:00

78rpm wrote: »

Perhaps David Hein's simulator may throw some more light on the matter.

Maybe it would. I don't know why 9 levels was still problematic. Anyway, 16 is what I tested the bejeebers out of, a while back, and I would see it drop to 0 and go up to 16, and never miss a data request.

cgracey · 2015-10-29 07:01

78rpm wrote: »

All looking good. The spinners at the top of screen and clock were trundling away merrily. Not a single "Decimal error" message.

I have had 33 iterations of inserting one extra long - all ok.
I then went back to the state where the worst of the errors were occurring - all ok
I then inserted 1, 2 and 3 bytes, and they were all okay too. What I really need to do to verify this is to iterate on extra bytes and then repeat the exercise for words! ARG!!! I think that would be a pretty good test unless you can think of a way of implementing it.

Ah, I suppose I could make the decimal conversion function use relative addressing for the calls and then shift everything up in memory 1 byte, then 1 word, at a time? That would have the same effect. I would just need to provide stubs where the routines currently are which increase the relative addressing to the real addresses. Wandering code, thou art wondrous! And it will be another test for rdlong / wrlong!

I would also need to add a clear screen and a delay of a couple of seconds before the next test, or just press Enter, that would be better

Okay! Thanks 78rpm. Looks like this issue is put to bed. This is the kind of bug that really worries me, because it could be hard to diagnose. Glad it's out of the way! Thanks a lot for your help.

78rpm · 2015-10-29 07:09

cgracey wrote: »

Okay! Thanks 78rpm. Looks like this issue is put to bed. This is the kind of bug that really worries me, because it could be hard to diagnose. Glad it's out of the way! Thanks a lot for your help.

It really was my pleasure. I've not had this much fun since ... last time.

Dave Hein · 2015-10-29 10:40

I just checked the forum this morning and found that there was a lot of activity during the night. I've been away from home during the past week, so I don't have access to my DE2-115. However I get home today, and I'll try the latest FPGA image this afternoon.

When Chip mentioned that he had implemented a level threshold on the streaming FIFO I also added that to spinsim. With the level threshold that he used the FIFO would never run empty since the cog executes every other cycle. I'm sure the Verilog implements this slightly different than I do, and there may be an issue in the timing.

78rpm · 2015-10-29 14:36

Dave Hein wrote: »

I just checked the forum this morning and found that there was a lot of activity during the night. I've been away from home during the past week, so I don't have access to my DE2-115. However I get home today, and I'll try the latest FPGA image this afternoon.

When Chip mentioned that he had implemented a level threshold on the streaming FIFO I also added that to spinsim. With the level threshold that he used the FIFO would never run empty since the cog executes every other cycle. I'm sure the Verilog implements this slightly different than I do, and there may be an issue in the timing.

I wasn't sure if you had a Nano, but it is a Terasic.

The code where this error is likely triggered is a calla from hub into cog, with 7 pushes to the ptra stack in hub. Upon completion, it performs the 7 popa, followed by a reta back into hub space. The next instruction back in hub land is a calla into hub space. I think the hub and FIFO were well and truly exercised, and hopefully the threshold problem has been exorcised.

I am just about to return to manually adding longs, words and bytes to push things about, to make sure this doesn't cause problems on the Nano. I was adding longs during the testing, but it needs a good exercise with the other machine access sizes. I'll cover the full two and a half hub rotations with byte, word and long inserts, measuring the span in longs, so 40 for long, 80 for word and 160 for byte. I think it will take a while, but at least it will be checked.
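The insert counts above follow from simple arithmetic, sketched here. The figure of 16 longs per hub rotation is an assumption inferred from the 40 longs quoted for 2.5 rotations, and `inserts_needed` is a hypothetical helper name.

```python
def inserts_needed(rotations=2.5, longs_per_rotation=16):
    """Number of padding items of each size needed to span the given hub rotations."""
    # Assumption: one hub rotation spans 16 longs (inferred from 40 longs = 2.5 rotations).
    longs = int(rotations * longs_per_rotation)
    # A long is two words or four bytes, so the same span needs 2x and 4x as many inserts.
    return {"long": longs, "word": 2 * longs, "byte": 4 * longs}


print(inserts_needed())  # {'long': 40, 'word': 80, 'byte': 160}
```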

cgracey · 2015-10-29 14:40

I'll post new FPGA files this morning that reflect this FIFO-level change (bug fix).

jmg · 2015-10-29 18:37

Dave Hein wrote: »

I'm sure the Verilog implements this slightly different than I do, and there may be an issue in the timing.

Maybe Chip can post the verilog test snippet and you can try to clone the operation in SpinSim ?

jmg · 2015-10-29 18:39

cgracey wrote: »

I don't know why 9 levels was still problematic. Anyway, 16 is what I tested the bejeebers out of, a while back, and I would see it drop to 0 and go up to 16, and never miss a data request.

Is there any down side to the 16 threshold ?
I guess a larger fifo is implied, any others ?

78rpm · 2015-10-29 22:13

*** EDITED POST *** See end for reason
*** EDITED POST AGAIN *** See very end for reason

cgracey wrote: »

Okay! Thanks 78rpm. Looks like this issue is put to bed. This is the kind of bug that really worries me, because it could be hard to diagnose. Glad it's out of the way! Thanks a lot for your help.

I am still running tests with displaced code, having my test harness jump back to the start after a one second delay so I can visually see any out-of-position screen fields.

I have checked inserting longs and words into the code to cover around 2.5 hub revolutions. I have only completed 4 byte inserts so far, will continue with the others as a priority. So far the results are per the design spec, everything ok.

*** EDIT *** Update on testing. I have now completed over one complete hub cycle with inserting byte definitions to push the code up in memory, so this means it is executing in hub-exec on all byte boundaries.

Number of errors so far = 0

*** EDIT AGAIN *** Testing complete on long, word and byte boundaries covering all lower 2 bits of the hub address for a little over 2.5 hub rotations.

Number of errors = 0

Which is as it should be.
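Why the byte inserts matter for covering the lower 2 bits of the hub address can be shown with a short sketch. It assumes the code starts long-aligned before any padding is inserted (the test listing begins with alignl); `low_bits_reached` is a hypothetical helper.

```python
def low_bits_reached(unit_bytes, count, start=0):
    """Distinct values of (address & 3) reached by inserting 0..count-1 units of padding."""
    return sorted({(start + n * unit_bytes) & 3 for n in range(count)})


# Insert counts taken from the test plan: 160 bytes, 80 words, 40 longs.
for name, size, count in (("byte", 1, 160), ("word", 2, 80), ("long", 4, 40)):
    print(name, low_bits_reached(size, count))
# byte [0, 1, 2, 3]
# word [0, 2]
# long [0]
```

Only the byte inserts reach the odd offsets, which is why the long and word passes alone would not have exercised every alignment.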

All future software problems will be of my own making. Now, where did I put that pad of bug request forms...

Dave Hein · 2015-10-30 00:49

I tested my code with the 10/29/15 FPGA image and it works fine now. Chip, thanks for looking into this.

cgracey · 2015-10-30 00:54

Dave Hein wrote: »

I tested my code with the 10/29/15 FPGA image and it works fine now. Chip, thanks for looking into this.

Thanks go to you guys!

When stuff acts flakey, it's a huge demotivator for those who would like to invest their energy and time in using it. We need to make the quality perfect so that it's worthy of everyone's engagement. We also need to make it fun.

78rpm · 2015-10-30 01:06

cgracey wrote: »

Thanks go to you guys!

When stuff acts flakey, it's a huge demotivator for those who would like to invest their energy and time in using it. We need to make the quality perfect so that it's worthy of everyone's engagement. We also need to make it fun.

I fully agree with your demotivation point. I was questioning myself as to whether I had an overwrite somewhere that I was blind to.

Quality perfection really is a commendable goal. One only needs to look at the errata docs published by other manufacturers to see that they seem to push to market prematurely. When their main competitors do the same, it ends in a vicious circle which serves nobody's interest, due to all the workarounds and exception tests. A very frustrating experience all round.

I think Parallax shines here with commitment to quality.
