SOLVED !!! Hub Exec Error Raises Head Again On DE0_Nano Bare Board in Prop2_FPGA_11_09_15b
78rpm
Posts: 264
*** EDITED 11th Nov for the second time ****
Problem solved, it was all my fault. Sorry everyone, especially Chip as it has likely dragged him off what he was meant to be doing today. I'll come back and add the link to comment here in a second, forgot to copy it.
This message reveals all:
forums.parallax.com/discussion/comment/1353918/#Comment_1353918
The last file in the list "78rpm_test_harness_0_0_11.spin" is the working version.
*** EDITED 11th Nov ****
Updated this post after the following comment was posted:
forums.parallax.com/discussion/comment/1353880/#Comment_1353880
A new file with a name ending in "_lut_first.spin" is attached to this post.
*** Original Post ****
I have started to encounter errors with hubexec code. The error appears identical to the one from a week or so ago, when you had to increase the threshold for refilling the hubexec fifo. It may possibly be related to the RD/WR FAST updates though. It is an effect I only see with my code: having generated a decimal value between 0 - 9, it suddenly is no longer between 0 - 9.
One additional factor is the newly inserted lut exec. I am now switching execution back and forth:
Hub -> Lut -> Hub -> Lut -> Hub -> Cog -> Hub -> Cog -> Hub
It has re-occurred since the introduction of lut exec. My RD/WR Byte/Word/Long test program was generating errors when executing in lut exec, but only with the WRLONG x,--PTRB[16] type of instruction. So, suspecting I was in error, either in synthesis, instruction placement, or post-execution register checking, I thought I'd make a new lut routine, called exec_lut_test. It may be an effect of whatever causes the errors, but when I execute WRLONG x,--PTRB[16] in lut exec, PTRB is reported as not updated.
To run it you need a DE0-Nano Bare Board (it may run on others) and Parallax Serial Terminal with the line buffer set to 8192. There are some test indicator instructions which output on OUTA to drive some LEDs (bits 0, 1, 3, 4, 6 & 7).
The output error message for lutexec ptrb not updating is on the 14th line of the output:
Error in lut test exec, returned ptr = $00000124 , difference = 0, start value = $00000124
The output error message for hubexec fifo type problem is right at the end:
Decimal error, digit not 0 - 9 =
This occurs after test 2473, just as it starts executing from lut with the first "rdbyte x, ptra" instruction. Then things get very hairy: wrong values are printed and it crashes, probably due to garbled data.
Comments
I updated that file twice yesterday, but didn't change the 'b' to a 'c', like I should have.
You should be using the DE0_Nano_Bare_Prop2_v4c.jic file. Can you confirm?
That's quite an awesome test program you made!
I had flashed the Nano because when I exit the programmer it asks me if I want to save a file of some description. I say yes and it writes it to the image directory. That file existed as it asked if I wanted to overwrite it, so I must have flashed it prior.
So you are saying COG-HUB-COG is fine in any combination, and it is just LUT exec that sometimes fails?
"send_dec" function. That calls a 32 by 32 divide and then a multiply in the cog, the return of the divide is a decimal digit between 0 - 9. It then fails in the "check_for_decimal_error" function after having got a digit 0-9. This is how the fifo refill effect appeared last time, with this error message being generated.
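For readers without the Spin file to hand, here is a rough Python model of the digit check described above. It is only a sketch under assumptions: the real "send_dec" and "check_for_decimal_error" are PASM routines in the attached file, and the divide-by-power-of-ten loop below is a guess at their general shape, not a transcription.

```python
def check_for_decimal_error(digit):
    """Raise if a supposed decimal digit falls outside 0 - 9."""
    if not 0 <= digit <= 9:
        # Message text modelled on the error quoted in the first post.
        raise ValueError(f"Decimal error, digit not 0 - 9 = {digit}")
    return digit


def send_dec(value):
    """Model of decimal output: divide to get a digit, multiply to strip it."""
    value &= 0xFFFFFFFF          # treat the input as an unsigned 32-bit long
    digits = []
    divisor = 1_000_000_000      # largest power of ten that fits in 32 bits
    while divisor > 0:
        digit = value // divisor             # the divide yields one digit
        check_for_decimal_error(digit)       # must be 0 - 9 if all is well
        value -= digit * divisor             # the multiply removes that digit
        digits.append(digit)
        divisor //= 10
    text = "".join(str(d) for d in digits).lstrip("0")
    return text or "0"
```

With intact code and data the check can never fire, which is why a "digit not 0 - 9" message points at corrupted instructions or registers rather than at the arithmetic itself.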
The new thing is lut exec now, so more switching in and out of modes. Not sure if the indexed rd/wr cog to hub from lut exec is the problem, ie all three address spaces being used?
When it is also adding lut into the mix then there is a problem, as I've just identified to Chip in the post above. I am using cog and hub address space but lut exec at the same time for RDBYTE COG_REG, PTRA. Not sure if that is a help or coincidence?
If I run your program, do I need to give it any input to get it going, or does it just run until it blows up?
RDBYTE COG_REG, PTRA
etc
and then test the value of regs, memory and ptr post execution.
So with executing that in LUT it is reading from HUB and writing to COG, ie all three memory spaces at the same time. It may be a coincidence?
To start the program you need to press ENTER on Serial Terminal and have your buffer set to 8192 lines. It also outputs to leds on OUTA 0 -7. That's all. Takes a couple of seconds to do the HUB and Cog tests first. Then the errors occur and then it crashes.
If it is not resolved it could be handy. Thank you.
Way too much haystack to find a needle in!! Must go pruning.
I think it is the cog - hub - lut switching which is the problem. I tried just a simple lut version from hub exec but that does seem to pass. I think it needs lots of switching turmoil to occur.
Very first one! It reports errors in the returned value, too.
The code is all identical as it is in a loop, the only difference is to differentiate which memory space the test is performed in and the instruction is placed in that memory space, and then later executed.
The instruction is in lut, the destination register is in cog, the source data is in hub addressed by ptra.
These are the RD/WR Byte/Word/Long instruction unit tests, which Seairth started off and I offered to do this one.
The same program, but it now tries to perform the lut tests first and fails. Program name ends in "_lut_first.spin". I'll also attach to the top post for easy reference. This just saves having 2,000 odd tests scroll up the screen.
Function "place_instruction", in hub, places the instruction in memory space depending on which test is selected, by calling a relevant function in that memory space, eg "place_inst_lut", so puts it in lut first 618 * 2 times (all the different pointer and index forms * 3 for byte, word, long * the final 2 for read and write. Then ultimately *3, for hub, cog and lut exec).
Function "execute", in hub, is called and determines the memory space under test, then calls the relevant function in that memory space, eg. "execute_lut".
I think it is at execute proper where the fault lies, or upon its return. I cannot see anything wrong with the code. The lut is loaded initially near "start" and uses labels "lut_start" and "lut_end".
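The test counts mentioned in this thread can be cross-checked with a little arithmetic. The Python below is only a sketch: the constant names are made up for illustration, and it assumes the tests run in the order hub, cog, lut, as the earlier description of the run suggests.

```python
# Figures taken from the post: 618 (pointer/index forms * byte, word, long),
# times 2 for read and write, times 3 for hub, cog and lut exec.
FORMS_TIMES_SIZES = 618
DIRECTIONS = 2
MEMORY_SPACES = ("hub", "cog", "lut")   # assumed execution order


def tests_per_space():
    """Number of tests run in one memory space."""
    return FORMS_TIMES_SIZES * DIRECTIONS


def total_tests():
    """Number of tests across all three memory spaces."""
    return tests_per_space() * len(MEMORY_SPACES)


def space_for_test(test_number):
    """Map a 1-based test number to the memory space it executes in."""
    if not 1 <= test_number <= total_tests():
        raise ValueError("test number out of range")
    return MEMORY_SPACES[(test_number - 1) // tests_per_space()]
```

Under those assumptions each space gets 1236 tests, so test 2473 is the very first lut exec test, which lines up with where the errors were reported to start.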
Oops, my error
I had a dream about my lut program problem, and I came down and looked, and lo.
1. I had commented out the org $200 and not noticed.
2. I was addressing a lut location in the same manner as a cog location. A definite No-No if ever there was one! It wasn't flagged as an error due to the org commenting out.
3. Due to this, the jump to lut could be somewhere in cog exec space, however the instruction was written to lut by a wrlut using a default NOP as the address for access!
4. Consequently mayhem ensued.
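A toy Python model of points 1 to 3, for anyone puzzling over how a commented-out org causes this. It is a sketch only: "assign_labels", "is_lut_address" and the cog origin $040 are hypothetical, and the only figure taken from the post is the lut base of $200.

```python
LUT_BASE = 0x200   # from the 'org $200' mentioned in the post


def assign_labels(origin, names):
    """Give each label a consecutive long address starting at origin."""
    return {name: origin + i for i, name in enumerate(names)}


def is_lut_address(addr):
    """True if addr lies at or above the lut base (upper bound not modelled)."""
    return addr >= LUT_BASE


labels = ("lut_start", "execute_lut", "lut_end")

# With 'org $200' in place, the labels assemble to lut addresses.
with_org = assign_labels(LUT_BASE, labels)

# With the org commented out, the labels simply continue from wherever the
# cog code left off (hypothetical origin $040), so a jump to them lands in
# cog exec space while the instruction itself was written into lut.
without_org = assign_labels(0x040, labels)
```

In this model every label in with_org passes is_lut_address and none in without_org does, which is the mismatch between where the jump goes and where the instruction lives.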
I offer my deepest and sincere apology for having sent you and others on a wild goose chase. If I could grovel on the floor in front of you I think that would be fitting. I shall now hide until 2030 when the Propeller Plasma Drive is due for testing and hope everyone will have forgotten!
The good news is that the lut exec now works correctly and 3708 tests now complete without error!
However, the bad news is I haven't finished the tests yet, as I realise I have left off the immediate WRBYTE #0,ptra, etc and also the 0 <= a <= $ff hub address. What is the syntax for the very low hub memory access? Is it WRBYTE #$0, ptra? ie the $ for hex address?
Attached is a working file, if you run this and then go to the top of the screen you will see an "Easter Egg".
I will update the top of thread with "SOLVED" in a second.
Once again "Sorry".
I looked at your code and saw that commented-out 'org $200' and didn't know what to make of it, so I quit looking, figuring I'd try again later.
Now, what about the low hub memory addressing mentioned in the same message?
If you want to write a low immediate address, you'd do 'WRBYTE D/#,#$00.. $FF'
Of course, hub address would then be low. I am not able to see the wood for the trees at times. Thank you.
Over the course of the past five minutes I have given myself a virtual 'beating' by pressing* my nose close to the fpga on the Nano. As the hub egg-beater clocked round I received a slight biff at each step.
* DO NOT remove the perspex 'machine guard' from the fpga board, unless you have adult supervision.
I think I will have to start compiling a compendium of "78 Common Coding Mistakes on the Propeller 2", to be included in the documentation!
Seems it is not going to be that uncommon in the future, with many users.
The test code, the errors, the compendium or all of it!?
I am putting together a little tool for macro processing; I think it can do a very unoptimised C too. I was thinking of experimenting with those thoughts in there. It is a preprocessor tool for PNut.
I think calling it HELP is appropriate in my case: It would also allow you to easily do loops using C style constructs, ie make the code less verbose and reduce coding and run-time errors.
Sounds useful. Will this allow GAS code like