Weird loop cycle



nicolad76
03-21-2009, 01:50 AM
Hello,
I have run into the following problem.

I have a loop:



repeat c from 0 to 80
  eink.RAM_TestW(c,1)
  repeat until eink.IsDone

RAM_TestW(c,1) is:



PUB RAM_TESTW (addr, data)
  address := addr
  memData := data
  command := RAM_TESTWRITE

which triggers a cog to execute:


RAM_TestWr
mov temp, par
add temp, #4
rdlong m_Address, temp ' get the HUB address of the image
add temp, #4
rdlong m_image_size, temp
add temp, #4
mov m_memData, temp
mov m_RAM_Address, m_address
rdlong m_DataByte, m_memData
call #RAM_SetAddress
call #Ram_Write
RAM_TestWr_ret ret

Full code is in attachment. Main file is pview.spin

The problem is that when I change the upper bound of the loop, everything stops working.

When the loop runs from 0 to 7, everything works; when it goes above 7, something goes wrong.

I found that when running the loop from 0 to 8, for example, the line that causes the trouble is

mov m_RAM_Address, m_address

I do not understand why.
The program is a test that should write some values (in this case always 1) into a memory, which I then read back as a bitmap and send to the display. The routine that shows the image works, so I can see that, depending on the upper bound of the loop, the display shows either a standard pattern (%0000001) or nothing at all.
By commenting out one line at a time (!?!?!? I am crazy... or desperate) I found that when I comment out the line above, the display shows the standard pattern for any upper bound of the loop.

I hope I have been clear enough, because... I really need help here!!! :)

Thanks to all!!!!
Nicola

nicolad76
03-21-2009, 05:13 AM
bug found!!!!!

I was not masking the data before sending it out, so I was altering the values of other pins.
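
A minimal sketch of the kind of masking involved, assuming (this is not confirmed by the posted code) that the 8 data lines sit on P16..P23. The names write_data_pins and data_mask are made up for illustration; only m_DataByte comes from the code above.

```spin
' Assumption: data bus on P16..P23. Adjust data_mask and the shift to the real wiring.
write_data_pins
                    and     m_DataByte, #$FF          ' keep only the low 8 bits of the value
                    shl     m_DataByte, #16           ' align them with the assumed data pins
                    andn    outa, data_mask           ' clear only the data pins
                    or      outa, m_DataByte          ' set the new data bits, all other pins keep their state
write_data_pins_ret ret

data_mask           long    $00FF_0000                ' hypothetical mask for P16..P23
```

Without the first and, any stray bits above bit 7 would be shifted onto pins outside the data bus. Note that m_DataByte is left shifted afterwards, and that the data pins are briefly all low between the andn and the or.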

MagIO2
03-21-2009, 06:15 AM
Wow ... that's disadvantageous coding ... I'm sorry that I have to say so.

                    call    #init
                    call    #Init_RAM
OK, that's not ugly; I suppose it's done to make the beginning of the code look more pleasant. But you waste 4 longs on it: 2 for the calls and 2 for the returns. Those might be valuable in case the code gets bigger.

wait_for_command
                    mov     timer,param_idle_wait         'init timer
                    add     timer,cnt
wait_for_command_loop
                    rdlong  temp2,m_command           wz  'check for a command
                    add     timer,#40                     'add max clocks this loop may take to timer
        if_z        waitcnt timer,param_idle_wait         'if no command, sleep before checking again
        if_z        jmp     #wait_for_command_loop

param_idle_wait is zero and nowhere changed. So, what is it good for?

                    mov     timer, cnt
                    add     timer, param_idle_wait
wait_for_command_loop
                    rdlong  temp2, m_command          wz  ' wz is needed so that if_z below sees the result
        if_z        waitcnt timer, param_idle_wait        ' this already adds idle wait to the timer value
        if_z        jmp     #wait_for_command_loop

                    '---- Run Command ----

                    cmp     temp2,#CMD_INIT           wz
        if_e        call    #init_pvi
        if_e        jmp     #command_done
        ...
This is also a big waste of longs and of runtime. Have you ever heard of jumping tables? Your commands are enumerated, so you only have to add the address of the jumping table to the command and jump to that address. There are fast versions (the jumping table is simply a list of "jmp #adr" instructions) and RAM-optimized versions (an array of bytes, with which half of the COG RAM can be reached).

invert
                    mov     m_DataByte, #0
                    mov     m_DataByte, m_NegativePicture
                    call    #writebyteCMD
invert_ret          ret
What's the first mov good for? m_DataByte is immediately overwritten by the next move.

Some subroutines do exactly the same as others except for one instruction. Please see my thread "Why JMP JMP JMP makes sense". You can simply replace this one instruction and use the same sequence for all subroutines.

You use par + 4 and par + 8 repeatedly, so you should calculate these values once in your init and use the precalculated values everywhere. That saves some more longs and runtime.






nicolad76
03-21-2009, 09:26 AM
Hi MagIO2,
thanks for your analysis.... this code comes out of tests I am doing.... the final version will be cleaner, of course.
So far I just need to learn how to work with the RAM I am using and the display...
The mov of #0 was an alternative to masking the value of m_NegativePicture before the shr #16 and sending it to OUTA. Now I am masking, so I will not need it.

I do not know much about "jumping tables" but I'll take a look at them....
This is what I like about this forum.... even a newbie like me can learn a lot!!!!! :)
Thanks

MagIO2
03-22-2009, 03:57 AM
Here is an example of a jumping table. It's a version with no optimisation, but it should show the concept. I used this for an IR driver that I am currently trying to implement. I do pretty much the same as you: checking a HUB buffer for a command, which has to be greater than 0. The jump to ir_wait10 simply waits for some ms before checking again by jumping back to ir_com_loop.
You simply add the address of the jumping-table list to the command. Storing the result in the source field of ir_get lets that instruction read the address contained in this position of the table. The NOP is needed because the COG has a pipeline: the instruction right after the movs has already been fetched, so the modified ir_get must not follow it directly.

Of course there are different ways to improve this code. E.g. it would be enough to use words to store the addresses. Even bytes would work if the functions to be called stay in the lower or upper half of the COG RAM (you can find an example of that in the sources of the SPIN interpreter). This of course saves memory.
Storing the jump instructions in the table instead of the addresses would allow faster execution. You only have to jump to ir_cmd after the add. No movs, no nop, no ir_get.



ir_com_loop             xor     outa, #1                ' debug
                        rdlong  ir_cmd, ir_com      wz  ' copy the command given from outside this COG
        if_z            jmp     #ir_wait10              ' idle for a while
                        add     ir_cmd, #ir_cmd         ' command number -> COG address of its table entry
                        movs    ir_get, ir_cmd          ' patch that address into the source field of ir_get
                        nop                             ' give the pipeline time to see the patched instruction
ir_get                  mov     ir_cmd, ir_cmd          ' load the routine address from the table entry
                        jmp     ir_cmd                  ' indirect jump to that routine
ir_trim
' here comes the trim-code
ir_read
' here comes the read-code
ir_write
' here comes the write-code

ir_cmd        long      0, ir_trim, ir_read, ir_write

MagIO2
03-22-2009, 04:48 AM
Here is the version which is shorter in the loop.

ir_com_loop             xor     outa, #1                ' debug
                        rdlong  ir_cmd, ir_com      wz  ' copy the command given from outside this COG
        if_z            jmp     #ir_wait10              ' idle for a while

                        add     ir_cmd, #ir_cmd
                        jmp     ir_cmd

The jump-table now reads like that:

ir_cmd                  long    0
                        jmp     #ir_trim
                        jmp     #ir_read
                        jmp     #ir_write
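
One thing the short version relies on is that the command read from the HUB is always a valid table index. A hedged sketch of a range check, assuming the table has exactly the four entries shown above (index 0 plus three commands) and that unknown commands should simply be treated like "no command"; that handling is an assumption, not something taken from the original driver:

```spin
ir_com_loop             xor     outa, #1                ' debug
                        rdlong  ir_cmd, ir_com      wz  ' copy the command given from outside this COG
        if_z            jmp     #ir_wait10              ' idle for a while
                        cmp     ir_cmd, #4          wc  ' C is set if ir_cmd < 4 (unsigned compare)
        if_nc           jmp     #ir_wait10              ' assumption: ignore commands outside the table

                        add     ir_cmd, #ir_cmd         ' command number -> COG address of its table entry
                        jmp     ir_cmd                  ' indirect jump into the table of jmp instructions
```

This costs two extra longs in the loop. Without it, a stray value in the HUB buffer would make the COG jump to whatever happens to follow the table.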

MagIO2
03-22-2009, 05:43 AM
This Propeller is sooo cool!

I mentioned in my first reply that it makes sense to pre-calculate par+4 and par+8 if these are used repeatedly to transfer results from COG to HUB RAM. Now I have reworked a program of mine to do that as well, and this is the result:

dat
        org 0
IR_commander
ir_sense                mov     ir_com1, par            ' copy parameter which holds the address of communication buffer
ir_0_max                add     ir_com2, par
ir_1_max                add     ir_com3, par
ir_0_min                add     ir_com4, par
ir_1_min                add     ir_com5, par

.....

ir_com1       long      0
ir_com2       long      4
ir_com3       long      8
ir_com4       long      12
ir_com5       long      16

I like that! Normal microcontroller programming logic would tell you: move par to ir_com1, ir_com2, ir_com3 and ir_com4, and then add 4 to ir_com2, 8 to ir_com3 and so on. But in the COG, variables are always initialised when the code is loaded into the COG. So initialising them with the right values helps to save code. And now the best part: this initialisation code is only needed once. That's why you find labels on each of these lines. This RAM is simply reused later in the code.

Somewhere in the code you only have to do the following for writing the result into HUB memory:

                        wrlong  ir_1_max, ir_com2
                        wrlong  ir_1_min, ir_com3
                        wrlong  ir_0_max, ir_com4
                        wrlong  ir_0_min, ir_com5
......
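
For completeness, a sketch of what the Spin side might look like, since the offsets 0 to 16 only make sense relative to whatever address is handed to the COG in PAR. The names ir_buffer and start_ir are hypothetical; cognew is the standard Spin call for launching assembly code in a new COG.

```spin
VAR
  long  ir_buffer[5]                  ' hypothetical buffer: 5 longs at offsets 0, 4, 8, 12 and 16 from PAR

PUB start_ir
  ' cognew passes the address of ir_buffer to the new COG, where it shows up in PAR
  return cognew(@IR_commander, @ir_buffer)
```

The Spin code would then write the command into ir_buffer[0] and read the results from ir_buffer[1] to ir_buffer[4].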

Boah .... one idea after the other. While writing this post I got the following:
All subroutines in the COG should return the same bunch of variables (even if not all are needed). Then we would only need one block of wrlong instructions. And this block can be self-modified to contain the right addresses:

IR_commander is the same as above, but the long section and the wrlong section are merged:

ir_com1                 wrlong  ir_1_max, 4
ir_com2                 wrlong  ir_1_min, 8
ir_com3                 wrlong  ir_0_max, 12
ir_com4                 wrlong  ir_0_min, 16
                        ret

Note that the source field of a wrlong instruction is only 9 bits wide. Adding par to these instructions therefore only gives the right result if the final HUB address is below 512 and the value is written as a literal (with #); as written, without #, the source field names a COG register, and a larger par would carry into the destination field.


You are currently watching me discover the Propeller ... sorry for that ... if it's too annoying, please tell me and I'll stop posting all my findings here ;o)

nicolad76
03-22-2009, 09:00 AM
I would say you should keep doing it. This is a good lesson for me.
Thanks