Addresses in asm & spin
BTX
Posts: 674
Hi all.
I have an issue, and I can't find the error in it.
Here is the code (I've left in only the important pieces).
Everything works fine until I uncomment the start line for the fourth cog; after that, the data comes out wrong for the last three cogs. Why?
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Regards.
Alberto.
''This is the main code.
CON
  _clkmode = xtal1 + pll16x
  _xinfreq = 5_000_000

var
  byte data1[128]
  byte data2[128]
  byte data3[128]
  byte data4[128]

OBJ
  cog1 : "file1"
  cog2 : "file2"
  cog3 : "file3"
  cog4 : "file4"

PUB Start | i,j
  cog1.start(@data1)
  cog2.start(@data1)
  cog3.start(@data1)
' cog4.start(@data1)    All cogs works fine until you uncomment this line, after that, data fails in cogs 2, 3, and 4.

'***************************************************************************
con
  begin = 1
{{This is the file1 driver code. }}
var
  long cog

PUB Start(Adrr) : Success
  Stop
  Success := (Cog := cognew(@Process, Adrr) + 1)

PUB Stop
{{Stop toggling process, if any.}}
  if Cog
    cogstop(Cog~ - 1)

dat
              org
Process       mov     Init,par
              add     Init,#0
Init          res     1

'***************************************************************************
con
  begin = 1
{{This is the file2 driver code. }}
var
  long cog

PUB Start(Adrr) : Success
  Stop
  Success := (Cog := cognew(@Process, Adrr) + 1)

PUB Stop
{{Stop toggling process, if any.}}
  if Cog
    cogstop(Cog~ - 1)

dat
              org
Process       mov     Init,par
              add     Init,#128
Init          res     1

'***************************************************************************
con
  begin = 1
{{This is the file3 driver code. }}
var
  long cog

PUB Start(Adrr) : Success
  Stop
  Success := (Cog := cognew(@Process, Adrr) + 1)

PUB Stop
{{Stop toggling process, if any.}}
  if Cog
    cogstop(Cog~ - 1)

dat
              org
Process       mov     Init,par
              add     Init,#256
Init          res     1

'***************************************************************************
con
  begin = 1
{{This is the file4 driver code. }}
var
  long cog

PUB Start(Adrr) : Success
  Stop
  Success := (Cog := cognew(@Process, Adrr) + 1)

PUB Stop
{{Stop toggling process, if any.}}
  if Cog
    cogstop(Cog~ - 1)

dat
              org
Process       mov     Init,par
              add     Init,#384
Init          res     1
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Regards.
Alberto.
Comments
I think we need a little more info. Can you also provide the missing Process code?
-Phil
All cogs start with @data1. Shouldn't it be this?
cog1.start(@data1)
cog2.start(@data2)
cog3.start(@data3)
cog4.start(@data4)
@Phil, I've posted the real code here.
@KIH, yes, all of them begin with the same data pointer, and then I increment it in each cog.
There is a reason I need to do it this way. Otherwise I would start each cog at its own address; that works, but it will not be useful to me once I have the final code.
Thanks for your help!
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Regards.
Alberto.
byte data[128]
So all cogs should start at the same pointer, and then each one corrects that pointer by its own multiple of 128 bytes, but the problem continues.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Regards.
Alberto.
You appear to be synchronizing four cogs, which share the same I/O pins. However, after synchronization, each cog invokes a rdlong in a looping structure, which can cause loss of sync. I don't know if this is the problem, but it's where I'd start.
-Phil
You're correct, that is what I'm doing, but when I used different pointers for each cog, I had no problems and everything worked fine.
Now I only need to have all the data in a single vector. Do you still think it could be a sync problem caused by the rdlong in th
I just don't know. There's a lot of code there to decipher. As a test, you could put your rdlong ahead of the synchronization and replace it in the loop with a mov. Your program wouldn't work with changing data, but at least you might be able to tease out whether loss of sync is the culprit. If that's the problem, you could use a spare pin, which one cog would toggle and the others wait for to resync after each rdlong.
-Phil
As you suggested, I'm building a demo without any synchronization that is simpler to test. After I see the results, I'll ask again.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Regards.
Alberto.
Here's a very simple test that I made for the demo board, without synchronizing the cogs.
To my surprise, I'm having the same problems as before, and I can't work out what is happening. Your help in solving this is welcome.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Regards.
Alberto.
Where is the error, or what am I doing wrong? Parallax people, please.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Regards.
Alberto.
I will try your suggestion with my real code, and then I'll ask again if I have any doubts.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Regards.
Alberto.
I'm looking into this and will respond as soon as I have something conclusive to say. So far, it doesn't make sense to me why there'd be a problem at 1024 longs.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
--Jeff Martin
· Sr. Software Engineer
· Parallax, Inc.
BTX, your 4th Cog was overwriting the variable space that you allocated for your DataOut.
In testcog4, changing...
...in your original code to read...
...will fix the issue you were experiencing.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Beau Schwabe
IC Layout Engineer
Parallax, Inc.
Post Edited (Beau Schwabe (Parallax)) : 5/21/2009 5:44:49 AM GMT
The reason 1024 seemed to be some kind of boundary is because it is. :) Your Test_Demo code declared DataIn and DataOut as being 128 longs each (that's 1024 bytes total)... incrementing the Init pointer by 4*256 puts it one long past the last long of DataOut, and into some other object's variable space (TV_Text's, I believe). Bumping the size of each array up by just one long causes the 4*256 adjustment to work properly (of course, at that point, you'd have to also make it 4*257 if you're expecting data in element 128 of DataOut).
It's very easy to get mixed up when doing this kind of stuff... heck, I was lost for a while too when I was seeing the results you indicated, then I took a mental step back to see the whole picture and it jumped out at me! :)
I still do not know what is going on with the original code, however, I suspect it is a memory overwrite issue or synchronization-related one. I'm looking into that now.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
--Jeff Martin
· Sr. Software Engineer
· Parallax, Inc.
I would say that you beat me to it, but I think that we were thinking in parallel. See my post just before yours.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Beau Schwabe
IC Layout Engineer
Parallax, Inc.
The problem you're experiencing is a sequence-of-events thing, related to the synchronization error that was mentioned as a possibility. I'm glad you posted the actual code; otherwise, I wouldn't have been able to find the problem.
Solution:
In short, you're initializing the memory used by the assembly cogs after you start the assembly cogs. To fix it, move your bytefill instructions before the pwmx.start instructions.
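A minimal sketch of that ordering, under some assumptions: only bytefill, FrameData_R1, and the pwmx.start naming come from this thread. The file name "pwm_driver", the 128-byte size, the fill value 0, and the single-address argument to start are placeholders, not the actual driver's interface.

```spin
CON
  _clkmode = xtal1 + pll16x
  _xinfreq = 5_000_000

  FRAME_BYTES = 128                     ' assumed buffer size

VAR
  byte FrameData_R1[FRAME_BYTES]

OBJ
  pwm1 : "pwm_driver"                   ' placeholder file name for the assembly driver

PUB Main
  ' Initialize the hub memory first, while no other cog is reading it...
  bytefill(@FrameData_R1, 0, FRAME_BYTES)
  ' ...and only then launch the assembly cog that will use that memory.
  pwm1.start(@FrameData_R1)
```

With the fill done first, the result no longer depends on how long the new cog takes to load.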
The details of why it worked with 1, 2, or 3 new cogs but not 4:
In your code, you are launching assembly code into a cog (or cogs) and then performing a bytefill on the data that the assembly-based cog will use. Like this:
When an assembly program is launched into a cog, the Spin cog executing the cognew command starts the state machine that loads the assembly code into the new cog and then returns to execute another Spin instruction. It takes about 8K cycles for the assembly cog to load up 512 longs of assembly code (and data and whatever else follows the assembly image), plus a little more to clear out the special purpose registers and start the cog's clock. During this time (the approximate 8K cycles), the Spin cog is executing another instruction.
So, in the example above, the bytefill takes 4880 cycles to write 1 to 128 bytes of FrameData_R1. Thus, after starting the assembly cog loading process, the bytefill could be fully executed before the new assembly cog has finished loading up.
Launching two more instances of the object, like follows:
doesn't end up causing a problem because the Spin cog reaches the first bytefill line 6,816 cycles after starting the first assembly cog, so it's able to stay just ahead of each of the three new assembly cogs as they start up.
BUT... adding a fourth to the mix:
causes major problems because it takes the Spin cog 10,224 cycles to get to the first bytefill instruction after it starts the first assembly cog... thus, the first assembly cog starts reading its FrameData memory before it's been cleared by the Spin cog.
Want to know how I figured out the timing?
I checked the System Counter (cnt) before and after certain lines of code and subtracted the difference. For example:
First I added the Parallax Serial Terminal object to your code, started it, and started the actual Parallax Serial Terminal software (comes with Propeller Tool v1.2.6) and configured it for the right serial port and baud rate.
Then I did this:
to determine the "overhead" of the i := cnt to j := cnt lines. The number I got back in the Parallax Serial Terminal software was 368, meaning it takes 368 clock cycles of overhead for my timing code (the i := cnt line to the j := cnt line).
So, afterwards, I placed the i := cnt line just after the first pwm1.start... line, then placed the following just before the first bytefill... line:
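The measurement described above can be sketched roughly as follows. This is an illustration, not the exact code used: the baud rate, the two-second pause, the Buffer array, and timing a bytefill as the code under test are all assumptions. The Start, Dec, and NewLine calls are methods of the Parallax Serial Terminal object.

```spin
CON
  _clkmode = xtal1 + pll16x
  _xinfreq = 5_000_000

VAR
  byte Buffer[128]                      ' stand-in data for the code being timed

OBJ
  pst : "Parallax Serial Terminal"

PUB Main | i, j, overhead
  pst.Start(115_200)                    ' assumed baud rate; match the terminal software
  waitcnt(clkfreq * 2 + cnt)            ' time to open the terminal on the PC

  ' Measure the cost of the two timing lines by themselves.
  i := cnt
  j := cnt
  overhead := j - i                     ' reported as 368 in the post above
  pst.Dec(overhead)
  pst.NewLine

  ' Time the code of interest and subtract that overhead.
  i := cnt
  bytefill(@Buffer, 0, 128)             ' replace with the lines you want to time
  j := cnt
  pst.Dec(j - i - overhead)
  pst.NewLine
```

In the real program, the i := cnt and j := cnt lines go around the code in question rather than around a stand-in bytefill.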
Hope this all made sense.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
--Jeff Martin
· Sr. Software Engineer
· Parallax, Inc.
Post Edited (Jeff Martin (Parallax)) : 5/21/2009 5:49:50 AM GMT
See my post just above this one for my diagnosis based on the code "W_ASMdriver - Archive [Date 2009.05.14 Time 15.02].zip" that BTX (Alberto) attached to his second post.
Please correct me if I'm missing something, but I'm pretty sure I found the problem now.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
--Jeff Martin
· Sr. Software Engineer
· Parallax, Inc.
In Test_Demo.spin, DataIn and DataOut are each allocated 128 longs (512 bytes), so together they make up 1 KB of data.
For testcog{1..4}.spin, the variable NewPtr has a fixed value of 512 and is basically the byte offset between DataIn and DataOut.
A second variable called adition gets modified differently for each testcog{1..4}.spin and is the byte offset from the very first address of DataIn[0].
For testcog1.spin, the value of adition is 0, since this is where it needs to be to properly redirect DataIn[0] to the corresponding DataOut[0] address.
For testcog2.spin, the value of adition is 128, since this is where it needs to be to properly redirect DataIn[32] to the corresponding DataOut[32] address.
For testcog3.spin, the value of adition is 256, since this is where it needs to be to properly redirect DataIn[64] to the corresponding DataOut[64] address.
For testcog4.spin, the value of adition is 512 and should be 384 (256+128=384) instead, to properly redirect DataIn[96] to the corresponding DataOut[96] address.
With a value of 512 for adition in testcog4.spin, it actually writes to the long variable just after the allocation of DataOut[127], so whatever data is there is being overwritten, causing the problem.
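As an illustration of how the two offsets combine, here is a hedged sketch of what a testcog4-style driver might do. NewPtr and adition are the names used above; Entry, Loop, Value, InPtr, OutPtr, and the single-long copy loop are made up for the example and are not taken from the attached code. It assumes DataOut is declared immediately after DataIn.

```spin
PUB Start(DataInAddr)
  cognew(@Entry, DataInAddr)            ' par will hold @DataIn

DAT
              org     0
Entry         mov     InPtr, par        ' start of DataIn
              add     InPtr, adition    ' this cog's slice of DataIn
              mov     OutPtr, InPtr
              add     OutPtr, NewPtr    ' same slice, 512 bytes further on, in DataOut
Loop          rdlong  Value, InPtr      ' copy the first long of the slice
              wrlong  Value, OutPtr
              jmp     #Loop

NewPtr        long    512               ' byte distance from DataIn to DataOut (128 longs * 4)
adition       long    384               ' testcog4: 96 longs * 4 bytes; must stay below 512
Value         res     1
InPtr         res     1
OutPtr        res     1
```

With adition at 512 instead of 384, OutPtr ends up at @DataIn + 1024, which is the first long after DataOut[127].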
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Beau Schwabe
IC Layout Engineer
Parallax, Inc.
Post Edited (Beau Schwabe (Parallax)) : 5/21/2009 6:28:41 AM GMT
@Beau
You're right; in my last posted demo I was wrong to use 512 instead of 384. That was a pretty simple and stupid mistake on my part, sorry.
I'll add two more cogs, which is my target, and try again (being careful when I write the addresses).
@Jeff
I understood what you said about the timing problems when starting the cogs in the first code I posted in this thread.
But how do I avoid that, or how do I find the solution in that case?
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Regards.
Alberto.
This won't work when adding two more cogs.
The adition values for the different cogs are:
Testcog1 value of adition := 0
Testcog2 value of adition := 128
Testcog3 value of adition := 256
Testcog4 value of adition := 384
Testcog5 value of adition := 512
Testcog6 value of adition := 640
If I can't increment by 512 or more, how do I pass all the data to the six cogs?
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Regards.
Alberto.
Like this:
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
--Jeff Martin
· Sr. Software Engineer
· Parallax, Inc.
Remember that the testcog adition value is a number of bytes, while your DataIn and DataOut are arrays of longs, so...
Testcog5 would have the same problem that Testcog4 did before you changed the value from 512 to 384.
To fix this...
1) In Test_Demo.spin, you need to increase the reserved variable space for DataIn and DataOut from 128 to 192 longs and adjust the longfills and the repeat loop accordingly.
2) In each testcog{1..6}, you need to change NewPtr to 768 instead of 512.
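The sizes in the two steps above follow from simple arithmetic, which this sketch spells out as constants. All of the constant names and the SliceOffset method are made up for illustration, and it assumes each cog keeps a 32-long slice as in the four-cog demo; only DataIn, DataOut, and the numbers 192 and 768 come from the steps above.

```spin
CON
  COGS          = 6
  LONGS_PER_COG = 32                    ' each cog's slice, as in the four-cog demo
  ARRAY_LONGS   = COGS * LONGS_PER_COG  ' 192 longs per array
  ARRAY_BYTES   = ARRAY_LONGS * 4       ' 768 bytes: the value NewPtr needs
  SLICE_BYTES   = LONGS_PER_COG * 4     ' 128 bytes: the step between adition values

VAR
  long DataIn[ARRAY_LONGS]
  long DataOut[ARRAY_LONGS]

PUB Main
  longfill(@DataIn, 0, ARRAY_LONGS)     ' clear both arrays before starting any cog
  longfill(@DataOut, 0, ARRAY_LONGS)
  ' start the six testcog objects here, each given @DataIn

PUB SliceOffset(n) : offset
  '' Byte offset (the adition value) for cog n, where n = 0..COGS-1.
  '' The largest is 5 * 128 = 640, which is below ARRAY_BYTES.
  offset := n * SLICE_BYTES
```

Keeping every adition value below the byte size of one array is what stops a cog's DataOut pointer from running past the end of DataOut.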
Test_Demo - Archive [Date 2009.05.21 Time 10.42] ... is a correction of your code to get testcog{1..6} to work properly.
Test_Demo_Alternate - Archive [Date 2009.05.21 Time 10.56] ... is an example using only one testcog that has the same functionality as your original code intended.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Beau Schwabe
IC Layout Engineer
Parallax, Inc.
Post Edited (Beau Schwabe (Parallax)) : 5/21/2009 4:04:49 PM GMT
I'm so stupid, the same issue again. Sorry.
@Jeff & @Beau,
The idea of all this is to start the six cogs with the same starting address, and then add, in each cog, the displacement for its own use.
As Jeff told me, doing the bytefill before starting the cogs is fine, but what will happen when, in the internal loop of the code, I need to update the data using bytefill?
The idea of the task is something like this:
1- I define a "Frame" array of xx bytes and initialize it.
2- I start six cogs using that array's data; 1/6 of the data is for each cog.
3- Some external device gives me a new array of data and I save it in another array, let's call it "NewFrame[xx]".
4- I pass the data from the "NewFrame" array to the first "Frame" array.
5- I loop back to point 3.
Since my external device is updating the data constantly while the cogs are using the "Frame" array, it's impossible for me to use only one array. I hope that is clear.
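A rough sketch of the five steps above, just to make the idea concrete. FRAME_BYTES and ReadNewFrame are hypothetical: the size stands in for "xx" and the method stands in for whatever reads the external device. The cog start calls are left as a comment because they depend on the driver objects.

```spin
CON
  FRAME_BYTES = 192                     ' placeholder for "xx"

VAR
  byte Frame[FRAME_BYTES]               ' read by the six cogs
  byte NewFrame[FRAME_BYTES]            ' filled from the external device

PUB Main
  bytefill(@Frame, 0, FRAME_BYTES)      ' step 1: initialize before any cog starts
  ' step 2: start the six cogs here, each given @Frame
  repeat                                ' step 5: loop back to step 3
    ReadNewFrame(@NewFrame)             ' step 3: collect the new data
    bytemove(@Frame, @NewFrame, FRAME_BYTES)   ' step 4: copy it into Frame

PRI ReadNewFrame(addr) | n
  ' Hypothetical stand-in for the external device: writes a counting pattern.
  repeat n from 0 to FRAME_BYTES - 1
    byte[addr][n] := n
```

During the bytemove a cog could read a partly updated frame; whether that matters depends on what the drivers do with the data.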
And this is the easy part of what I need to do.... :(
Thanks a lot!
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Regards.
Alberto.