1) If the methods never call each other, then I suggest using variant c). You then need only one variable for all subroutines.
The difference between b) and c) is that you can use the res keyword only at the end of your code, after all long, word or byte declarations. Otherwise your code will not work properly, and the compiler will not warn you about it.
So you can only use b) to declare a local variable.
2) Your variables x1, x2 and so on are temporary variables, so you should name them tmp1 and so on; x is more commonly used for coordinates. That would make the code more readable for others.
The main goal should be to minimize your code. So when you do not need separate variables for different routines, try to share the temporary variables. I would recommend writing a comment on each routine that describes its use of variables.
janb said...
Does it make sense to declare 2 local variables of the same name ':x1' in both subroutines
to save one register ?
I don't know what you mean. The Propeller doesn't have separate registers: every register is just a long in Cog memory that is used as a variable. So if you declare 2 (local) variables, they take 2 longs of Cog memory.
Hi,
could you help me to understand why the following code works correctly:
after certain pins reach desired state it reads 2 bytes from upper pins 16-31 and stores the value in
3 consecutive local variables o21,o22,o23
This is what I want.
However ...
'-------------------------
CaptureFrame_slow
        waitpeq frameState, frameMask   'wait for new frame
        waitpeq lineState, lineMask     'wait for start condition
        mov     n, #3
:newPix waitpeq pixState, pixMask       'wait for start condition
        mov     x, ina                  'store pins state
        waitpeq zero, pixMask
        shr     x, #16
:save   mov     o21, x
        add     :save, d_inc            'increment destination in instruction above
        djnz    n, #:newPix             'go for next transition
d_inc   LONG    $0000_0200
CaptureFrame_slow_ret ret
....
o20 long 0
o21 long 0
o22 long 0
o23 long 0
The full code is much longer (~200 lines). If I move the definition of
d_inc LONG $0000_0200
way back to the end of the ASM code (I wanted to reuse this constant in many places to save COG memory),
the code seems not to increment the address in the ':save' instruction. It overwrites the value of the variable o21 three times. Why is that?
Thanks
Jan
janb,
One obvious mistake is that you've put d_inc (a constant) between two instructions where it will be executed as an instruction. Move it at least to after the CaptureFrame_slow_ret. It shouldn't really matter though since this value would be executed as a NOP. Also, do you reinitialize the instruction at :save?
What you've posted (at least the store part) ought to work. I don't see anything else that's obvious. I'd have to see the whole thing to understand why moving the constant causes a problem. If you have some RES statements before the constant, that would throw things off.
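For anyone following along, here is a rough Python model of why adding $0000_0200 bumps the destination of the instruction at :save, and why the same value is harmless if it gets executed. It assumes the usual Propeller 1 instruction layout (source in bits 8..0, destination in bits 17..9, condition in bits 21..18). The cog addresses $021 and $030 are made up for illustration, and the opcode bits are left at zero because only the address fields matter here.

```python
# Field layout of a 32-bit Propeller 1 instruction (bit positions).
DEST_SHIFT = 9        # destination field: bits 17..9
COND_SHIFT = 18       # condition field:   bits 21..18
FIELD9 = 0x1FF        # both address fields are 9 bits wide
D_INC = 0x0000_0200   # the d_inc constant from the thread, i.e. 1 << 9


def dest(instr):
    """Destination address field of an instruction."""
    return (instr >> DEST_SHIFT) & FIELD9


def src(instr):
    """Source address field of an instruction."""
    return instr & FIELD9


def cond(instr):
    """Condition field; %0000 means 'never execute', so it acts as a NOP."""
    return (instr >> COND_SHIFT) & 0xF


def with_dest(instr, d):
    """Rough equivalent of MOVD: replace only the destination field."""
    return (instr & ~(FIELD9 << DEST_SHIFT)) | ((d & FIELD9) << DEST_SHIFT)


# Stand-in for ':save mov o21,x' with hypothetical addresses o21=$021, x=$030.
save = with_dest(0x030, 0x021)

dests = []
for _ in range(3):
    dests.append(dest(save))               # where this pass would store x
    save = (save + D_INC) & 0xFFFF_FFFF    # add :save,d_inc

print([hex(d) for d in dests])   # ['0x21', '0x22', '0x23']
print(cond(D_INC))               # 0
```

The source field is untouched by the add, which is why the same x keeps being stored while the target walks through o21, o22, o23.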
Hi Mike,
The full OBJ code is attached. Unfortunately it is not short, and it cannot be run on its own - more objects are needed.
The problematic subroutine is
CaptureFrame_slow
waitpeq frameState, frameMask 'wait new frame
waitpeq lineState, lineMask 'wait for new line
mov n,lineLen
:newPix waitpeq pixState, pixMask 'wait for new pixel
mov x, ina 'acquire new value
and x,dataBusMask
shr x,dataBusPin0
:save2 mov CogArr,x
add :save2,d_inc 'increment destination in instruction above
waitpeq zero, pixMask
djnz n, #:newPix 'go for next transition
CaptureFrame_slow_ret ret
d_inc LONG $0000_0200
I moved d_inc below the subroutine as you suggested.
The code as is works.
But if I move this one line to the end, the line
add :save2,d_inc
does nothing and the data pile up at the first address of 'CogArr'.
If you have time, could you advise me whether the order of local/global variables is optimal?
I started with your earlier advice to declare local temporary variables in every subroutine, but this reduced the available space for my big local array - so now there are a few working variables declared at the end, e.g. ':n' and 'n'.
Also the lines
mov x, ina 'acquire new value
and x,dataBusMask
shr x,dataBusPin0
are meant to acquire one value from a bus made of K pins starting from pin0 - is this optimal?
Thanks
Jan
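As a side note, the mask-and-shift step in those three lines can be checked off-chip with a small Python sketch. The names `bus_mask` and `read_bus` are mine, not from the attached object, and the pin numbers in the example are arbitrary.

```python
def bus_mask(pin0, k):
    """Mask with K consecutive bits set, starting at bit pin0."""
    return ((1 << k) - 1) << pin0


def read_bus(ina, pin0, k):
    """Same steps as the PASM: and x,dataBusMask then shr x,dataBusPin0."""
    return (ina & bus_mask(pin0, k)) >> pin0


ina = 0xABCD_1234                    # pretend snapshot of the 32 input pins
print(hex(read_bus(ina, 16, 16)))    # 0xabcd
```

If the bus occupies the top pins (as with pins 16-31 in the first version), the `and` is redundant because the shift already discards the lower bits.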
1) You do have variables at the end in the pattern "long ... long ... res ... long ... long" and this is not legal. The RES statements must come last (although a FIT statement can follow it). No LONG statements or instructions may follow a RES in that DAT section. There are some exceptions to that rule, but that's a longer discussion.
2) You're using :d_inc for many of the copies of $0000_0200. That won't work if you want to use a single copy of the constant. Put the "d_inc LONG $0000_0200" at the end (before the RES) and change all the ":d_inc" to just "d_inc".
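Here is a rough Python model of why a LONG after a RES goes wrong. It assumes my understanding of the assembler: RES advances the cog address counter but emits nothing into the image that gets copied into the cog, while LONG does both. The declaration names in the example are only illustrative.

```python
def layout(decls):
    """decls: list of (name, kind, count) with kind 'long' or 'res'.

    Returns (symbol_addr, image): the cog address the assembler assigns to
    each name, and the names of the longs actually emitted into the image.
    """
    symbol_addr = {}
    image = []
    addr = 0
    for name, kind, count in decls:
        symbol_addr[name] = addr
        addr += count                      # both kinds advance the cog address
        if kind == "long":
            image.extend([name] * count)   # only LONG emits data
    return symbol_addr, image


def misplaced(decls):
    """Names of LONGs that get loaded at a different address than their label."""
    symbol_addr, image = layout(decls)
    # When the image is copied into the cog, image[i] lands at cog address i.
    return [name for name in symbol_addr
            if name in image and image.index(name) != symbol_addr[name]]


bad = [("tmp", "long", 1), ("CogArr", "res", 4), ("d_inc", "long", 1)]
good = [("tmp", "long", 1), ("d_inc", "long", 1), ("CogArr", "res", 4)]

print(misplaced(bad))    # ['d_inc']
print(misplaced(good))   # []
```

In the bad ordering the label d_inc points at address 5 but the value $0000_0200 is loaded at address 1, so `add :save2,d_inc` adds whatever happens to sit at address 5 instead.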
- When you modify an instruction, it is good style to clearly mark the modified part as 0 (or #0 if appropriate) - some even use 0-0.
That way you will always be reminded that you have to initialize it, though there will be rare cases where the code is performed only once and a static preset will do....
- There is no such concept as "local variables" in a COG, just "local names". These NAMES are valid between two non-local names only! This can lead to dangerously inserting variables (LONG) within the instruction flow...
The idea behind this practice - to save space - is however absolutely wrong. The idea of protecting local static variables by hiding their names, on the other hand, is good and valid. Because of the JMPRET conventions this can only be accomplished using an additional "jump-over".
- Saving memory (i.e. registers!!) by overlaying local variables (i.e. using registers in the old-fashioned way) is a tricky practice needing much discipline, experience and a good naming practice! Nested routines will always play tricks with your best intentions.
deSilva said...
- Saving memory (i.e. registers!!) by overlaying local variables (i.e. using registers in the old fashioned way) is a tricky practice needing much discipline, experience and a good naming practice! Nested routines will always play tricks with your best intentions
It's easy to run out of space on a cog. The two approaches I've used up to now to reuse variable memory have been:
The register approach.
r0 RES 1
r1 RES 1
r2 RES 2
Then taking pains to work out where each can be reused.
And the multiple labels on shared memory approach.
foo
bar res 1
x
trumpet res 1
y
trombone res 1
This makes for more clarity in reading the code, but far more difficulty in working out which locations can be reused.
But it occurs to me now that this is purely a matter of scope, which a high level language will deal with by means of having local variables on the stack, and safely reusing memory that way. So how about using a naming convention to indicate scope. e.g. When writing a TV driver, you might have the following structure:
scanline loop
frame loop
set background
process sprites
do sprite
So the rule for whether you can share a memory location is to compare the scope part of each variable name, and you can reuse if it is different. Different, not just shorter.
Scanline_count RES 0
ScanlineFrame_count
ScanlineSprites_count res 0
ScanlineFrame_foo
ScanlineSpritesSprite_bar res 0
What do you think? It won't appeal to those who like their assembly code terse, that's for sure. And scope names would need to be short to fit in the 30 char identifier limit. I think it might be a workable solution to the problem. But I'd have to try it out for real to get a feel for how useful it is.
Post Edited (CardboardGuru) : 6/24/2007 2:24:24 PM GMT
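The "different, not just shorter" rule could be checked mechanically. Below is a minimal Python sketch under my own assumptions about the convention: the scope is everything before the last underscore, each capitalized word is one nesting level, and a name without a prefix has file scope. Two names may share a location only if neither scope path is a prefix of the other.

```python
import re


def scope_path(name):
    """'ScanlineSpritesSprite_bar' -> ['Scanline', 'Sprites', 'Sprite']."""
    scope, _, _ = name.rpartition("_")   # no underscore -> empty scope
    return re.findall(r"[A-Z][a-z0-9]*", scope)


def can_share(a, b):
    """True if the two variables may overlay the same cog long."""
    pa, pb = scope_path(a), scope_path(b)
    n = min(len(pa), len(pb))
    # If the shorter path is a prefix of the longer one (or they are equal),
    # both values can be live at the same time, so sharing is unsafe.
    return pa[:n] != pb[:n]


print(can_share("ScanlineFrame_count", "ScanlineSprites_count"))   # True
print(can_share("Scanline_count", "ScanlineFrame_count"))          # False
print(can_share("count", "ScanlineFrame_foo"))                     # False
```

A file-scope name has an empty path, which is a prefix of everything, so it never shares with anything.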
(1) There can be variables in a routine that are used statically, i.e. some "local memory"; they are kind of global and must be left alone under all circumstances (thus the need for a special naming convention).
(2) In most cases a "static call tree" can be constructed, i.e. a simple graph showing who calls whom. As recursion is not straightforward with the Propeller, it is generally assumed that this is always possible, but it is not!
Consider a routine A calling B, depending on a flag - then, using the same global flag, B calls C.
Now with the flag unset, A could call C and C then B, without violating any law except the law of good programming practice.
Thus it is necessary for any rules to have an unambiguous static call tree in the first place!
Under those circumstances, all routines can use any variables not in their calling path; this is similar to what CardboardGuru has in mind.
(3) Alas, issues already start earlier within longer routines, e.g. using "I" and "J" as loop indexes.... Note however that there is no help for this even in the highest-level languages (except of the functional kind!).
I sometimes used extensive documentation of the kind:
"Using: R1, R2, R3"
and
"Free: R3"
which can help, when painstakingly updated with each change...
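The "not in their calling path" rule from (2) can be sketched in Python as follows. This is only a model: the call graph is written down by hand, the routine names are made up (with B and C calling each other, as in the flag example), and two routines are allowed to share temporaries only if neither can reach the other.

```python
def reachable(calls, start):
    """All routines that can end up being called, directly or not, from start."""
    seen = set()
    stack = list(calls.get(start, []))
    while stack:
        routine = stack.pop()
        if routine not in seen:
            seen.add(routine)
            stack.extend(calls.get(routine, []))
    return seen


def can_share_temps(calls, a, b):
    """True if a and b are never active at the same time in this call graph."""
    if a == b:
        return False
    return b not in reachable(calls, a) and a not in reachable(calls, b)


# Hypothetical call graph: A calls B or C depending on a flag, B and C call
# each other, D is an unrelated leaf routine.
calls = {"Main": ["A", "D"], "A": ["B", "C"], "B": ["C"], "C": ["B"], "D": []}

print(can_share_temps(calls, "A", "D"))   # True
print(can_share_temps(calls, "B", "C"))   # False
print(can_share_temps(calls, "A", "B"))   # False
```

The B/C cycle is exactly the ambiguous case: the check refuses to share between them, which is the safe answer when the call tree is not a tree.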
deSilva said...
(1) There can be variables in a routine that are used statically, i.e. some "local memory"; they are kind of global and must be left alone under all circumstances (thus the need for a special naming convention).
Then they'd have larger scope, such as the Scanline_count variable I gave. Some variables will have file scope - they'd not have any scope prefix, and couldn't be reused for anything else.
deSilva said...
(2) In most cases a "static call tree" can be constructed, i.e. a simple graph showing who calls who. As recursion is not straightforward with the Propeller, it is generally assumed that this is always possible, but it is not!
Consider a routine A calling B, depending on a flag - then, using the same global flag, B calls C.
Now with the flag unset, A could call C and C then B, without violating any law except the law of good programming practice.
Thus it is necessary for any rules to have an unambiguous static call tree in the first place!
True. It wouldn't work automatically, you'd have to be intelligent about what scope you gave to a variable. It's just a notational way of expressing what you know about how long a variable's value remains used. Where there is doubt, you'd give the widest scope to a variable. Perhaps file scope - which means may not be reused for anything.
It wouldn't be a replacement for the dynamic scoping that comes with a stack. But then I haven't yet seen any prop code complicated enough that it would need dynamic scoping.
Thanks Guys!
This one will be a simple one: how do I multiply two values in assembler?
Assume registers x and y hold two arbitrary (not too large) integers.
How do I get their product into register z?
Do I need to make a loop and add X to register Z Y times, like in kindergarten?
Jan
There was a long posting by Chip soon after the Propeller was released containing, among other things, multiplication, division, and square root routines. I don't remember the link to it, but I've attached a copy I had on hand. It was under the subject "Propeller Guts".
I see,
so it is like in kindergarten, except that instead of adding X Y times you shift and sum for the powers of 2 present in Y.
Very tricky!
Thanks
Jan
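For reference, the shift-and-sum idea looks like this in Python. It is only a sketch of the general technique, not a transcription of Chip's routine from the attachment; the 16-bit operand width is my assumption so that the product fits in one 32-bit long.

```python
def mul_shift_add(x, y, bits=16):
    """Multiply two unsigned integers using only shifts and adds."""
    z = 0
    for _ in range(bits):
        if y & 1:        # this power of 2 is present in y
            z += x       # so add the correspondingly shifted x
        x <<= 1          # next power of 2
        y >>= 1
    return z & 0xFFFF_FFFF   # keep the result to one 32-bit long


print(mul_shift_add(123, 45))   # 5535
```

The loop runs once per bit of y rather than y times, which is where the saving over repeated addition comes from.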
Hi Paul,
Good to hear.
I counted - this 'mul' emulation costs ~55 instructions (~250 clock ticks) - much more than the 1 instruction per 'add'.
So I'm forced to rethink my ASM algorithm.
An obvious question: when? Will it still be a Propeller or something different?
Thanks for the feedback
jan
No answer as to when. It will be code compatible with the current Propeller, but there will be enough enhancements that there won't be absolute compatibility.
Sorry
Well, I never expected the Propeller II to be fully bit-code compatible - you squeezed out too many bit fields in the Propeller I.
And - of course - marketing would not allow you to point out any timings
Hi Guys,
I'm stuck. The following code works properly:
'---------------------------
ClearArray
mov x,#12
movd :self,#CogArr 'reset address of Cog array
mov i,bufLen ' clear only used portion of the buffer
:self mov 0-0,x
add :self,d_inc 'increment destination in instruction above
djnz i, #:self 'continue till end of line
'mov n,#22
ClearArray_ret ret
It fills the internal COG array 'CogArr' of length 'bufLen' with the value 12.
But if I enable one more line in this routine, mov n,#22,
which changes some other register, it causes the routine to write zeros instead of 12 to my array.
I do not get this entanglement.
The full code is attached.
Please advise
Jan
Hi,
With time and patience I solved my previous problem: the ASM code was overwriting itself.
Perhaps you could help me with a new problem?
I'd like to measure exactly (to 1 prop clock tick accuracy) the time of some pin going from high to low. The time interval is of the order of ~20 prop ticks.
The following code works only approximately; according to the manual waitpeq takes 5+ clock ticks, so it is not fully deterministic.
waitpeq zero, pixMask
mov T1,cnt
waitpeq pixMask, pixMask
waitpeq zero, pixMask
mov T2,cnt
zero long 0
Since I want to measure deltaT only once, would it be possible to use a prop counter instead?
waitpeq zero, pixMask
'??? activate ctra, frqa,dira
'wait fixed amount of time, slightly longer then expected measurement time
'extract content of ctra to get T2-T1
Can it be done? Would it be more accurate, provided there are enough clock ticks to initialize the counter while the pin pointed to by pinMask is low?
Any suggestions? Perhaps there is some code somewhere that I could look at?
Thanks
Jan
The easiest thing is to set up the counter ahead of time to use mode %01100 which counts clock cycles whenever the selected pin is zero. Your program would do a "waitpne zero,pixMask". At that point, phsa or phsb would contain the number of clock cycles the pin was low or it would contain zero (or whatever its previous value was). If T2-T1 is zero, there was no pulse. If T2-T1 is greater than zero, there was a pulse and you have its width in 12.5ns clock cycles (with an 80MHz system clock).
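To get a feel for the numbers, here is a rough Python model of that counter mode. It assumes the counter simply adds FRQ to PHS on every system clock in which the pin reads low, and it ignores the internal sampling delay of the real hardware; the pin waveform in the example is invented.

```python
CLKFREQ = 80_000_000   # 80 MHz system clock, as in the post above


def low_ticks(samples, frq=1):
    """Model of the 'count while pin is low' mode.

    samples holds one pin reading (0 or 1) per system clock.
    Returns the final PHS value, starting from 0.
    """
    phs = 0
    for pin in samples:
        if pin == 0:
            phs = (phs + frq) & 0xFFFF_FFFF   # PHS is a 32-bit register
    return phs


def ticks_to_ns(ticks, clkfreq=CLKFREQ):
    """Convert system clock ticks to nanoseconds."""
    return ticks * 1e9 / clkfreq


pulse = [1] * 5 + [0] * 20 + [1] * 5   # pin high, low for 20 clocks, high again
ticks = low_ticks(pulse)
print(ticks, ticks_to_ns(ticks))        # 20 250.0
```

With FRQ set to 1 the reading is directly in clock ticks, so a ~20 tick pulse comes out as about 250 ns at 80 MHz.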
Hi,
Thanks again.
The code below does work! It measures separately the duration of the negative and positive state of pin#5 and stores 4 count values in variables t0,...,t3.
Now I want to do more complex stuff:
Q1: to measure the time of a coincidence state of 2 pins I should use mode %10001 or %10010, etc., right?
Q2: what is the difference between modes %01100 and %10101 ?
Thanks
Jan
:modeNEG long %01100 <<26
:modePOS long %01000 <<26
:mask long 1<<5
:pin long 5
mov frqb,#1 ' increment cog counter B by 1 per clock tick (at 80 MHz)
'negative state duration
mov x,:modeNEG
add x,:pin
mov ctrb,x ' count while pin#5 is LOW
waitpeq :mask, :mask ' wait until counter stops counting 1st time
mov t0,phsb ' save counter value
waitpeq zero, :mask ' give it a chance to count again
waitpeq :mask, :mask ' wait until it stops counting 2nd time
mov t1,phsb
'positive state duration
mov x,:modePOS
add x,:pin
mov ctrb,x ' count while pin#5 is HIGH
waitpeq zero, :mask ' wait until counter stops counting 1st time
mov t2,phsb ' save counter value
waitpeq :mask, :mask ' give it a chance to count again
waitpeq zero, :mask ' wait until it stops counting 2nd time
mov t3,phsb
Jan,
Yes, to measure a logical coincidence use mode %10001 or %10010.
I believe modes %01100 and %10101 are functionally equivalent. I suspect the logic was easier this way than trying to add some additional conditions for the duplicate mode values.
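One way to see the relationship between the logic modes, sketched in Python. It rests on my reading of the manual's counter table: the mode sits in bits 30..26 of CTRx, APIN in bits 5..0 and BPIN in bits 14..9, and for the logic modes (%1xxxx) the low four mode bits form a truth table indexed by (B<<1)|A. Treat the truth-table reading as an interpretation, not a quote from the manual.

```python
def ctr_value(mode, apin, bpin=0):
    """Assemble a CTRx value, like ':modeNEG long %01100 << 26' plus the pin."""
    return (mode << 26) | (bpin << 9) | apin


def logic_accumulates(mode, a, b):
    """For a logic mode, does PHS accumulate when the pins read a and b?"""
    if not mode & 0b10000:
        raise ValueError("not a logic mode")
    return bool((mode >> ((b << 1) | a)) & 1)


print(hex(ctr_value(0b01100, 5)))            # 0x30000005
print(logic_accumulates(0b10001, 0, 0))      # True: counts while both pins low
print(logic_accumulates(0b10010, 1, 0))      # True: counts while A high, B low

# %10101 counts whenever A is low, whatever B does.
print([logic_accumulates(0b10101, a, b) for b in (0, 1) for a in (0, 1)])
# [True, False, True, False]
```

Under this reading %10101 depends only on A being low, which is the same condition as %01100, and the A-and-B-both-high coincidence would be %11000.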
Hi Jan,
To clarify a point, waitpeq is deterministic with respect to the pin state. The reason it is listed as 5+ is it takes 4 clocks to process the instruction, plus however many cycles of compare necessary to achieve the wait state. If the value is true at the beginning it will take 5 cycles since only one compare cycle occurs. For situations where more than one compare cycle occurs, the next instruction begins execution on the next clock cycle after a comparison evaluates true.
Hi,
Thanks for the explanation. I'm more interested in the case where a cog stops at waitpeq and then waits for a given pin state - in my example the pin goes high at the beginning of a new frame from a CMOS camera, and data transmission starts shortly after.
How much time will pass between the frame pin (driven by the camera) going high and the execution of the next ASM instruction in the cog after waitpeq?
I need to know this, since the first image data in the frame will show up e.g. 1 usec after the frame pin goes high, and the pixel data will change every 50 ns (i.e. one ASM instruction time). I need to know very precisely which pixel I'm reading and which I'm skipping.
Thanks
Jan
The next instruction will begin executing on the next clock cycle. Since each clock cycle is 12.5 ns at 80 MHz, the first stage of the next instruction will begin executing between 0 and 12.5 ns after the line goes high, since that is the resolution of the comparator.
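The timing budget in this exchange is simple arithmetic; a small Python sketch, assuming the 80 MHz clock and 4 clocks per ordinary instruction mentioned in the thread, and the 1 usec delay from the question:

```python
CLKFREQ = 80_000_000      # system clock in Hz
CLOCKS_PER_INSTR = 4      # ordinary PASM instruction


def clock_ns(clkfreq=CLKFREQ):
    """Length of one system clock in nanoseconds."""
    return 1e9 / clkfreq


def instr_ns(clkfreq=CLKFREQ):
    """Length of one ordinary instruction in nanoseconds."""
    return CLOCKS_PER_INSTR * clock_ns(clkfreq)


def clocks_in(ns, clkfreq=CLKFREQ):
    """Number of whole system clocks in a time span given in nanoseconds."""
    return round(ns / clock_ns(clkfreq))


print(clock_ns())                              # 12.5
print(instr_ns())                              # 50.0
print(clocks_in(1000))                         # 80
print(clocks_in(1000) // CLOCKS_PER_INSTR)     # 20
```

So between the frame pin going high and the first pixel there is room for about 20 ordinary instructions, with an uncertainty of at most one 12.5 ns clock in where the first of them starts.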
@Mike
Thomas
So how about naming vars starting with the scope.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Paul Baker
Propeller Applications Engineer
Parallax, Inc.
thanks a lot. So my code should look like:
I'll give it a try. I appreciate the immediate feedback; it is a lot of fun to work in such an environment.
Jan