Questions about the P2 instruction set for anyone but Chip

msrobots · 2015-10-24 01:52

I think the event (interrupt?) driven mailboxes are a really neat way to pipeline processing between cogs.

Unlike the P1, the P2 has some special mailboxes that can trigger an interrupt (one per cog? not sure).

Otherwise, a hub read or write of a long is atomic, so there is no need for semaphores. That is the same for the P1 and the P2.

Locks are only needed if more than one long has to be changed atomically. Then you need some lock/semaphore mechanism to ensure data integrity across multiple cogs.
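
A rough host-side illustration of the point about locks, written in Python rather than PASM. Two threads stand in for cogs and a `threading.Lock` stands in for a hub lock; the two-value record and its "a + b == 0" invariant are made up for the example and are not taken from any P2 code.

```python
import threading

# Hypothetical shared record: two "longs" that must always change together.
record = {"a": 0, "b": 0}
lock = threading.Lock()              # stands in for a hub lock/semaphore

def writer(count):
    for _ in range(count):
        with lock:                   # take the lock before touching either long
            record["a"] += 1
            record["b"] -= 1         # invariant: a + b == 0

def reader(count, bad):
    for _ in range(count):
        with lock:                   # read both longs under the same lock
            if record["a"] + record["b"] != 0:
                bad.append(dict(record))

bad = []
threads = [threading.Thread(target=writer, args=(10000,)),
           threading.Thread(target=reader, args=(10000, bad))]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(record, len(bad))
```

A single value would not need the lock at all; it only earns its keep because the reader must never see `a` updated without `b`.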

And maybe the still upcoming smart pins have some interesting fast data links between cogs on one or more props without hub access. Just share a smart pin.

Enjoy!

Mike

rjo__ · 2015-11-15 03:32

using cordic to rotate a point.

from Chip's instruction list we have:

CCCC 1101010 0LI DDDDDDDDD SSSSSSSSS QROTATE D/#,S/#

And from his online documentation:

"Rotate (X32,Y32) by Theta32 → (X32,Y32)"

This makes it look like the syntax should be QROTATE D/S/theta.

In the old calls to the cordic there was a setqx and setqy... do we set theta now?

When we say "X32"... is decimal notation implied?

When we specify the angle... is the unit 1/2^32 of a full turn, or are we using some decimal scaling here?

Thanks

Electrodude · 2015-11-15 20:35

Have you tried SETQ theta, QROTATE x, y (or y, x)? I thought that was what SETQ was originally intended for before it also became used for block hub transfers?

I don't think it makes any difference if you think of X32 as referring to an integer or a fixed-point number. You should be able to put your decimal (or binary) point wherever you want. For theta, I would guess that 2^32-1 = 360 degrees.
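
To make the angle guess concrete, here is a small Python model (not the CORDIC algorithm itself). It assumes the usual binary-angle convention where a full turn is 2^32 counts, so $FFFF_FFFF is one count short of 360 degrees; that is close to, but not exactly, the 2^32-1 guess above. The rotation direction and the lack of any output scaling are also assumptions, not something confirmed by Chip's documentation.

```python
import math

FULL_TURN = 1 << 32   # assumption: 2^32 angle counts per 360 degrees

def degrees_to_theta32(deg):
    """Convert degrees to a 32-bit binary angle (wraps at a full turn)."""
    return round(deg / 360.0 * FULL_TURN) % FULL_TURN

def rotate(x, y, theta32):
    """Floating-point model of rotating (x, y) by a 32-bit angle."""
    a = theta32 / FULL_TURN * 2.0 * math.pi
    return (round(x * math.cos(a) - y * math.sin(a)),
            round(x * math.sin(a) + y * math.cos(a)))

theta = degrees_to_theta32(90)
print(hex(theta))
print(rotate(1000, 0, theta))
```

Because X and Y are only multiplied by sines and cosines, the binary point can sit anywhere, as long as both inputs use the same one.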

rjo__ · 2015-11-15 22:55

Thank you! Have the rugrat right now so it is a matter of who sleeps first.

rjo__ · 2015-12-04 01:54

As usual I cannot see the obvious.

I am trying to do something very simple... copy the contents of cog ram into lut ram.
This should be simple... we have wrlut. We have @,#,and ## ... and we have a source and a destination... without knowing anything there should be a limited number of ways to do it wrong. In this case, I know how to get the destination right... so this gives me only

@
@#
#@
#
@##
##@
###
for the source.

I've tried most of these, but since I'm not taking notes and keep getting interrupted, I can't remember which ones:)

Electrodude · 2015-12-04 02:11

You want # or ## since you want to give it an immediate source value, and you don't need ## since all cogram addresses fit in 9 bits. You don't want @ since you want a cogram address and not a hubram address.

Therefore, I would think wrlut #0, #0 should copy all of cogram into lutram.

What does ### mean? I've never seen it before.

rjo__ · 2015-12-04 02:38

Thanks

I was trying to do it with a djns loop not with rep ... makes perfect sense though.
as I recall Chip used just 0 for the source... so maybe just 0,0. I'll try it.
But first... a beer:)

rjo__ · 2015-12-04 02:42

To get ### I added # to ##, but I could just as easily have added ## to #.

kwinn · 2015-12-04 02:51

rjo__ wrote: »

Thanks

I was trying to do it with a djns loop not with rep ... makes perfect sense though.
as I recall Chip used just 0 for the source... so maybe just 0,0. I'll try it.
But first... a beer:)

Got to lubricate the thinking machinery ;-)

Ariba · 2015-12-04 05:47

In this special case a code snippet may help more than a beer

To copy the whole cog RAM I would do it like this:

        mov   i,#$1FF           'index 511..0
loop    altd  i,#0              'index+offset = cog addr
        wrlut 0-0,i             'copy cog to lut
        djns  i,#loop           'repeat until addr 0 done

No # or @ is involved here.

Andy
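
For anyone following along without an FPGA board, the loop above can be modelled in a few lines of Python. This is only a sketch based on the comments in Andy's snippet (index runs 511 down to 0, and cog register i lands in LUT location i); the array contents are made up, and the exact semantics of `altd` and `djns` in the current FPGA image are not modelled.

```python
COG_SIZE = 512                                  # 9-bit cog address space, $000..$1FF

cog = list(range(1000, 1000 + COG_SIZE))        # made-up cog contents
lut = [0] * COG_SIZE

i = 0x1FF                                       # mov i,#$1FF
while True:
    lut[i] = cog[i]                             # altd + wrlut: cog register i -> LUT address i
    i -= 1                                      # djns: decrement the index...
    if i < 0:                                   # ...and fall through once address 0 is done
        break

print(lut == cog)
```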

rjo__ · 2015-12-04 06:29

While I was stumbling around in the fog, I came close to an epiphany... or an epiphora. Pretty foggy; hard to tell.

Thanks Ariba!!!!

I think that little snippet is a good argument for the general case:)

rjo__ · 2016-01-31 17:03

I have a routine, which runs in its own cog. The entire code is less than 200 lines, including variables. There are about 25 constants. The routine ends with a "fit $1F0" statement.

Everything was going along fine until I added a very simple statement, which couldn't blow anything up... and then everything blew up. I spent a couple of days looking at my logic and eventually refined it so that it used fewer instructions and variables and then everything was ok again. I am left with the impression that even though the code is short and I don't get any warnings about space, somehow, my cog's usable programming space is being impacted by the variable space being used by other cogs... is this true? Or is there a limit on the total code size of a .spin file?

rjo__ · 2016-01-31 17:07

'**************************************************************************************************************
                         org

stereo_point 
               cogid cogcom                   'wait for signal to start
               add cogcom, ##1000000          
               rdbyte start_analysis,cogcom
               cmp start_analysis,#1 wz                    'watch byte at hub memory address ##1000008; wait until it contains the value "1".
               if_nz jmp #stereo_point
               mov horz,#3
               mov x0,#0
             
               wrbyte #0, cogcom
set_up1        mov verts,#20
               mov y0,#24
               add x0,#40
set_up2        
              add y0,#8
              mov confused_points,#0
               
              
               mov var1,#30                        'parameter 1, tolerance of pixel difference to establish "goodness"
               
               mov reg_w,#5                    'search and target region dimensions
               mov reg_h,#3
               mov target_line,y0
               mul target_line,#320
               add target_line,##_Screen_Buffer       'hub address when x=0 and y=y0
               mov target_point,target_line
               add target_point,x0 
               mov max_offsets,#160
               sub max_offsets,reg_w
               mov reg_h_off,reg_h
               sub reg_h_off,#1
               shr reg_h_off,#1
               mov reg_w_off,reg_w
               sub reg_w_off,#1
               shr reg_w_off,#1                      'reg_w_off/reg_h_off distance to top and left of a region from center point
     
               mov reg_offset,#320
               sub reg_offset,reg_w                  'offset from right edge point of a region to left edge point of region one line b
    
               mov reg_line,y0
               sub reg_line,reg_h_off
               mul reg_line,#320                        '!!!fix this
               add reg_line,##_Screen_Buffer         'reg_line is location of first point on line at top of region, x=0,y=y0-reg_h_off
               mov target_reg_loc,reg_line           'eventually, target_reg_loc will be upper left corner of target region
               mov search_reg_loc,reg_line           'eventually, search_reg_loc will be upper left corner of  search region
               add search_reg_loc,#160
               add target_reg_loc,x0

               sub target_reg_loc,reg_w_off           'now target_reg_loc is upper left corner of target region
               
               
            rdbyte target_val,target_point                               
               mov search_line,target_line            'search_line will hold _Screen_Buffer location of first pixel in right half of target_line
               add search_line,#160                   'find hub address when y=y0 x=160+ regionx+reg_w_off
              ' mov search_x,reg_w_off                 'x value point on right side that is being compared to target_loc

                                                     
               mov search_loc,search_line        
               add search_loc,reg_w_off               'point on right side to examine first=search_point    
               mov search_dx,#0        
            
               add search_point,search_loc

            
target_region  
               mov   hubaddr,target_reg_loc
               mov   lut_addr,#0
               mov   line_reps,reg_h

mov2lut        mov    pixel_reps,reg_w 
               rdfast 0,hubaddr
nextpix        rep @.pixrep,pixel_reps
               rfbyte  pixel_val               ',ptra++
               wrlut   pixel_val,lut_addr
               add     lut_addr,#1           
.pixrep           
               
               add     hubaddr,#320
               djnz    line_reps,#mov2lut  
                               
               


{
debug_region   
               mov     hubaddr,target_reg_loc
               add     hubaddr,##_Buffer_Size
               mov     lut_addr,#0
               mov     line_reps,reg_h
               
lpix2hub       mov     pixel_reps,reg_w

next_lutpix    rdlut  pixel_val,lut_addr
               wrbyte pixel_val,hubaddr
               add    hubaddr,#1
               add    lut_addr,#1
               djnz   pixel_reps,#next_lutpix
               
               add    hubaddr,reg_offset                           
               djnz   line_reps,#lpix2hub
}
          
               mov var_reg_loc,search_reg_loc
'single_point_match 

              
               mov    max_reg_tot,#0
               mov    best_dx,#0
               mov    search_dx,#0
doregions                         
                rdbyte target_val,target_point    
                mov hubaddr,var_reg_loc  
                add hubaddr,search_dx                                                                               'get hub pixel from search_region
                mov   lut_addr,#0
                mov   line_reps,reg_h
                mov   reg_tot,#0
                rdbyte pixel_val,search_point
                sub    pixel_val,target_val
                abs    pixel_val,pixel_val
                cmp    pixel_val,var1 wc,wz
   if_nc_or_nz  mov    line_reps,#1 
  
                                                            'cmp target_loc and search_loc points... if no match go to next
                                                                  'search_loc

.do_1_line         
                rdfast  0,hubaddr
                mov     pixel_reps,reg_w
                sub     pixel_reps,#1                             'immediate 1 (without # this subtracts the contents of register 1)
                rep     @.pixrep2,reg_w                            'start of repeat loop... does one line length=reg_w
                rfbyte  pixel_val               ',ptra++
                rdlut   pixel_val2,lut_addr                       'get pixel from target_region in LUT
                sub     pixel_val,pixel_val2                      'subtract these two
                abs     pixel_val,pixel_val                       'take absolute value
                cmp     pixel_val,var1 wc                         'compare this with var1
           if_c add     reg_tot,#1                                'if less add 1 to total for search region
                add     lut_addr,#1                               'increment lut_addr
.pixrep2                                                          'end of repeat loop
                add     hubaddr,#320               'beginning of next line of search_region in hub
                djnz    line_reps,#.do_1_line  
             
                cmp     max_reg_tot,reg_tot wc,wz
           if_z add     confused_points,#1
           if_c mov     max_reg_tot,reg_tot
           if_c mov     confused_points,#0
           if_c mov     best_dx,search_dx                              
skipper_a           

                add     search_dx,#1
                add     search_point,#1
                cmp     search_dx,max_offsets wc
           
           if_c jmp #doregions
                cmp confused_points,#0 wz
                wrbyte #253,target_point
                add best_dx,target_line
                add best_dx,#160
                add best_dx,reg_w_off
                
           if_z wrbyte #253,best_dx
                mov  search_dx,#0
                djnz verts,#set_up2
                djnz horz,#set_up1
                jmp #stereo_point
              


               
   
              

x0             res 1
y0             res 1
var1           res 1
target_val     res 1                       'value for target point region pixels... x value on left to be search for on right
search_val     res 1                      'byte value for search region pixels
search_point   res 1                      'point on right image to be examined
lutptr         res 1
hubptr         res 1
target_x       res 1
target_loc     res 1
target_point   res 1
target_line    res 1
target_index   res 1
target_reg_loc res 1
search_reg_loc res 1
search_loc     res 1
cardinal_index res 1
search_line    res 1
search_dx      res 1
search_x       res 1
which_line     res 1
region_offset  res 1
'search_loc_start res 1
start_analysis res 1
region_reps    res 1
out_line       res 1
lut_addr       res 1
reg_h          res 1
reg_w          res 1
reg_h_off      res 1
reg_w_off      res 1
reg_line       res 1
pixel_val      res 1
pixel_val2     res 1
pixel_reps     res 1
line_reps      res 1
reg_offset     res 1
var_reg_loc    res 1
reg_tot        res 1
max_reg_tot    res 1
lut_rec        res 1
best_dx        res 1
max_offsets    res 1
confused_points res 1
verts          res 1
horz           res 1

                                         fit     $1F0

rjo__ · 2016-01-31 17:10

This is the first step in a stereo-ranging application. It was working as expected for a single point, and then blew up when I tried to add loops to index the x and y of the target point.
This code now works as expected... the issue isn't the code. I'm confident of that:)

sort of.

The issue is size... isn't that always the way?

Dave Hein · 2016-01-31 21:29

Can you attach the actual spin file you're using? I tried assembling the code that you included in your second-to-last post and it has problems. First I had to add a DAT statement to get it to assemble, and then it complained that cogcom is undefined.

rjo__ · 2016-02-01 00:00

sorry... have a flu, been sleeping. To see it in all its glory... you have to adjust the baud rate and then send the ascii equivalent of zero. This is working as it should for me. I don't have the series of tries that didn't work... but add in a few variables and some nothing code and see what happens:) It expects the bmp file in the same folder.

This version is in a continuous loop... so even though it doesn't look like it, it does the same thing ... over and over and over. This interfaces to my P2Explorer program so that I can send it a continuous stream of stereo-images from a Bloggie 3D camera... for testing, etc.

This is bare-bones... region based, difference testing with no logic added in... except to exclude targets that match equally well to more than one stereo point. I very much dislike this kind of stereo matching... but it is fast. And it might be enough for some kinds of applications. Matches are indicated by white dots on the matched points. It looks at a single line, right down the center of the left image, trying to match every other point (top to bottom) all the way down the line.
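
For readers who find the PASM listing hard to follow, the matching idea described here can be sketched in Python. This is a simplified model under stated assumptions, not a translation of the cog code: the function name `best_offset`, its parameters, and the synthetic test image are all hypothetical, and the listing's early single-pixel check is left out. The image is assumed to hold the left view in columns 0..159 and the right view in columns 160..319 of each row.

```python
def best_offset(image, x0, y0, reg_w=5, reg_h=3, tol=30, half=160):
    """Return (best_dx, confused) for the target region centred on (x0, y0).

    image is a list of rows; each row holds the left view in columns
    [0, half) and the right view in columns [half, 2*half).
    """
    w_off, h_off = (reg_w - 1) // 2, (reg_h - 1) // 2
    top, left = y0 - h_off, x0 - w_off
    target = [image[top + r][left:left + reg_w] for r in range(reg_h)]

    best_dx, best_total, confused = 0, 0, 0
    for dx in range(half - reg_w):                 # max_offsets in the listing
        total = 0
        for r in range(reg_h):
            row = image[top + r][half + dx:half + dx + reg_w]
            # count pixels whose difference is inside the tolerance
            total += sum(abs(p - t) < tol for p, t in zip(row, target[r]))
        if total == best_total:
            confused += 1                          # another equally good match
        elif total > best_total:
            best_dx, best_total, confused = dx, total, 0
    return best_dx, confused

# Synthetic test image (made up): one bright patch in each half.
rows = [[0] * 160 + [200] * 160 for _ in range(10)]
for y in range(4, 7):
    for x in range(38, 43):
        rows[y][x] = 100                           # patch in the left view
    for x in range(217, 222):
        rows[y][x] = 100                           # same patch, 57 columns into the right view
print(best_offset(rows, 40, 5))
```

A result is only trusted when `confused` is 0, which mirrors the `confused_points` test before the final `wrbyte` in the listing.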

Thanks

Dave Hein · 2016-02-01 00:37

One obvious problem is that cogcom is defined in one cog, but it is used in two different cogs. The cog that did not define cogcom will be writing in a location defined by the other cog, which most likely contains another variable or instruction. You should use another variable name, such as cogcom1 for the other cog.

You should make sure that you don't have any other variable conflicts that could be causing a problem.
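
A toy Python model of why a shared name goes wrong. It assumes each cog's variables get consecutive register addresses, as a run of `res 1` lines would give; the variable lists and the base address $100 are invented for the illustration.

```python
def layout(names, base=0):
    """Assign consecutive register addresses, as a run of `res 1` lines would."""
    return {name: base + i for i, name in enumerate(names)}

# Hypothetical variable lists for two cog programs; only cog A declares cogcom.
cog_a = layout(["x0", "y0", "cogcom"], base=0x100)
cog_b = layout(["pixel_val", "line_reps", "best_dx"], base=0x100)

# Cog B's code mentions cogcom, so the name resolves using cog A's layout.
addr = cog_a["cogcom"]

# What actually lives at that address inside cog B?
victim = [name for name, a in cog_b.items() if a == addr]
print(hex(addr), victim)
```

In this made-up layout every write to `cogcom` from cog B would silently overwrite `best_dx`, which is the kind of "adding a harmless line blows everything up" symptom described earlier.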

rjo__ · 2016-02-01 04:07

ok... and that seems to have had some effect elsewhere down at the bottom of the code:)
The rest are declared where they are supposed to be in this routine.... but I need to check my other routines now... and my brain isn't working:)

See you in the morning.

rjo__ · 2016-02-07 13:58

by golly... I think that might have been the issue. Am going to play with it some more today.

rjo__ · 2016-03-03 06:07

I finally have an OV9655 camera working. Straight out of the box, no I2C stuff yet. I'm driving it with a 75.000001 MHz NCO:) If my memory serves me correctly, the OV9655 divides the input frequency by two giving an effective capture of pixels with a pixel clock of 37.5MHz. Pretty darned snazzy!!! The only thing is I wanted proof.
So, I wrote a little program to find out.

snippet
             pinsetm pm_cntPedge,#_CAM_PCLK_PIN    
             pinsetx cntPedge_x,#_CAM_PCLK_PIN    
             
             setb dira,#_CAM_PCLK_PIN    

           
             getct temp
             addct1 temp,##80_000_000
             waitct1
             pingetz temp2,#_CAM_PCLK_PIN   
             wrlut temp2,#0
               
                ret
.
.
.
pm_cntPedge     long    %1_00_01101_0000_0000_00_0_0000000000000 
cntPedge_x      long 80_000_000

This works fine for verifying the frequency of my generated clock, but when I put the camera in the mix my acquisition stops... because the pclk is now a smart pin rather than a standard input.
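
The arithmetic behind the check, as a quick Python sketch. The 80 MHz system clock is an assumption (the post only gives the 80_000_000-clock window), and the NCO frequency is rounded to 75 MHz.

```python
SYS_CLOCK_HZ = 80_000_000     # assumption: system clock, not stated in the post
WINDOW_CLOCKS = 80_000_000    # counting window from the snippet (cntPedge_x)
NCO_HZ = 75_000_000           # clock fed to the camera (rounded)

window_s = WINDOW_CLOCKS / SYS_CLOCK_HZ       # window length in seconds
expected_pclk_hz = NCO_HZ / 2                 # if the camera really divides by two
expected_edges = round(expected_pclk_hz * window_s)

def measured_hz(edge_count):
    """Convert a raw edge count from the window back to a frequency."""
    return edge_count / window_s

print(expected_edges)
print(measured_hz(expected_edges))
```

Under those assumptions the value read back from the pin should be near 37_500_000 if the divide-by-two recollection is right, and near 75_000_000 if it is not.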

No problem ... I'll just watch for edges, count some of them and look at the elapsed time ... doesn't seem to work...

Can two different cogs watch for edges on the same standard input pin?
In cases as above, I feel pretty stupid putting a waitct1 in there... how can I tell that the smart pin is done doing what I asked it to do... which in this case is to count positive edges for 80_000_000 clocks.

Thanks,

Rich

rjo__ · 2016-03-03 06:09

couple of typos in there... tonight my browser won't let me edit?

evanh · 2016-03-03 06:40

The edit button only works arbitrarily via scripting. It's missing its fall-back, one of those little hangovers of the new forums.

rjo__ · 2016-03-03 06:41

I have scripting enabled... I dunno:)

IncVoid · 2016-03-28 04:17

Sorry to bump a thread. Does anybody have any preliminary PDFs on the instruction set or theory of operation?
Can't wait for the first experimenter's board with a "Learning the Prop 2" book, or some such title.

I owe everything I know about micros to these forums, the Prop manual and LaMothe's Hydra manual, whose first half got really worn out from reading it over and over.

I suppose it would have a long section on "How it came to be" with direct quotes from the forums on why changes were made to prop 2.

evanh · 2016-03-28 11:49

THE WIND UP
How the Propeller II came to be.

evanh · 2016-03-28 11:53

:P

IncVoid,
I presume you've had a look at the stickies. Cluso has a good selection - http://forums.parallax.com/discussion/144199/propeller-ii-emulation-of-the-p2-on-fpga-boards-prop123-a7-a9-de0-nano-de2-115-etc/p1

The two document links at http://forums.parallax.com/discussion/162298/prop2-fpga-files/p1 have the most details.

rjo__ · 2016-04-02 20:00

1. I don't want to give Ken an ulcer.

2. The answer to my question would probably be obvious if I knew a little more about micro-electronics.

3. I seriously doubt that I am the first one to wonder about this, but I haven't seen it talked about.

4. There are so many ways to get things done with the Prop2 that adding very much more would seem to be gilding the lily.

We can't use a pin as both a smart pin and as a stupid pin at the same time. Analysis of any particular signal might take more than one kind of measurement. Jumping back and forth between logics can become an issue in time-sensitive analyses.

So...................

why can't Chip tap the input to each pin to provide an additional input line from each pin to each cog?

Thanks

rjo__ · 2016-04-25 16:51

I can't wait for Spin2... but then I'm wondering, do I really need it?
... or is there something I just don't understand about PNUT?

1. Does PNUT have any compiler options that are useful from the command line?

2. Here is one of my issues... I'm trying to build a PASM project basically to keep track of useful code snippets. It has gotten so large that it is practically useless to anyone but myself, and so I now find myself stripping out the good bits when I want to communicate or ask a question... assbackwards of my original intent.

So... 2b. is it possible to have PNUT load code from various .spin files or do they have to be compiled first? Does anyone have an example?

cgracey · 2016-04-25 18:31

You could use the 'file' directive in PASM to load pre-assembled data, but you would have to make something to extract the data, after assembly, from the .obj file that PNut outputs.

Once we get some higher-level language support, what you want will be simple and automatic. For now, we have just enough tool to test the design.

David Betz · 2016-04-25 19:11

Didn't Dave Hein write an assembler for P2 that he was using with his P2 C compiler project?
