LMM2 - Propeller 2 LMM experiments (50-80 LMM2 MIPS @ 160MHz)

Sapieha · 2012-12-10 15:07

Hi Chip

For assembly programmers, little.

For people who write GCC, it can cause big problems.

cgracey wrote: »

I think having to set the QUAD window at a QUAD-aligned address is a big pain. Do you guys agree? It keeps the circuitry simple but causes pain for the programmer.

Ariba · 2012-12-10 15:08

cgracey wrote: »

Can any of you think of any reason why QUAD registers should be assignable to 4 separate addresses, without concern for alignment? Just curious.

This would be more flexible; for example, you could do something like this:

lmm  rdquad pc
ins1 nop
ins2 nop
     jmpd #lmm
ins3 nop
ins4 nop
     add pc,#16

Andy

cgracey · 2012-12-10 15:12

Ariba wrote: »
This would be more flexible; for example, you could do something like this:
lmm  rdquad pc
ins1 nop
ins2 nop
     jmpd #lmm
ins3 nop
ins4 nop
     add pc,#16
Andy

Yes, that's what I was wondering about. Let me see...

Bill Henning · 2012-12-10 15:13

Ariba wrote: »

This would be more flexible; for example, you could do something like this
Andy

I think this may work better:

lmm  rdquad pc
ins1 nop
ins2 nop
ins3 nop
     jmpd #lmm
ins4 nop
     add pc,#16

I don't mind such flexibility, as long as it does not slow things down...

Cluso99 · 2012-12-10 15:15

i just lost my post

so much for android!

i was posting a way i thought you could separate the first setquad & readquad from the subsequent readquad loop so that each part could be verified.

However i see chip has found a bug. Great work guys. much better to find this now. makes us happy chip released the emulation code.

i am expecting to join you with the fun around xmas although sneaking away from the family might not go down so well.

cgracey · 2012-12-10 15:20

cgracey wrote: »

Yes, that's what I was wondering about. Let me see...

It looks like it would be kind of messy at this point. I think I'll just make it so that there's one address that doesn't need to be quad-aligned.

I figured you could do what you were attempting above like this:

ins0    nop
ins1    nop
ins2    nop
ins3    nop
        jmpd    #ins0
        add     pc,#16
        nop
        rdquad  pc

Cluso99 · 2012-12-10 15:28

i can see it would be nice to be able to map the quads to any non-contiguous cog locations. however i would still be happy with the restriction that they are on a 00 boundary and/or 4 contiguous, depending on the complexity. obviously bill & andy have worked out a non-contiguous loop, which is very clever for performing the lmm, and this will be a major use in high level languages at least.

i will be using readquads for fast loading of binary overlays.

Sapieha · 2012-12-10 15:31

Hi Chip.

In my opinion, this is the most usable:

cgracey wrote: »
It looks like it would be kind of messy at this point. I think I'll just make it so that there's one address that doesn't need to be quad-aligned.

I figured you could do what you were attempting above like this:
ins0    nop
ins1    nop
ins2    nop
ins3    nop
        jmpd    #ins0
        add     pc,#16
        nop
        rdquad  pc

Bill Henning · 2012-12-10 16:18

Sounds good!

Could SETQUAD zero the cache registers? That way they would be guaranteed to execute as NOPs on the first pass through the loop.

In the code below, would the rdquad ever execute? I thought JMPD only had two delay slots - or is my memory playing tricks on me?

cgracey wrote: »
It looks like it would be kind of messy at this point. I think I'll just make it so that there's one address that doesn't need to be quad-aligned.

I figured you could do what you were attempting above like this:
ins0    nop
ins1    nop
ins2    nop
ins3    nop
        jmpd    #ins0
        add     pc,#16
        nop
        rdquad  pc

Ariba · 2012-12-10 16:18

cgracey wrote: »
It looks like it would be kind of messy at this point. I think I'll just make it so that there's one address that doesn't need to be quad-aligned.

I figured you could do what you were attempting above like this:
ins0    nop
ins1    nop
ins2    nop
ins3    nop
        jmpd    #ins0
        add     pc,#16
        nop
        rdquad  pc

I also have no problem if the one cog address must be quad-aligned. This is really a minor problem compared with the LMM code in hub RAM, which must be structured in quad packets. I can already see a lot of NOPs in GCC-generated code.

Andy
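To illustrate the "quad packets" point, here is a minimal Python sketch (not Propeller code) of the simplest form of padding: filling the tail of an instruction list with NOPs so that it divides into groups of four. The names `pad_to_quads` and `split_into_quads` are hypothetical, and a real compiler would presumably need to do more than this, which is where the extra NOPs Andy mentions would come from.

```python
NOP = "nop"  # placeholder mnemonic; real code would use the encoded instruction word


def pad_to_quads(instructions, quad_size=4):
    """Return a copy of the list padded with NOPs to a multiple of quad_size."""
    padding = (-len(instructions)) % quad_size
    return list(instructions) + [NOP] * padding


def split_into_quads(instructions, quad_size=4):
    """Split the padded instruction list into packets of quad_size entries."""
    padded = pad_to_quads(instructions, quad_size)
    return [padded[i:i + quad_size] for i in range(0, len(padded), quad_size)]


if __name__ == "__main__":
    for packet in split_into_quads(["mov", "add", "sub", "jmp", "cmp"]):
        print(packet)
```

In this model five instructions occupy two packets, and three of the eight slots are NOPs.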

Roy Eltham · 2012-12-10 16:21

If you choose to make setquad clear the quad registers, please make it optional. I can see value in not clearing them every time you call setquad.

Bill Henning wrote: »

Sounds good!

Could SETQUAD zero the cache registers? That way they would be guaranteed to execute as NOPs on the first pass through the loop.

In the code below, would the rdquad ever execute? I thought JMPD only had two delay slots - or is my memory playing tricks on me?

Ariba · 2012-12-10 16:23

Bill Henning wrote: »

Sounds good!

Could SETQUAD zero the cache registers? That way they would be guaranteed to execute as NOPs on the first pass through the loop.

In the code below, would the rdquad ever execute? I thought JMPD only had two delay slots - or is my memory playing tricks on me?

All delayed jumps have 3 delay slots. See also Chip's post #50 in this thread.
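To make the delay-slot question concrete, here is a rough Python model (not Propeller code) of Chip's loop. It assumes only that a delayed jump takes effect after a fixed number of further instructions have executed; the function name `run_delayed_loop` and the list `CHIP_LOOP` are hypothetical, and pipeline and hub timing are ignored.

```python
def run_delayed_loop(program, delay_slots, max_steps=16):
    """Trace the mnemonics executed when 'jmpd' lands after `delay_slots` more instructions."""
    if delay_slots < 1:
        raise ValueError("this model assumes at least one delay slot")
    labels = {label: i for i, (label, _) in enumerate(program) if label}
    trace = []
    pc = 0
    pending = None  # (slots_remaining, target_index) for a jump still in flight
    while pc < len(program) and len(trace) < max_steps:
        _, text = program[pc]
        trace.append(text.split()[0])
        pc += 1
        if pending is not None:
            # This instruction occupied one delay slot of the jump in flight.
            remaining, target = pending
            remaining -= 1
            if remaining == 0:
                pc, pending = target, None
            else:
                pending = (remaining, target)
        if text.startswith("jmpd"):
            # A jmpd inside another jump's delay slots is not modelled.
            pending = (delay_slots, labels[text.split("#")[1]])
    return trace


# Chip's loop from the post quoted above, as (label, instruction) pairs.
CHIP_LOOP = [
    ("ins0", "nop"),
    ("ins1", "nop"),
    ("ins2", "nop"),
    ("ins3", "nop"),
    (None, "jmpd #ins0"),
    (None, "add pc,#16"),
    (None, "nop"),
    (None, "rdquad pc"),
]

if __name__ == "__main__":
    print(run_delayed_loop(CHIP_LOOP, delay_slots=3)[:8])
    print("rdquad" in run_delayed_loop(CHIP_LOOP, delay_slots=2))
```

With three delay slots the trace includes the `rdquad` on every pass. With two, the jump lands before the `rdquad` is reached, which is the case Bill was worried about.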

cgracey · 2012-12-10 16:27

Bill Henning wrote: »

Sounds good!

Could SETQUAD zero the cache registers? That way they would be guaranteed to execute as NOPs on the first pass through the loop.

In the code below, would the rdquad ever execute? I thought JMPD only had two delay slots - or is my memory playing tricks on me?

There are three instruction slots that must be filled in a single-task JMPD scenario.

I've been thinking about some way to clear the QUADs, as it would make things tidy. On cog startup, they contain the data loaded into $1F4..$1F7.

Does anyone have any thoughts about SETQUAD clearing the registers? I have it set up now so that if you do a SETQUAD $1F8..$1FF the QUADs go into hiding. The hiding case might be a convenient time to clear the QUADs. Does anyone think you might want to do a SETQUAD to simply move them, without clearing them?

Bill Henning · 2012-12-10 16:31

Roy:

The reason I'd like the quad cache cleared at a SETQUAD is so that a tight RDQUAD LMM loop will not execute random code.

Andy:

Thanks! Jeez my memory is going... or is that lack of sleep... or lack of coffee...

Chip:

Thanks, I don't know why I had a memory error on that. I'd better check my ECC bits.

How about SETQUAD D wz meaning clear the cache, and without wz it does not clear it?

The reason I'd like it cleared is the problem you pointed out many messages ago - the LMM loop running random instructions on the first pass through the loop.
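The first-pass problem can be sketched with a small Python model (not Propeller code). It assumes a loop shaped like Chip's, where the four QUAD-mapped slots execute first and the RDQUAD at the end of each pass refills them, and it assumes an all-zero long executes as a NOP, as the posts imply. The function name `lmm_executed_words` and the sample values are made up for illustration.

```python
NOP = 0x00000000  # assumption: an all-zero long executes as a NOP


def lmm_executed_words(hub_quads, initial_quad_regs, clear_on_setquad):
    """List the words executed from the 4-long QUAD window, pass by pass."""
    window = [NOP] * 4 if clear_on_setquad else list(initial_quad_regs)
    executed = []
    for packet in hub_quads:
        executed.extend(window)  # ins0..ins3 run whatever the window holds now
        window = list(packet)    # the RDQUAD at the end of the pass refills it
    executed.extend(window)      # the last packet fetched runs on the final pass
    return executed


if __name__ == "__main__":
    stale = [0xDEADBEEF, 0x12345678, 0x0BADF00D, 0xCAFEBABE]  # made-up leftovers
    code = [[0x10, 0x11, 0x12, 0x13]]                         # made-up hub packet
    print([hex(w) for w in lmm_executed_words(code, stale, clear_on_setquad=False)])
    print([hex(w) for w in lmm_executed_words(code, stale, clear_on_setquad=True)])
```

Without clearing, the first four words executed are whatever was left in the QUAD registers. With clearing, they are NOPs and only the fetched packet does real work.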

Sapieha · 2012-12-10 16:35

Hi Chip.

I think this answers your last question.

On hide, clear; but if the QUADs are moved to another position in the cog, leave them as they are.

Roy Eltham wrote: »

If you choose to make setquad clear the quad registers, please make it optional. I can see value in not clearing them every time you call setquad.

Bill Henning · 2012-12-10 16:42

That does not work for what I need. I'd have to SETQUAD twice, once to hide it and clear it, and again to reposition it where I need it - every time the LMM2 loop is initiated.

Frankly, I am still trying to figure out a case where it is useful to SETQUAD and NOT clear the cache... I am trying to find a use for that.

Sapieha wrote: »

Hi Chip.

I think this answers your last question.

On hide, clear; but if the QUADs are moved to another position in the cog, leave them as they are.

Ariba · 2012-12-10 16:47

Bill Henning wrote: »

Thanks, I don't know why I had a memory error on that. I'd better check my ECC bits.

I think the Prop2PreliminaryFeatureList PDF speaks of 2 delay slots, so there's a good chance that your memory is still okay.

Andy

cgracey · 2012-12-10 16:48

Here's a new configuration file for the DE0_Nano which fixes the QUAD mapping bug:

Prop2_DEO_Nano_v2.zip

I'll make a DE2_115 and StratixIII version a little later. I want to address the QUAD clearing issue and get REPS/REPD so that it works by task, not in general.

Sapieha · 2012-12-10 16:52

Hi Bill.

The best case, as I see it, was your proposal.
It gives all the possibilities.

Mine is a second choice if the other things don't work.

Bill Henning wrote: »

Roy:

The reason I'd like the quad cache cleared at a SETQUAD is so that a tight RDQUAD LMM loop will not execute random code.

Andy:

Thanks! Jeez my memory is going... or is that lack of sleep... or lack of coffee...

Chip:

Thanks, I don't know why I had a memory error on that. I'd better check my ECC bits.

How about SETQUAD D wz meaning clear the cache, and without wz it does not clear it?

The reason I'd like it cleared is the problem you pointed out many messages ago - the LMM loop running random instructions on the first pass through the loop.

Roy Eltham · 2012-12-10 16:54

Yes, Bill, I understand that, but there are uses for the cog besides running LMM kernels.

Chip, I can't give you any real use examples right now, but I think it would be good if setquad could be called to move the quad mappings without clearing. So if it isn't difficult, I think clearing should be optional somehow. Is the NR/WR bit available?

Bill Henning wrote: »

Roy:

The reason I'd like the quad cache cleared at a SETQUAD is so that a tight RDQUAD LMM loop will not execute random code.

Sapieha · 2012-12-10 16:54

Hi Chip.

Thanks.

It would be good if REPx could work with tasks. That gives more flexibility.

cgracey wrote: »

Here's a new configuration file for the DE0_Nano which fixes the QUAD mapping bug:

Prop2_DEO_Nano_v2.zip

I'll make a DE2_115 and StratixIII version a little later. I want to address the QUAD clearing issue and get REPS/REPD so that it works by task, not in general.

Bill Henning · 2012-12-10 17:03

Thanks Chip!

Wifey is going to kill me... more Prop2 play time :-) :-) :-)

REPS/REPD working by task would be great.

cgracey wrote: »

Here's a new configuration file for the DE0_Nano which fixes the QUAD mapping bug:

Prop2_DEO_Nano_v2.zip

I'll make a DE2_115 and StratixIII version a little later. I want to address the QUAD clearing issue and get REPS/REPD so that it works by task, not in general.

cgracey · 2012-12-10 17:12

Roy Eltham wrote: »

Yes, Bill, I understand that, but there are uses for the cog besides running LMM kernels.

Chip, I can't give you any real use examples right now, but I think it would be good if setquad could be called to move the quad mappings without clearing. So if it isn't difficult, I think clearing should be optional somehow. Is the NR/WR bit available?

I agree, Roy. Integrating clearing somehow into SETQUAD would be nice. We've got the Z and C bits which I can cause not to do anything. Biggest problem is what do you call it? Seven characters, max, or you blow the tab stop. SETQUAD vs SETQUAZ or ZETQUAD?

Peter Jakacki · 2012-12-10 17:18

cgracey wrote: »

I agree, Roy. Integrating clearing somehow into SETQUAD would be nice. We've got the Z and C bits which I can cause not to do anything. Biggest problem is what do you call it? Seven characters, max, or you blow the tab stop. SETQUAD vs SETQUAZ or ZETQUAD?

This might sound dumb but can't we say SETQUAD for setting without clearing and the current SETQUAD would then be CLRQUAD?

Sapieha · 2012-12-10 17:19

Hi Chip.

Why not keep it simple?

SETQUAD D, wz
SETQUAD #n, wz

P.S. Or else a new directive:

rz

cgracey wrote: »

I agree, Roy. Integrating clearing somehow into SETQUAD would be nice. We've got the Z and C bits which I can cause not to do anything. Biggest problem is what do you call it? Seven characters, max, or you blow the tab stop. SETQUAD vs SETQUAZ or ZETQUAD?

Bill Henning · 2012-12-10 17:21

One of:

Does not clear:

SETQUAD D
SETQUAD #n

Does clear:

SETQUAD D, wz
SETQUAD #n, wz

or

SETQUADZ D, wz
SETQUADZ #n, wz

cgracey wrote: »

I agree, Roy. Integrating clearing somehow into SETQUAD would be nice. We've got the Z and C bits which I can cause not to do anything. Biggest problem is what do you call it? Seven characters, max, or you blow the tab stop. SETQUAD vs SETQUAZ or ZETQUAD?

Roy Eltham · 2012-12-10 17:35

SETQM, and SETQMZ

QM for Quad Mapping

Bill Henning · 2012-12-10 17:37

Hi Chip,

I just tried to re-program my Nano, and I got:

"Error: File Z:/2012/prop2/Prop2_DEO_Nano_v2/DE0_Nano_Prop2.jic is corrupted"

I downloaded the zip twice, and tried it about four times.

Help!

cgracey wrote: »

Here's a new configuration file for the DE0_Nano which fixes the QUAD mapping bug:

Prop2_DEO_Nano_v2.zip

I'll make a DE2_115 and StratixIII version a little later. I want to address the QUAD clearing issue and get REPS/REPD so that it works by task, not in general.

Sapieha · 2012-12-10 17:39

Hi Roy

I think it would be more flexible if it could be organized under the same mnemonic, adding only an extra directive:

wz, or else, if Chip makes a new one, Flush

Roy Eltham wrote: »

SETQM, and SETQMZ

QM for Quad Mapping

Sapieha · 2012-12-10 17:40

Hi Bill.

I didn't have any problems with reprogramming.

Bill Henning wrote: »

Hi Chip,

I just tried to re-program my Nano, and I got:

"Error: File Z:/2012/prop2/Prop2_DEO_Nano_v2/DE0_Nano_Prop2.jic is corrupted"

I downloaded the zip twice, and tried it about four times.

Help!
