Fast Bytecode Interpreter

1202122232426»

Comments

  • evanhevanh Posts: 4,046
    edited September 2 Vote Up0Vote Down
    Dave,
    Intriguing, so the FIFO stops being used when the code path enters CogExec, the FIFO is then being used for a buffered block write, then automatically refills with HubExec code when the RET is hit.

    There's probably a FIFO flush delay at this point but "That's so easy!" :) Now for the block copy version ... it'll need to chop it up into chunks like Chip has.
  • cgraceycgracey Posts: 7,835
    edited September 2 Vote Up0Vote Down
    One nice thing about SETQ+RDLONG/WRLONG is that the FIFO is not used. Data is streamed directly between the cog and the hub at one clock per long. FIFO status and mode are unchanged.
  • So many ways in the Prop2.

    I've been using Dave's comport code for my testing. It's simple bit-bashing code (presumably been like that since Prop2Hot days) but it saved me going through the trial and error effort myself. Had to hack it around a little to remove the GCC parts. I wanted all the printf() code too but that was just too hard to extract. So I settled for making my own itoa() that uses the putch() primitive.
  • cgraceycgracey Posts: 7,835
    edited September 2 Vote Up0Vote Down
    Dave Hein wrote: »
    Chip, that looks great. However, I wonder if you could get a speed improvement by separating xxxxFILL from xxxxMOVE. Also, because the P2 does unaligned reads and writes the BYTE and WORD operations could be almost as fast as the LONG operations. It just requires performing 0 to 3 BYTE accesses at the beginning, and then doing LONG accesses after that. This is the code I used for memset() in p2gcc, which is basically the same as BYTEFILL() in Spin.
    ' This code resides in cog RAM
    __LONGFILL
            wrfast  #0, r0
            rep     #1, r2
            wflong  r1
            ret
    
    // This code resides in hub RAM
    // ptr, val and num are in r0, r1 and r2
    void memset(void *ptr, int val, int num)
    {
        __asm__("       mov     r3, #3");
        __asm__("       and     r3, r2 wz");
        __asm__(" if_z  jmp     #label1");
        __asm__("label2 wrbyte  r1, r0");
        __asm__("       add     r0, #1");
        __asm__("       djnz    r3, #label2");
        __asm__("label1 setnib  r1, r1, #1");
        __asm__("       setbyte r1, r1, #1");
        __asm__("       setword r1, r1, #1");
        __asm__("       shr     r2, #2 wz");
        __asm__(" if_nz call    #\\__LONGFILL");
    }
    
    I could speed it up a bit by moving all the code to cog RAM, but p2gcc C code currently calls functions that only reside in hub RAM.

    Dave, there was a lot of overlap between the move and fill functions. So, I melded them together. I didn't want one full-length SETQ+WRLONG fill, since interrupt latency would have been compromised.

    If you don't care about interrupt latency, you can do one SETQ+WRLONG combo, using a constant for the WRLONG's D. It will fill at the rate of 1 long per clock.
  • I'll have to change it to use SETQ+WRLONG. Since it doesn't use the FIFO I should be able to do this from hub RAM and I won't need to use cog RAM. I guess if I need to worry about interrupt latency in the future I could break the SETQ+WRLONG into to smaller chunks.
  • Work on the interpreter is going slowly, but almost done. It is really challenging for me because there are so many subtle things that can be done to create optimal code, that it's enough to drive one nuts.

    We almost need some kind of compiler that can optimize variable placement and coordinate it with SETQ+RDLONG activity to get the big speed improvements possible.

    I've been thinking for years now that it would be really neat to have a tool where you can define some kind of outcome that you want, based on inputs, and the tool, using random instruction selection and simulation, comes up with really efficient solutions that would have been too difficult for a human to produce in as timely a way.
  • cgracey wrote: »
    Work on the interpreter is going slowly, but almost done. It is really challenging for me because there are so many subtle things that can be done to create optimal code, that it's enough to drive one nuts.

    We almost need some kind of compiler that can optimize variable placement and coordinate it with SETQ+RDLONG activity to get the big speed improvements possible.

    I've been thinking for years now that it would be really neat to have a tool where you can define some kind of outcome that you want, based on inputs, and the tool, using random instruction selection and simulation, comes up with really efficient solutions that would have been too difficult for a human to produce in as timely a way.

    So you mean sort of like a Place & Route in FPGA, only for Opcodes... ?
    Could be interesting to try, but likely to quickly become exponentially slower, as the task increases.

    Easier could be more-automated tools for your masked skip opcodes.
    ie fairly simple ones that re-create masks as lines are added and removed.
  • Chip, that's interesting...a Monte Carlo approach to coding.

    https://en.m.wikipedia.org/wiki/Monte_Carlo_method
    San Mateo, CA
  • cgraceycgracey Posts: 7,835
    edited September 7 Vote Up0Vote Down
    Yes, it seems almost like an FPGA compilation problem, but with a larger-grain and a more-random set of building blocks that cover a different set of functions. It's like chess, whereas FPGA's are more like Legos.

    Yesterday, I had the problem of needing to find if a register held a value that was between two other random register values. I can do it in six instructions, but there must be a simpler solution. I think it would be reasonable to have a compiler come up with a solution for that simple of a problem.
  • Jeff Haas wrote: »
    Chip, that's interesting...a Monte Carlo approach to coding.

    https://en.m.wikipedia.org/wiki/Monte_Carlo_method

    That didn't occur to me, at first, but that would be the best way to do it. You could eliminate bad combos right away with that approach. A full parametric sweep would be impossible.
  • Maybe the simplest way to make such a thing is to give the tool some code sequence that works, and have it try to find a shorter/faster sequence. That way, there's less abstraction needed on defining the goal. It just simulates the model while it tries to find a better one.
  • cgracey wrote: »
    Maybe the simplest way to make such a thing is to give the tool some code sequence that works, and have it try to find a shorter/faster sequence. That way, there's less abstraction needed on defining the goal. It just simulates the model while it tries to find a better one.

    Sounds like your moving toward genetic algorithms :-)
  • cgraceycgracey Posts: 7,835
    edited September 7 Vote Up0Vote Down
    The fastest way to build this and the fastest way to run it, is on the prop2, itself. No need for a simulator, then:

    1) generate new random input values
    2) run 'model' routine with generated random input values, capture outputs
    4) run 'test' routine with generated random input values, capture outputs
    5) compare outputs and iterate 'test' routine if mismatch
    6) repeat until n cycles with no mismatch
  • isn't that exactly what neuronal networks do?

    provide them with input-data and a goal output to train them?

    nothing new here

    Mike
    I am just another Code Monkey.

    A determined coder can write COBOL programs in any language. -- Author unknown.

    The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this post are to be interpreted as described in RFC 2119.
  • msrobots wrote: »
    isn't that exactly what neuronal networks do?
    Mike

    No, thats exactly what Genetic Algorithms do.

    "The genetic algorithm is a method for solving both constrained and unconstrained optimization problems that is based on natural selection, the process that drives biological evolution. The genetic algorithm repeatedly modifies a population of individual solutions."
  • cgracey wrote: »
    Yesterday, I had the problem of needing to find if a register held a value that was between two other random register values. I can do it in six instructions, but there must be a simpler solution. I think it would be reasonable to have a compiler come up with a solution for that simple of a problem.

    Hi Chip,
    I would like to show you a solution that I came up with to your above problem. I have tested it pretty thoroughly
    using Cluso's debugger, so I hope that it helps.
    InRange       cmps n,r1  wc,wz        
    IF_NC_AND_NZ  cmps r2,n  wc,wz            
    IF_NC         jmp #InRange_ret        
                  cmps r1,n  wc,wz        
    IF_NC_AND_NZ  cmps n,r2  wc,wz       
    InRange_ret   ret
    

    HydraHacker

  • cgracey wrote: »
    Yesterday, I had the problem of needing to find if a register held a value that was between two other random register values. I can do it in six instructions, but there must be a simpler solution. I think it would be reasonable to have a compiler come up with a solution for that simple of a problem.

    Hi Chip,
    I would like to show you a solution that I came up with to your above problem. I have tested it pretty thoroughly
    using Cluso's debugger, so I hope that it helps.
    InRange       cmps n,r1  wc,wz        
    IF_NC_AND_NZ  cmps r2,n  wc,wz            
    IF_NC         jmp #InRange_ret        
                  cmps r1,n  wc,wz        
    IF_NC_AND_NZ  cmps n,r2  wc,wz       
    InRange_ret   ret
    

    HydraHacker

    Neat! I was wondering if something like this was possible - checking the range in two directions. It hurt my head so bad that I couldn't get on top of it. Looks like you got there, though.

    Meanwhile, here is what I came up with (comments are first, to state the goal of the code):
    '
    '
    ' LOOKDOWN(target : ,,,value1..value2,,,)
    '
    '  if value1 <= value2
    '    delta = value2 - value1
    '    if value1 <= target <= value2
    '      result = - value1 + index + target
    '      branch to address
    '    else index += delta + 1
    '
    '  if value1 > value2
    '    delta = value1 - value2
    '    if value2 <= target <= value1
    '      result = value1 + index - target
    '      branch to address
    '    else index += delta + 1
    '
    '
    '  x		value2		x
    '  ptra[-1]	value1		t
    '  ptra[-2]	index		u
    '  ptra[-3]	target		v
    '  ptra[-4]	address		w
    '
    '
    lookdnr		setq	#3		'pop value1+index+target+address into t+u+v+w
    		rdlong	w,--ptra[4]
    
    		cmps	x,t	wc	'z = (value1 > value2)
    		modcz	0,_c	wz
    
      if_nz		cmps	v,t	wc	'if nz, nc = (value1 <= target <= value2)
      if_nz_and_nc	cmps	x,v	wc
    
      if_z		cmps	v,x	wc	'if z, nc = (value2 <= target <= value1)
      if_z_and_nc	cmps	t,v	wc
    
      if_nc		negnz	x,t		'if target within range,
      if_nc		add	x,u		'..get result on stack top
      if_nc		sumz	x,v
      if_nc		jmp	#casex		'..and branch
    
      if_nz		sub	x,t		'else, get new index on stack top
      if_z		subr	x,t
    		addx	x,u		'(c=1, results in +1)
    
      _ret_		add	ptra,#2*4	'..and unpop address+target
    

    This code gets the direction of the compare into Z, so that it can still be used after C indicates if we are in range, or not.
  • cgracey wrote: »
    cgracey wrote: »
    Yesterday, I had the problem of needing to find if a register held a value that was between two other random register values. I can do it in six instructions, but there must be a simpler solution. I think it would be reasonable to have a compiler come up with a solution for that simple of a problem.

    Hi Chip,
    I would like to show you a solution that I came up with to your above problem. I have tested it pretty thoroughly
    using Cluso's debugger, so I hope that it helps.
    InRange       cmps n,r1  wc,wz        
    IF_NC_AND_NZ  cmps r2,n  wc,wz            
    IF_NC         jmp #InRange_ret        
                  cmps r1,n  wc,wz        
    IF_NC_AND_NZ  cmps n,r2  wc,wz       
    InRange_ret   ret
    

    HydraHacker

    Neat! I was wondering if something like this was possible - checking the range in two directions. It hurt my head so bad that I couldn't get on top of it. Looks like you got there, though.

    Meanwhile, here is what I came up with (comments are first, to state the goal of the code):
    '
    '
    ' LOOKDOWN(target : ,,,value1..value2,,,)
    '
    '  if value1 <= value2
    '    delta = value2 - value1
    '    if value1 <= target <= value2
    '      result = - value1 + index + target
    '      branch to address
    '    else index += delta + 1
    '
    '  if value1 > value2
    '    delta = value1 - value2
    '    if value2 <= target <= value1
    '      result = value1 + index - target
    '      branch to address
    '    else index += delta + 1
    '
    '
    '  x		value2		x
    '  ptra[-1]	value1		t
    '  ptra[-2]	index		u
    '  ptra[-3]	target		v
    '  ptra[-4]	address		w
    '
    '
    lookdnr		setq	#3		'pop value1+index+target+address into t+u+v+w
    		rdlong	w,--ptra[4]
    
    		cmps	x,t	wc	'z = (value1 > value2)
    		modcz	0,_c	wz
    
      if_nz		cmps	v,t	wc	'if nz, nc = (value1 <= target <= value2)
      if_nz_and_nc	cmps	x,v	wc
    
      if_z		cmps	v,x	wc	'if z, nc = (value2 <= target <= value1)
      if_z_and_nc	cmps	t,v	wc
    
      if_nc		negnz	x,t		'if target within range,
      if_nc		add	x,u		'..get result on stack top
      if_nc		sumz	x,v
      if_nc		jmp	#casex		'..and branch
    
      if_nz		sub	x,t		'else, get new index on stack top
      if_z		subr	x,t
    		addx	x,u		'(c=1, results in +1)
    
      _ret_		add	ptra,#2*4	'..and unpop address+target
    

    This code gets the direction of the compare into Z, so that it can still be used after C indicates if we are in range, or not.

    Thanks chip for looking at my code, I really busted my brain! Trying to make a range check that works in two directions. I’ll add some comments soon to my code, I just have to figure out a good way to explain it. Chip your code really looks good and because you write such efficient code sometimes I have trouble understanding what you wrote :-)

    HydraHacker
Sign In or Register to comment.