Prop 1 Rev C - Feasible ??? - Page 2 — Parallax Forums


Comments

  • Ale Posts: 2,363
    edited 2011-07-05 08:11
    We don't even get a 32x32 multiplier but you want a 4 cycle divider...
  • Dave Hein Posts: 6,347
    edited 2011-07-05 09:08
    I was just responding to Leon's incorrect statement that a "Four cycle divide isn't possible". hinv is the one who suggested the four-cycle divider along with a four-cycle multiplier. Personally, I don't think a fast divider is worth the amount of silicon it would require. I believe Prop 2 will have a single-cycle 16x16 multiplier, which will be very useful. A 32-bit multiply could be implemented with four 16-bit multiplies, with the partial products shifted and added together. I believe the Prop 2 will also have multi-cycle macro instructions that will implement 32-bit multiplies and divides.
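
The "four 16-bit multiplies" idea above can be sketched in C. This is only an illustration of the arithmetic, not Prop 2 code: `mul16x16` is a hypothetical stand-in for a hardware 16x16 unsigned multiply with a 32-bit result, and `mul32x32` is an assumed name for the routine built on top of it.

```c
#include <stdint.h>

/* Hypothetical stand-in for a hardware 16x16 unsigned multiply
   that returns a 32-bit result. */
static uint32_t mul16x16(uint16_t a, uint16_t b)
{
    return (uint32_t)a * (uint32_t)b;
}

/* 32x32 -> 64-bit unsigned multiply built from four 16x16 partial
   products, each shifted to its place value and summed. */
static uint64_t mul32x32(uint32_t a, uint32_t b)
{
    uint16_t a_lo = (uint16_t)(a & 0xFFFFu);
    uint16_t a_hi = (uint16_t)(a >> 16);
    uint16_t b_lo = (uint16_t)(b & 0xFFFFu);
    uint16_t b_hi = (uint16_t)(b >> 16);

    uint64_t ll = mul16x16(a_lo, b_lo);   /* weight 2^0  */
    uint64_t lh = mul16x16(a_lo, b_hi);   /* weight 2^16 */
    uint64_t hl = mul16x16(a_hi, b_lo);   /* weight 2^16 */
    uint64_t hh = mul16x16(a_hi, b_hi);   /* weight 2^32 */

    return (hh << 32) + ((lh + hl) << 16) + ll;
}
```

If only the low 32 bits of the product are needed, the `hh` term drops out entirely, so three 16x16 multiplies are enough.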
  • Roy Eltham Posts: 3,000
    edited 2011-07-05 12:17
    Ale wrote: »
    We don't even get a 32x32 multiplier but you want a 4 cycle divider...

    Who said we don't get a 32x32 multiply? ;)
  • Mike Green Posts: 23,101
    edited 2011-07-05 12:27
    Roy,
    Last I remember, it was Chip who said we would have a 16x16 unsigned multiply with a 32-bit result.
  • Roy Eltham Posts: 3,000
    edited 2011-07-05 14:03
    Mike Green,
    There is indeed a 16x16 multiply with a 32-bit result. That doesn't exclude a 32x32 multiply with a 64-bit result. Nor does it exclude a 32-bit divide. :)
  • Leon Posts: 7,620
    edited 2011-07-05 14:27
    I've just been playing with some assembler maths stuff on the PIC32 (MIPS32 core) using the MPLAB simulator. A 32-bit multiply with a 64-bit result takes one clock, as I expected, but a 32-bit divide (32-bit quotient and 32-bit remainder) takes 15 clocks. Here is my test code:
    #include <p32xxxx.h>

        .global main
        .data
    var1:   .word   2
    var2:   .word   8

        .text
        .ent    main
    main:
    loop:
        nop
        lw      $t1,var1        # $t1 = var1 (2)
        lw      $t2,var2        # $t2 = var2 (8)
        add     $t0,$t1,$t2     # $t0 = $t1 + $t2
        mult    $t1,$t2         # (Hi,Lo) = $t1 * $t2
        div     $t2,$t1         # Lo = $t2 / $t1, Hi = $t2 mod $t1
        nop
        j       loop            # repeat forever
        .end    main

    It took me a couple of hours to work out how to write a standalone MIPS32 assembly language program; everyone seems to use C! The assembly language is rather nice when one gets into it, but is a lot harder to use than Propeller assembler. We have yet to see what the equivalent operations on the Propeller II will look like, of course.

    In case anyone asks, mult and div place the results in a pair of dedicated Hi and Lo registers.
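
For anyone who reads C more easily than MIPS assembly, here is a rough model of what ends up in Hi and Lo. It is a sketch only: the names `hilo_t`, `model_multu` and `model_divu` are made up for illustration, and it models the unsigned forms of the instructions to keep the C simple, whereas the listing above uses the signed `mult` and `div` (the results agree for the small positive values used there).

```c
#include <stdint.h>

/* Hypothetical model of the dedicated Hi/Lo register pair. */
typedef struct {
    uint32_t hi;
    uint32_t lo;
} hilo_t;

/* Multiply: Hi gets the upper 32 bits of the 64-bit product,
   Lo gets the lower 32 bits. */
static hilo_t model_multu(uint32_t rs, uint32_t rt)
{
    uint64_t product = (uint64_t)rs * rt;
    hilo_t r = { (uint32_t)(product >> 32), (uint32_t)product };
    return r;
}

/* Divide: Lo gets the quotient, Hi gets the remainder.
   The caller must make sure rt is nonzero. */
static hilo_t model_divu(uint32_t rs, uint32_t rt)
{
    hilo_t r = { rs % rt, rs / rt };
    return r;
}
```

With the test values 2 and 8, the multiply model gives Hi = 0 and Lo = 16, and dividing 8 by 2 gives Lo = 4 and Hi = 0.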
  • Ale Posts: 2,363
    edited 2011-07-05 22:04
    MIPS assembly... mmm... it was always nice and clean and had delay slots :). 15 clocks for a 32/32 divide is rather good. Sadly we don't get 8 of them in one package :(
  • Leon Posts: 7,620
    edited 2011-07-06 00:44
    I thought it would be useful to have something for comparison.

    It's not actually as good as I thought; it takes 22 clocks in some circumstances, which is weird. I found the reason in a MIPS document: the divide uses an iterative algorithm, so the time it takes depends on the size of the operand (8/16/24/32 bits). The 5-stage pipeline complicates things, as well.

    Eight MIPS cores could go into an FPGA, of course.
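
To show why an iterative divide can take longer for bigger operands, here is a minimal shift-and-subtract sketch in C. It is not the actual PIC32 hardware algorithm, and the iteration counts are not clock counts; the assumption that the dividend's width is rounded up to whole bytes is only there to mirror the 8/16/24/32-bit steps mentioned above. The names `div_result_t` and `iter_divu` are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t quotient;
    uint32_t remainder;
    int      iterations;   /* loop passes, not hardware clocks */
} div_result_t;

/* Restoring (shift-and-subtract) unsigned divide that skips the
   leading zero bytes of the dividend, so it runs 8, 16, 24 or 32
   iterations depending on the dividend's size. */
static div_result_t iter_divu(uint32_t dividend, uint32_t divisor)
{
    assert(divisor != 0);

    /* Round the significant width of the dividend up to whole bytes. */
    int bits = 32;
    while (bits > 8 && (dividend >> (bits - 8)) == 0)
        bits -= 8;

    uint64_t rem = 0;      /* 64-bit so the left shift cannot overflow */
    uint32_t quo = 0;
    int iterations = 0;

    for (int i = bits - 1; i >= 0; i--) {
        rem = (rem << 1) | ((dividend >> i) & 1u);  /* bring down next bit */
        quo <<= 1;
        if (rem >= divisor) {                        /* subtract if it fits */
            rem -= divisor;
            quo |= 1u;
        }
        iterations++;
    }

    div_result_t r = { quo, (uint32_t)rem, iterations };
    return r;
}
```

In this sketch, dividing 8 by 2 finishes in 8 iterations, while a dividend that uses all 32 bits needs 32.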