What are the limits of in COG with C or BASIC? - Page 2 — Parallax Forums

What are the limits of in COG with C or BASIC?

Comments

  • jmgjmg Posts: 15,182
    edited 2015-04-09 14:30
    David Betz wrote: »
    If you give the COGC program a chunk of hub memory to use as a stack you don't have to worry about this. It is only necessary if you want to operate without a stack which is kind of unnatural for C anyway.

    The general premise/expectation behind COG mode is to run COG local, with PASM like speed.
    ie to use C as a high level assembler, and that will of course be somewhat constrained.
    The link mentions 2 calls deep ?
    What is the speed impact, when it does flip to use a Hub Stack ?
  • David BetzDavid Betz Posts: 14,516
    edited 2015-04-09 14:34
    jmg wrote: »
    What CLK speeds does that use ?
    I see 1MHz is widespread, and 3.4MHz is mentioned on some parts (eg Cypress FM24V10) not cheap, but could test 3.4MHz code.
    1MHz parts can likely be over-clocked with tuning of pullup resistors and code.
    Not sure how fast it is trying to go. Jazzed wrote the EEPROM external memory driver. I've attached it here in case you want to look over the code. It doesn't use waitcnt for timing but instead uses djnz loops.

    eeprom_xmem.spin
  • David BetzDavid Betz Posts: 14,516
    edited 2015-04-09 14:37
    jmg wrote: »
    The general premise/expectation behind COG mode is to run COG local, with PASM like speed.
    ie to use C as a high level assembler, and that will of course be somewhat constrained.
    The link mentions 2 calls deep ?
    What is the speed impact, when it does flip to use a Hub Stack ?
    Remember, PropGCC is built on GCC. The compiler itself doesn't have any Propeller-specific constructs. It just has a Propeller code generator. The GCC compiler and C itself pretty much assume a stack so any code that can get away without one is exceptional. You *can* generate that sort of code by being a bit careful about how you use nested functions but it isn't really that straightforward. It would, of course, be possible to write an entirely new compiler that is tuned to generate code for the COG but that would be a much bigger effort than writing a code generator for GCC. We didn't do that. Even Catalina didn't do that.
  • Dave HeinDave Hein Posts: 6,347
    edited 2015-04-09 14:48
    jmg wrote: »
    What is the speed impact, when it does flip to use a Hub Stack ?
    I just tried the fibo program in LMM and COG modes, and COG is about twice as fast as LMM. Of course, the fibo program uses recursive calling, so has to use the stack. I would expect a COG program to be at least 4 times faster than LMM as long as it doesn't use the stack or hub variables.
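The fibo source is not posted in this thread, so the following is only a minimal sketch of a recursive Fibonacci of the kind described, assuming a plain C function named fibo (a hypothetical stand-in for the benchmark). It shows why this test cannot avoid the stack: every call has to keep its own argument and return address alive while the two inner calls run.

```c
#include <stdint.h>

/* Hypothetical stand-in for the "fibo" benchmark mentioned above;
   the real benchmark source is not part of this thread. */
uint32_t fibo(uint32_t n)
{
    if (n < 2)
        return n;
    /* Each recursive call needs its own copy of n and its own return
       address, so the compiler has to spill them to a stack. In COG
       mode that stack lives in hub memory. */
    return fibo(n - 1) + fibo(n - 2);
}
```

Because almost all of the work here is call and return traffic, a benchmark like this mostly measures stack access rather than straight-line COG execution.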
  • jmgjmg Posts: 15,182
    edited 2015-04-09 14:59
    David Betz wrote: »
    Remember, PropGCC is built on GCC. The compiler itself doesn't have any Propeller-specific constructs. It just has a Propeller code generator. The GCC compiler and C itself pretty much assume a stack so any code that can get away without one is exceptional. You *can* generate that sort of code by being a bit careful about how you use nested functions but it isn't really that straightforward. It would, of course, be possible to write an entirely new compiler that is tuned to generate code for the COG but that would be a much bigger effort than writing a code generator for GCC. We didn't do that. Even Catalina didn't do that.

    It's impressive that PropGCC can generate COG-mode code.
    Does the use of the HUB-stack appear in reports, or can it generate a warning ?
  • David BetzDavid Betz Posts: 14,516
    edited 2015-04-09 15:02
    jmg wrote: »
    It's impressive that PropGCC can generate COG-mode code.
    Does the use of the HUB-stack appear in reports, or can it generate a warning ?
    COG mode is thanks to Eric. I didn't think it could be done!

    No, no warning or message of any kind is generated for stack usage. Stack usage is normal and expected in C.
  • jmgjmg Posts: 15,182
    edited 2015-04-09 15:03
    Dave Hein wrote: »
    I just tried the fibo program in LMM and COG modes, and COG is about twice as fast as LMM. Of course, the fibo program uses recursive calling, so has to use the stack. I would expect a COG program to be at least 4 times faster than LMM as long as it doesn't use the stack or hub variables.

    Sounds tolerable, what about the PropBASIC Frequency counter I linked in #27 as a more real-world example ?
    Can that fit into COG mode, (sans-stacks?), and produce similar size / speed code ?
  • jmgjmg Posts: 15,182
    edited 2015-04-09 15:05
    David Betz wrote: »
    COG mode is thanks to Eric. I didn't think it could be done!

    No, no warning or message of any kind is generated for stack usage. Stack usage is normal and expected in C.

    Yes, I was meaning more as a user option ?
    When they want COG mode and no stack (which is a special case), not having to manually double-check the ASM listing makes the code a lot more maintainable.
  • David BetzDavid Betz Posts: 14,516
    edited 2015-04-09 15:34
    jmg wrote: »
    Yes, I was meaning more as a user option ?
    When they want COG mode and no stack (which is a special case), not having to manually double-check the ASM listing makes the code a lot more maintainable.
    I suppose it would be possible to do that by setting a flag in the code generator every time a stack operation is generated. That is, if it is possible at that level to distinguish between stack operations and other hub accesses. I'm not sure how useful it would be though. It would basically just tell you that a stack was needed but not why.
  • jmgjmg Posts: 15,182
    edited 2015-04-09 15:42
    David Betz wrote: »
    I suppose it would be possible to do that by setting a flag in the code generator every time a stack operation is generated. That is, if it is possible at that level to distinguish between stack operations and other hub accesses. I'm not sure how useful it would be though. It would basically just tell you that a stack was needed but not why.
    Any warning is going to need additional investigation, it is the change in warning level that matters.

    More useful than a simple flag, could be a stack counter, that INCs on every code generator stack case.
    Reporting that would give a reference point for code maintainers ( & retains usefulness in >0 cases ).
  • davidsaundersdavidsaunders Posts: 1,559
    edited 2015-04-09 16:54
    WOW, a lot of good information here. I only expected a few quick answers.

    So as I had thought, the use of stack space is still a concern (and a big one with only 496 longs of memory to work with). And much of the code I have seen uses some macros (if I remember correctly the header file is something like prop.h), macros that look a lot like spin commands, though have different parameters.

    Just as a thought:
    Why doesn't someone concentrate on a subset of C and tune it for COG execution? Perhaps start with something very simple like CUCU, add a little bit to take care of redundant generated code (to save space), and have it generate native Propeller code directly. Then if you add a stack limit, most things should be fairly simple.

    I think a simple subset C compiler like that would be a perfect fit on the Propeller. That is just my view.
  • David BetzDavid Betz Posts: 14,516
    edited 2015-04-09 17:06
    So as I had thought, the use of stack space is still a concern (and a big one with only 496 longs of memory to work with).
    The stack, if one is needed, goes in hub memory not COG memory.
  • jmgjmg Posts: 15,182
    edited 2015-04-09 17:23
    Just as a thought:
    Why doesn't someone concentrate on a subset of C and tune it for COG execution? Perhaps start with something very simple like CUCU, add a little bit to take care of redundant generated code (to save space), and have it generate native Propeller code directly. Then if you add a stack limit, most things should be fairly simple.

    I think a simple subset C compiler like that would be a perfect fit on the Propeller. That is just my view.
    Things are pretty close to that already - the discussion is around how to make that 'simple subset C' a little more controllable and maintainable.

    The Stack use is code-optional, but currently any use is a little hidden from users, hence my suggestion of some reporting (eg counter?)

    I think a COG-Mode working example, that is real-world in nature (not fibo), would help demonstrate the issues here, and the PropBASIC freqCounter I linked seems a good reference.
    That has some maths, some control flow, some string work, & comms, and gives a useful result, and seems to fit easily entirely in a COG ( ~152 Longs) with the small number-string in HUB.
    It is small enough, that Prop GCC can be 2x the size and still fit to demonstrate COG-mode.
  • David BetzDavid Betz Posts: 14,516
    edited 2015-04-09 17:26
    Here is a COG mode example that works and uses no stack space. In fact, this is the driver I was working on when I wrote those notes about using COG mode and eliminating stack usage.

    i2c_driver.c
  • jmgjmg Posts: 15,182
    edited 2015-04-09 17:43
    David Betz wrote: »
    Here is a COG mode example that works and uses no stack space. In fact, this is the driver I was working on when I wrote those notes about using COG mode and eliminating stack usage.
    Can you include the ASM listing from that too ?

    Is it easy to then make it use a stack, and include the .C/ASM for that one too ?
  • David BetzDavid Betz Posts: 14,516
    edited 2015-04-09 17:44
    jmg wrote: »
    Can you include the ASM listing from that too ?

    Is it easy to then make it use a stack, and include the .C/ASM for that one too ?
    It will use a stack automatically if you call nested procedures that aren't declared _NAKED or _NATIVE or whatever.
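As a rough illustration of that point, here is a minimal sketch, assuming _NATIVE and _COGMEM are the attribute macros supplied by PropGCC's propeller.h and that __PROPELLER__ is predefined by the compiler. Both are assumptions on my part, and the function and variable names are hypothetical. On a host compiler the macros fall back to nothing so the sketch still builds.

```c
#include <stdint.h>

#ifdef __PROPELLER__
#include <propeller.h>   /* assumed to provide _NATIVE and _COGMEM */
#endif
#ifndef _NATIVE
#define _NATIVE          /* host fallback: plain function */
#endif
#ifndef _COGMEM
#define _COGMEM          /* host fallback: ordinary static storage */
#endif

/* Hypothetical variable kept in COG memory instead of hub memory. */
static _COGMEM uint32_t pin_mask;

/* Hypothetical leaf helpers. Marked _NATIVE, they are meant to be
   entered with a plain call/ret pair, so calling them from main
   should not require a hub stack. */
_NATIVE void set_mask(uint32_t pin)
{
    pin_mask = 1u << pin;
}

_NATIVE uint32_t get_mask(void)
{
    return pin_mask;
}
```

Dropping _NATIVE from a helper like this is the easy way to see the stack-using variant: the compiler then falls back to its normal calling convention.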
  • David BetzDavid Betz Posts: 14,516
    edited 2015-04-09 17:49
    jmg wrote: »
    Can you include the ASM listing from that too ?

    Is it easy to then make it use a stack, and include the .C/ASM for that one too ?
    Here is the assembly generated using -mcog -Os.
    .text
    	.balign	4
    _i2cStart
    	mov	r7, DIRA
    	andn	r7, _scl_mask
    	mov	DIRA, r7
    	mov	r7, DIRA
    	andn	r7, _sda_mask
    	mov	DIRA, r7
    	mov	r7, CNT
    	add	r7, _half_cycle
    	waitcnt	r7,#0
    	or	DIRA,_sda_mask
    	mov	r7, CNT
    	add	r7, _half_cycle
    	waitcnt	r7,#0
    	or	DIRA,_scl_mask
    	'native return
    _i2cStart_ret
    	ret
    	.balign	4
    _i2cSendByte
    	mov	r7, #9
    	'' loop_start register r7 level #1
    	jmp	#.L3
    .L6
    	mov	r6, DIRA
    	test	r0,#0x80 wz
    	IF_NE andn	r6, _sda_mask
    	IF_E  or	r6, _sda_mask
    	mov	DIRA, r6
    	mov	r6, CNT
    	add	r6, _half_cycle
    	waitcnt	r6,#0
    	and	DIRA,r5
    	mov	r6, CNT
    	add	r6, _half_cycle
    	waitcnt	r6,#0
    	mov	r6, DIRA
    	or	r6, _scl_mask
    	shl	r0, #1
    	mov	DIRA, r6
    	and	r0,#255
    .L3
    	mov	r5, _scl_mask
    	xor	r5,__MASK_FFFFFFFF
    	djnz	r7,#.L6
    	mov	r7, DIRA
    	andn	r7, _sda_mask
    	mov	DIRA, r7
    	mov	r7, CNT
    	add	r7, _half_cycle
    	waitcnt	r7,#0
    	and	DIRA,r5
    	mov	r7, INA
    	test	r7,_sda_mask wz
    	mov	r0, #0
    	mov	r7, CNT
    	muxnz	r0,#1
    	add	r7, _half_cycle
    	waitcnt	r7,#0
    	or	DIRA,_scl_mask
    	or	DIRA,_sda_mask
    	'native return
    _i2cSendByte_ret
    	ret
    	.balign	4
    _i2cReceiveByte
    	mov	r3, _sda_mask
    	mov	r7, DIRA
    	xor	r3,__MASK_FFFFFFFF
    	and	r7, r3
    	mov	DIRA, r7
    	mov	r6, #9
    	mov	r7, #0
    	'' loop_start register r6 level #1
    	jmp	#.L9
    .L10
    	mov	r5, CNT
    	add	r5, _half_cycle
    	waitcnt	r5,#0
    	and	DIRA,r4
    	mov	r5, INA
    	test	r5,_sda_mask wz
    	shl	r7, #1
    	mov	r5, #0
    	muxnz	r5,#1
    	and	r7, #254
    	or	r7, r5
    	mov	r5, CNT
    	add	r5, _half_cycle
    	waitcnt	r5,#0
    	or	DIRA,_scl_mask
    .L9
    	mov	r4, _scl_mask
    	xor	r4,__MASK_FFFFFFFF
    	djnz	r6,#.L10
    	mov	r6, DIRA
    	cmps	r0, #0 wz,wc
    	IF_NE or	r6, _sda_mask
    	IF_E  and	r6, r3
    	mov	DIRA, r6
    	mov	r6, CNT
    	add	r6, _half_cycle
    	waitcnt	r6,#0
    	and	DIRA,r4
    	mov	r6, CNT
    	add	r6, _half_cycle
    	waitcnt	r6,#0
    	or	DIRA,_scl_mask
    	or	DIRA,_sda_mask
    	mov	r0, r7
    	'native return
    _i2cReceiveByte_ret
    	ret
    	.balign	4
    _i2cStop
    	mov	r7, CNT
    	add	r7, _half_cycle
    	waitcnt	r7,#0
    	mov	r7, DIRA
    	andn	r7, _scl_mask
    	mov	DIRA, r7
    	mov	r7, DIRA
    	andn	r7, _sda_mask
    	mov	DIRA, r7
    	'native return
    _i2cStop_ret
    	ret
    	.balign	4
    	.global	_main
    _main
    	mov	r6, PAR
    	mov	r4, #1
    	mov	r3, r4
    	rdlong	r7, r6
    	shl	r3, r7
    	mov	r7, r6
    	add	r7, #4
    	mov	_scl_mask, r3
    	rdlong	r7, r7
    	shl	r4, r7
    	mov	r7, r6
    	add	r7, #8
    	add	r6, #12
    	mov	_sda_mask, r4
    	rdlong	r7, r7
    	shr	r7, #1
    	cmp	r7, #32 wz,wc
    	mov	_half_cycle, r7
    	IF_A  sub	r7, #32
    	IF_A  mov	_half_cycle, r7
    	mov	r7, #0
    	rdlong	r6, r6
    	mov	_mailbox, r6
    	wrlong	r7, r6
    	mov	r6, r3
    	mov	r5, DIRA
    	xor	r6,__MASK_FFFFFFFF
    	and	r5, r6
    	mov	DIRA, r5
    	mov	r7, r4
    	mov	r5, DIRA
    	xor	r7,__MASK_FFFFFFFF
    	and	r5, r7
    	mov	DIRA, r5
    	and	OUTA,r6
    	and	OUTA,r7
    .L30
    	mov	r6, _mailbox
    .L17
    	rdlong	r7, r6
    	cmps	r7, #0 wz,wc
    	IF_E 	jmp	#.L17
    	cmp	r7, #2 wz,wc
    	IF_B 	jmp	#.L31
    	cmp	r7, #3 wz,wc
    	IF_BE	jmp	#.L19
    	cmp	r7, #5 wz,wc
    	IF_A 	jmp	#.L31
    	jmp	#.L41
    .L19
    	mov	r5, r6
    	add	r5, #12
    	add	r6, #16
    	cmps	r7, #2 wz,wc
    	rdlong	lr, r5
    	rdlong	r14, r6
    	IF_NE	jmp	#.L38
    	call	#_i2cStart
    	mov	r7, _mailbox
    	add	r7, #8
    	rdlong	r0, r7
    	and	r0,#255
    	call	#_i2cSendByte
    	cmps	r0, #0 wz,wc
    	IF_NE mov	lr, #2
    	IF_NE	jmp	#.L18
    	jmp	#.L38
    .L25
    	rdbyte	r0, lr
    	call	#_i2cSendByte
    	cmps	r0, #0 wz,wc
    	IF_NE	jmp	#.L33
    	add	lr, #1
    	sub	r14, #1
    .L38
    	cmps	r14, #0 wz,wc
    	IF_NE	jmp	#.L25
    	mov	lr, #0
    	jmp	#.L24
    .L33
    	mov	lr, #3
    .L24
    	mov	r7, _mailbox
    	add	r7, #20
    	rdlong	r7, r7
    	cmps	r7, #0 wz,wc
    	IF_E 	jmp	#.L18
    	jmp	#.L40
    .L41
    	mov	r5, r6
    	add	r5, #12
    	add	r6, #16
    	cmps	r7, #4 wz,wc
    	rdlong	r14, r5
    	rdlong	lr, r6
    	IF_NE	jmp	#.L39
    	call	#_i2cStart
    	mov	r7, _mailbox
    	add	r7, #8
    	rdlong	r0, r7
    	and	r0,#255
    	call	#_i2cSendByte
    	cmps	r0, #0 wz,wc
    	IF_NE mov	lr, #4
    	IF_NE	jmp	#.L18
    	jmp	#.L39
    .L29
    	cmps	lr, #1 wz,wc
    	mov	r0, #0
    	muxnz	r0,#1
    	sub	lr, #1
    	call	#_i2cReceiveByte
    	wrbyte	r0, r14
    	add	r14, #1
    .L39
    	cmps	lr, #0 wz,wc
    	IF_NE	jmp	#.L29
    	mov	r7, _mailbox
    	add	r7, #20
    	rdlong	r7, r7
    	cmps	r7, #0 wz,wc
    	IF_E  mov	lr, #0
    	IF_E 	jmp	#.L18
    .L40
    	call	#_i2cStop
    	jmp	#.L18
    .L31
    	mov	lr, #1
    .L18
    	mov	r7, _mailbox
    	mov	r6, r7
    	add	r6, #4
    	wrlong	lr, r6
    	mov	r6, #0
    	wrlong	r6, r7
    	jmp	#.L30
    _scl_mask
    	long	0
    _sda_mask
    	long	0
    _half_cycle
    	long	0
    _mailbox
    	long	0
    
  • jmgjmg Posts: 15,182
    edited 2015-04-09 18:22
    David Betz wrote: »
    Here is the assembly generated using -mcog -Os.
    Thanks, interesting code generation,
    Comments :

    byte <<= 1;
    if (INA & sda_mask) byte++;

    should compile to smaller code than
    byte <<= 1;
    byte |= (INA & sda_mask) ? 1 : 0;

    and I notice that using byte-sized variables can give larger code than using the native Propeller word size, as the compiler adds a masking step.

    addit : The compiler knows about andn, so nicely does this (open drain)
    	mov	r6, DIRA                           ;         if (byte & 0x80)
    	test	r0,#0x80 wz                        ;             i2c_set_sda_high();
    	IF_NE andn	r6, _sda_mask              ;         else
    	IF_E  or	r6, _sda_mask              ;             i2c_set_sda_low();
    	mov	DIRA, r6                           ;        
    

    but in other places, it seems to forget that, and instead uses another register to load the mask, then flips the bits and uses and, instead of andn ?!
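To make the two bit-assembly forms discussed in this post easy to compare, here is a host-side sketch with the INA register passed in as an ordinary parameter (an assumption made so it can be compiled anywhere; the function names are hypothetical). It only shows that both forms compute the same value. Which one generates smaller COG code is the claim above and is not verified here.

```c
#include <stdint.h>

/* Suggested form: shift, then conditionally increment. */
uint32_t shift_in_if(uint32_t byte, uint32_t ina, uint32_t sda_mask)
{
    byte <<= 1;
    if (ina & sda_mask)
        byte++;
    return byte;
}

/* Alternative form: shift, then OR in a 0/1 value. */
uint32_t shift_in_or(uint32_t byte, uint32_t ina, uint32_t sda_mask)
{
    byte <<= 1;
    byte |= (ina & sda_mask) ? 1 : 0;
    return byte;
}
```

Both helpers use a 32-bit accumulator on purpose, following the remark above that byte-sized variables make the compiler add a masking step.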
  • davidsaundersdavidsaunders Posts: 1,559
    edited 2015-04-09 19:17
    Well a lot of good C examples.

    My question on BASIC still remains open. Compiled structured BASIC is a great language, and I see no reason not to use it. PropBASIC is great, though I would like to know of other options before I commit to a particular version of BASIC.
  • David BetzDavid Betz Posts: 14,516
    edited 2015-04-09 19:24
    Well a lot of good C examples.

    My question on BASIC still remains open. Compiled structured BASIC is a great language, and I see no reason not to use it. PropBASIC is great, though I would like to know of other options before I commit to a particular version of BASIC.
    I don't know of any other Basic for the Propeller that produces PASM or LMM code. The others are either straight interpreters like FemtoBasic or byte code compilers like my xbasic or ebasic.
  • davidsaundersdavidsaunders Posts: 1,559
    edited 2015-04-09 19:53
    David Betz wrote: »
    I don't know of any other Basic for the Propeller that produces PASM or LMM code. The others are either straight interpreters like FemtoBasic or byte code compilers like my xbasic or ebasic.
    Thank you for that information. It is too bad we do not have more BASIC compilers for the Propeller.
  • jmgjmg Posts: 15,182
    edited 2015-04-09 20:43
    Thank you for that information. It is too bad we do not have more BASIC compilers for the Propeller.
    PropBASIC is being added to the Propeller IDE, (after being somewhat ignored) so that should boost the usage and support of PropBASIC.
    Not sure why you'd want more than one ?
    I'd rather see a PropPASCAL, before a second PropBASIC.
  • potatoheadpotatohead Posts: 10,261
    edited 2015-04-09 20:59
    Seconded jmg.
  • davidsaundersdavidsaunders Posts: 1,559
    edited 2015-04-10 05:00
    jmg wrote: »
    PropBASIC is being added to the Propeller IDE, (after being somewhat ignored) so that should boost the usage and support of PropBASIC.
    Not sure why you'd want more than one ?
    I'd rather see a PropPASCAL, before a second PropBASIC.
    Just that having more than one helps push the use and development of what there is.
  • davidsaundersdavidsaunders Posts: 1,559
    edited 2015-04-10 05:31
    Though I do agree that it would be nice to see a PropPascal. And having PropBASIC with PropellerIDE would likely increase its usage.
  • Dave HeinDave Hein Posts: 6,347
    edited 2015-04-10 10:39
    jmg wrote: »
    Sounds tolerable, what about the PropBASIC Frequency counter I linked in #27 as a more real-world example ?
    Can that fit into COG mode, (sans-stacks?), and produce similar size / speed code ?
    I converted the PropBASIC frequency counter program to C, and it's in the attached zip file. FreqCounter3.c doesn't use any of the special cog attributes, and it uses the stack. FreqCounter3cog.c uses the _NAKED, _NATIVE and _COGMEM attributes, and does not use the stack. It does use hub memory for strings and constants.

    I use CTRB to generate a signal, which is measured by CTRA. You can change the frequency by changing the #define for FREQ. The signal pin number is defined by Signal.

    I also generated the assembly output, which is included in the zip file.
  • idbruceidbruce Posts: 6,197
    edited 2015-04-10 11:31
    Dave

    In your code, I see you disabled the standard serial driver. I would assume you did this to reduce program overhead. If so, what kind of reduction in size is there?
  • Dave HeinDave Hein Posts: 6,347
    edited 2015-04-10 11:33
    It didn't fit in the cog with the standard driver. It was something like 70 bytes too large.
  • idbruceidbruce Posts: 6,197
    edited 2015-04-10 11:46
    I see.... I should have paid closer attention to the discussion. :) , but cool anyhow, nice to see how it is done. Thanks.
  • jmgjmg Posts: 15,182
    edited 2015-04-10 13:26
    Just that having more than one helps push the use and development of what there is.
    Or, the opposite can occur as you can disperse the effort, and just confuse new users.
    ( rather like PropIDE and SimpleIDE is doing right now... )
    That said, I can see a place for ByteCode BASIC and compiled Basic, tho ideally they are compatible...