Register r14 is being used as the frame pointer, except it wasn't initialized with the value of ptra first.
The good news is that I've been able to build both binutils and gcc. Unfortunately, I just tried to update using the following commands and got an error.
git pull
git submodule update
The error was:
error: no such remote ref bdf8861e1312d64f3858f00e3290e2b499df12d4
Fetched in submodule path 'binutils-gdb', but it did not contain bdf8861e1312d64f3858f00e3290e2b499df12d4. Direct fetching of that commit failed.
It took my brain some time to grok exactly how ELF relocations are used by LD to link an executable, but it's finally working.
Here's a new demo of a simple cogexec boot snippet that sets up the clock, sets the stack pointer and jumps to _main.
boot code loaded to 0x0
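        .file "crti.S"
        .section ".boot"
_start:
// Program the oscillator and PLL settings while still running from RCFAST
// (the two low select bits of _clkmode are 0 at this point).
        hubset _clkmode
// Give the crystal and PLL time to settle before switching over.
        waitx ##200000
// Set the two low select bits to pick the PLL as the clock source.
        or _clkmode, #3
        hubset _clkmode
// Point the stack pointer at the reserved hub stack and jump to _main.
        mov ptra, #__stack
        jmp #_main
// PLL on div 0+1 mul 7+1 div 2 drive RCFAST
// (1<<24) + (0<<18) + (7<<8) + (0<<4) + (2<<2) + (0)
_clkmode:
        .long 0x1000708
        .section ".hubram"
__stack:
        .zero 4*100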
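The arithmetic in the _clkmode comment can be checked on a host machine. This is a minimal sketch that packs the same six fields with the same shifts as the comment; the function and parameter names are made up for illustration and are not taken from the P2 documentation or from this toolchain.

```cpp
#include <cassert>
#include <cstdint>

// Packs the fields exactly as written in the crti.S comment:
// (1<<24) + (0<<18) + (7<<8) + (0<<4) + (2<<2) + (0)
// All names here are hypothetical labels for those six terms.
constexpr std::uint32_t make_clkmode(std::uint32_t pll_enable,
                                     std::uint32_t div_minus_1,
                                     std::uint32_t mul_minus_1,
                                     std::uint32_t post_field,
                                     std::uint32_t osc_field,
                                     std::uint32_t select) {
    return (pll_enable << 24) | (div_minus_1 << 18) | (mul_minus_1 << 8) |
           (post_field << 4) | (osc_field << 2) | select;
}

// The value stored at _clkmode, with the select bits still 0.
static_assert(make_clkmode(1, 0, 7, 0, 2, 0) == 0x1000708,
              "matches the .long in crti.S");

// The value after "or _clkmode, #3" in the boot code.
static_assert((make_clkmode(1, 0, 7, 0, 2, 0) | 3u) == 0x100070B,
              "select bits set");
```

The two static_assert lines mirror the two hubset calls in the boot code: the first value is written before the waitx, the second after it.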
C++ in hubexec with some variables in COGRAM
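#define _COGRAM __attribute__((cogram))
#define _LUTRAM __attribute__((lutram))
#define _HUBRAM __attribute__((hubram))

// Globals placed in cog RAM through the cogram attribute.
unsigned int num _COGRAM = 3;
unsigned int period _COGRAM = 10000000;

// Stall for x clock cycles using the WAITX instruction.
void waitx(unsigned int x) {
    asm("waitx %[x]"
        :
        : [x] "r" (x));
}

// Pulse bit `pin` of port B low then high, `cnt` times.
void blink(unsigned int cnt, unsigned int pin) {
    unsigned int mask = 1u << pin;  // unsigned shift, since pin can be 31
    // Make the pin an output.
    asm("or dirb, %[mask]\n\t"
        :
        : [mask] "r" (mask));
    for (unsigned int i = 0; i < cnt; i++) {
        // Drive the pin low.
        asm("andn outb, %[mask]\n\t"
            :
            : [mask] "r" (mask));
        waitx(period);
        // Drive the pin high.
        asm("or outb, %[mask]\n\t"
            :
            : [mask] "r" (mask));
        waitx(period);
    }
}

int main() {
    unsigned int p = 24;
    // Walk through port B bits 24..31 forever.
    while (true) {
        blink(num, p);
        p++;
        if (p > 31)
            p = 24;
    }
}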
Build/run steps
propeller-elf-as crti.S -o crti.o
propeller-elf-g++ test.cpp -c -O2
propeller-elf-ld crti.o test.o -o test
/loadp2/build/loadp2 -CHIP -v -p /dev/ttyUSB0 test
Also here's the default linker script:
/* Default linker script, for normal executables */
/* Copyright (C) 2014-2019 Free Software Foundation, Inc.
   Copying and distribution of this script, with or without modification,
   are permitted in any medium without royalty provided the copyright
   notice and this notice are preserved. */
OUTPUT_FORMAT("elf32-littlepropeller")
OUTPUT_ARCH(propeller)
MEMORY
{
  cogram : org = 0, l = 4 * 0x1EF
  lutram : org = 0x800, l = 4 * 0x200
  hubram : org = 0x0, l = 512K
}
PHDRS
{
  headers PT_PHDR ;
  text PT_LOAD ;
}
SECTIONS
{
  . = SIZEOF_HEADERS ;
  .data :
  {
    *(.boot)
    *(.cogram)
    . = 0x1000 ;
    *(.hubram)
  } >hubram :text
}
This is great Brian! Have updated my tree and am re-compiling your latest gcc as I type and can already hear the fans spinning up so I know my Mac is hard at work in preparation for this.
You've now got an end-to-end toolchain for the P2 showing great signs of life. Keep at it!
Very cool work. I know that getting things this far takes a lot.
HUBSET police calling ... making sure you are aware of the important workaround, after the PLL is engaged, needed in any subsequent _hubset() primitive ... Heh, funnily, there isn't an actual example of this posted anywhere.
@evanh Yeah I noticed the glaring warning in the v33 docs about saving the last value used. In the above example that would be hub address 0x1c (or let the linker find it).
Is there actually a way to read the current value?
Nope, all the special purpose registers are read-only or write-only, with maybe the exception of OUT and DIR. There are a lot of them in the prop2, all the smartpins for example, so it does save transistors.
Eric and Ross will have nicely curated versions of the C primitive. Here's what I do in my dynamic setting routine after recalculating a new sysclock frequency and baud divider. No bounds checking, I know. "clk_mode" in this case is just a general cogRAM variable internal to my programs.
'adjust hardware to new XMUL sysclock frequency
andn clk_mode, #%11 'clear the two select bits to force RCFAST selection
hubset clk_mode '**IMPORTANT** Switches to RCFAST using known prior mode
mov clk_mode, xtalmul 'replace old with new ...
sub clk_mode, #1 'range 1-1024
shl clk_mode, #8
or clk_mode, ##(1<<24 + (XDIV-1)<<18 + XPPPP<<4 + XOSC<<2)
hubset clk_mode 'setup PLL mode for new frequency (still operating at RCFAST)
or clk_mode, #XSEL 'add PLL as the clock source select
waitx ##22_000_000/100 '~10ms (at RCFAST) for PLL to stabilise
hubset clk_mode 'engage! Switch back to newly set PLL
ret wcz
Don't know about Ross but Eric has followed Dave's lead and used the preassigned mailbox at hub address $18 instead. The mailbox is important if you are passing a preset PLL mode between programs. If you can count on RCFAST as the handover between programs then the mailbox mechanism is optional. Further reading - https://forums.parallax.com/discussion/comment/1475472/#Comment_1475472
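The order of the three hubset writes in the routine above is the important part, and it can be sketched on a host. This is only an illustration of that ordering: the function name and its parameters are hypothetical, and the example values reuse the 0x1000708 mode word from the boot code earlier in the thread.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Returns the three mode words written with hubset, in order, when moving
// from old_mode to a new PLL setting. new_cfg is the new mode word; select
// is the two-bit clock source field (3 is what the boot code ORs in).
// Names are illustrative only.
constexpr std::array<std::uint32_t, 3> clock_switch_sequence(
    std::uint32_t old_mode, std::uint32_t new_cfg, std::uint32_t select) {
    return {
        // 1. Clear the select bits of the known prior mode: back to RCFAST.
        old_mode & ~std::uint32_t{3},
        // 2. Program the new PLL settings while still running on RCFAST.
        new_cfg & ~std::uint32_t{3},
        // 3. After the settle delay, add the select bits to engage the PLL.
        (new_cfg & ~std::uint32_t{3}) | (select & 3u)
    };
}
```

Step 1 is the reason the previous mode has to be remembered somewhere: it is derived from the old value, which cannot be read back from the hardware.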
Very nice work! Parallax is lucky to have you working on this. Have you been in touch with Ken Gracey or Jeff Martin? I'm sure they will be excited about your progress.
@rogloh Oh boy, I've seen that bug before and I thought I'd fixed it. You're on the correct branch. I may have to re-write a small piece of the assembler to fix this. Weirdly it behaves on my machine.
@"David Betz" I haven't reached out but they're probably aware. This is mostly a hobby project for me, but it's fun to be able to share something useful.
Well, your hobby project is making great progress and will likely be quite useful to the P2 community as are Eric Smith's Fastspin efforts. Parallax is lucky to have some great people volunteering their time.
Whether you realize it or not, your "hobby project" is likely the best C++ compiler the P2 will ever have. Once PropGCC for P2 is at feature-parity with PropGCC for P1, I don't expect anyone will have the motivation necessary to port LLVM. And arguably, this may be the best C compiler for P2 as well (I'll certainly prefer it to FastSpin or Catalina, simply because it will integrate nicely with the rest of my tools) - but others may well disagree and I've no interest in finding out who is right.
Anyway, don't kid yourself. You are creating something that will be useful for the entire Parallax community for the next decade most likely. No pressure or anything.....
Having both Hubexec and an industry standard C/C++ compiler finally working together will be a very powerful combination for the P2 and will unlock a lot of potential.
Can someone help me understand what I'm doing wrong here? Here's a simple example that should alternate blinking LED 56 and 63 by calling functions using a stack at ptra = 0x1044 via calla/reta. Instead it just appears to repeatedly call "_blink56".
A few possibilities:
1. I'm misunderstanding how these instructions should work
2. The bits aren't being assembled correctly
blink.s
        .section ".boot"
_start:
        hubset _clkmode
        waitx ##200000
        or _clkmode, #3
        hubset _clkmode
        loc ptra, #_stack
        jmp #_main
// PLL on div 0+1 mul 7+1 div 2 drive RCFAST
// (1<<24) + (0<<18) + (7<<8) + (0<<4) + (2<<2) + (0)
_clkmode:
        .long 0x1000708
        .section ".cogram"
p56:
        .long 0x1000000
p63:
        .long 0x80000000
t:      .long 40000000
pin:
        .long 24
        .section ".hubram"
        .global _main
_blink56:
        andn outb, p56
        waitx t
        or outb, p56
        waitx t
        reta
_blink63:
        andn outb, p63
        waitx t
        or outb, p63
        waitx t
        reta
_main:
        or outb, p56
        or dirb, p56
        or outb, p63
        or dirb, p63
_loop:
        calla #_blink56
        calla #_blink63
        jmp #_loop
_stack:
        .zero 4*100
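For reference, the control flow that blink.s is expected to produce can be modelled on a host. This is a toy sketch, assuming calla pushes the address of the following instruction onto the stack and reta pops it and jumps there; the "addresses" are made-up labels, and nothing here models the assembler or the real instruction encodings.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Runs the _loop body `iterations` times and records which routine ran.
// Assumption: calla saves the next instruction's address, reta returns to it.
std::vector<std::string> run_loop(int iterations) {
    enum : std::uint32_t { CALL56 = 0, CALL63 = 1, JMP_LOOP = 2,
                           BLINK56 = 100, BLINK63 = 200 };
    std::vector<std::uint32_t> stack;  // stands in for the hub stack at _stack
    std::vector<std::string> trace;    // order in which the routines ran
    std::uint32_t pc = CALL56;
    int done = 0;
    while (done < iterations) {
        switch (pc) {
        case CALL56:                   // calla #_blink56
            stack.push_back(pc + 1);
            pc = BLINK56;
            break;
        case CALL63:                   // calla #_blink63
            stack.push_back(pc + 1);
            pc = BLINK63;
            break;
        case JMP_LOOP:                 // jmp #_loop
            pc = CALL56;
            ++done;
            break;
        case BLINK56:                  // body of _blink56, then reta
            trace.push_back("blink56");
            pc = stack.back();
            stack.pop_back();
            break;
        case BLINK63:                  // body of _blink63, then reta
            trace.push_back("blink63");
            pc = stack.back();
            stack.pop_back();
            break;
        default:
            return trace;
        }
    }
    return trace;
}
```

In this model the trace alternates blink56, blink63. A trace that only ever shows blink56 would mean the return address coming back from the first reta does not point at the second calla, which is consistent with either of the two possibilities listed above.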
@ntosme2 you may wish to use a disassembler that returns your binary back into assembly instructions to see what it did and make sure it did what you expected. A useful one has already been written by Dave Hein in his gcc work.
Cool. When you do get to that, I've found there is scope there to include P2 specific optimizations using setq burst transfers to/from the stack. Though best just get it working first, then try to optimize.
Ok I finally had some time to poke at this more. There were several bugs/typos in GAS that are fixed and it in most cases generates binaries that dump correctly with p2dump. Most common instructions are supported. I need to refactor GAS because I realize I over-complicated things a bit, at which point all instructions will be supported.
To get further with this I'd need some help from someone experienced with gcc's instruction constraint mapping. The documentation is written from the perspective of someone already familiar with the internals. As-is it seems to compile and link most things correctly, but to compile the full libgcc support libraries I'm missing certain instruction patterns.
I just pushed changes to move the conditional execution prefix/flag before the opcode, and enabled conditional execution in the general case. GAS wants to treat the '#' character as a comment and I've resorted to using '$' or '$$' to denote immediate operands.
The assembler may also still be missing a few instructions.
Anyone else up to the task to push this across the finish line?
Comments
I noticed this use of r14 too. I wonder if it makes sense if the other P2 pointer register PTRB could/should be used for doing the frame pointer? Then potentially any local hub variables/arguments could be accessed directly with these types of indexed/offset hub access instruction formats:
rdlong rxx, ptrb[8]
rdword ryy, ptrb[4]
wrbyte rzz, ptrb[-10]
etc
Initially I thought the SP (PTRA) register itself could also be used for this purpose, but its value changes during execution as data is pushed onto and popped off the stack, which changes all the required offsets. I'm not sure whether GCC compensates for that automatically (hopefully it could, though I don't think it would work easily if extra stack memory is ever allocated dynamically, with alloca for example). So using a separate frame pointer register is probably more standard, and may make good use of PTRB.
I'd also earlier thought PTRB might be good for pointers to structures, directly accessing the data members at known offsets. But perhaps it's easier to use it as a frame pointer, which could give good bang per buck for argument/local accesses on the stack, since it removes the need to construct separate pointers everywhere the stack frame data gets accessed (which is probably quite a lot in C, though structure access can still be fairly common too, depending on how things are coded).
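The point about offsets moving with the stack pointer can be shown with a small host-side model. This is a sketch only: StackModel and its members are hypothetical names, the stack is counted in longs rather than bytes, and it assumes a stack that grows upward as values are pushed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy stack: sp plays the role of PTRA, fp the role of PTRB.
struct StackModel {
    std::vector<std::uint32_t> hub;  // toy hub RAM, one element per long
    std::size_t sp = 0;              // next free slot, moves on push/pop
    std::size_t fp = 0;              // fixed for the lifetime of a frame

    explicit StackModel(std::size_t longs) : hub(longs, 0) {}

    void enter_frame() { fp = sp; }  // latch the frame base on function entry
    void push(std::uint32_t v) { hub.at(sp++) = v; }
    std::uint32_t pop() { return hub.at(--sp); }

    // Frame-relative load, in the spirit of "rdlong rxx, ptrb[index]".
    std::uint32_t load_fp(std::ptrdiff_t index) const {
        return hub.at(static_cast<std::size_t>(
            static_cast<std::ptrdiff_t>(fp) + index));
    }

    // Offset that the same slot would need if addressed from sp instead.
    std::ptrdiff_t offset_from_sp(std::ptrdiff_t fp_index) const {
        return static_cast<std::ptrdiff_t>(fp) + fp_index -
               static_cast<std::ptrdiff_t>(sp);
    }
};
```

After enter_frame() and two pushes, the first local sits at frame index 0 and at offset -2 from sp. One more push of a temporary leaves the frame index at 0 but moves the sp-relative offset to -3, which is the bookkeeping a compiler would have to track if it addressed locals through the stack pointer alone.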
That looks like someone rewrote the history of the submodule repository resulting in a broken commit reference.
The other way to update is to go into 'gcc' and 'binutils-gdb' and do:
GAS + LD can now output a runnable ELF image loadable by loadp2!
blink.s
steps to reproduce:
I updated (via git pull) and recompiled both gcc and binutils using your feature/propeller2 branch.
I can't replicate the results you have above. Are you sure it is checked into gitlab, or is it still something you are working on, or does it perhaps exist in some other branch I should use instead?
I get this result during compile on the code above:
I attached the same file converted to assembly code below (with the -S option to gcc) if that helps identify the issue.
The assembler doesn't seem to like it. Similar types of errors when assembling the crti.S file.
generated machine code
*** note the above dump is unfortunately in little-endian and kinda gross to parse; here's a python snippet I've been using
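A minimal sketch of that kind of helper, written in C++ rather than Python and not the poster's own snippet: it regroups a little-endian byte dump into 32-bit longs so each instruction word reads in its natural order. The function name is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Combines each group of four little-endian bytes into one 32-bit long.
// Trailing bytes that do not fill a whole long are ignored.
std::vector<std::uint32_t> le_bytes_to_longs(
    const std::vector<std::uint8_t>& bytes) {
    std::vector<std::uint32_t> longs;
    for (std::size_t i = 0; i + 4 <= bytes.size(); i += 4) {
        longs.push_back(static_cast<std::uint32_t>(bytes[i]) |
                        (static_cast<std::uint32_t>(bytes[i + 1]) << 8) |
                        (static_cast<std::uint32_t>(bytes[i + 2]) << 16) |
                        (static_cast<std::uint32_t>(bytes[i + 3]) << 24));
    }
    return longs;
}
```

For example, the bytes 08 07 00 01 come back as the single long 0x01000708, which is the _clkmode word from blink.s.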
https://github.com/davehein/p2gcc/tree/master/util
EDIT: Here's the disassembled output from p2dump
With GAS, ".zero 4" reserves four bytes just like ".long x", except the bytes are filled with zeros instead of a value you supply.
Now I have a much better grasp of gcc's handling of stack/frame pointers (ptra/ptrb in this case) and I fixed a few related lines of code.
The following will now compile and run in both -O0 and -O2 optimization with the newest commits in git.
crti.S
test.cpp - cycle through LEDs [56..63] repeatedly
compile/run:
@rogloh let me know if you're still seeing that assembler bug; I haven't been able to replicate it again