You should use "mov ijmp1, ##_my_fancy_isr". The value of _my_fancy_isr is greater than 0x3ff since the hub code starts at 0x400.
EDIT: Each C file gets compiled into an object file starting at 0x400. The linker combines all the object files together, and relocates the object files so they fit in hub memory one after another.
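To make the relocation step concrete, here is a rough host-side sketch of what "one after another" could look like. The names object_t, place_objects and relocate are made up for illustration, and the 4-byte alignment is my assumption; this is not the actual p2gcc linker code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HUB_CODE_BASE 0x400u /* hub code starts here, per the post above */

/* Hypothetical description of one compiled object file. */
typedef struct {
    uint32_t size;      /* bytes of code and data in the object */
    uint32_t load_addr; /* filled in by place_objects() */
} object_t;

/* Lay the objects out back to back, starting at HUB_CODE_BASE.
 * Each object is rounded up to a 4-byte boundary (an assumption).
 * Returns the first free hub address after the last object. */
uint32_t place_objects(object_t *objs, size_t count)
{
    uint32_t next = HUB_CODE_BASE;
    assert(objs != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        objs[i].load_addr = next;
        next += (objs[i].size + 3u) & ~3u;
    }
    return next;
}

/* A symbol assembled at `addr`, in an object that assumed it started at
 * HUB_CODE_BASE, ends up here once the object has been moved. */
uint32_t relocate(const object_t *obj, uint32_t addr)
{
    assert(addr >= HUB_CODE_BASE);
    return addr - HUB_CODE_BASE + obj->load_addr;
}
```

With a 10-byte object followed by an 8-byte one, the second object would land at 0x40C under these assumptions, and a symbol it assembled at 0x404 would move to 0x410. Either way the final address is above 0x1ff, which is why the ## form is needed.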
So #asdf refers to a symbol "asdf" in cog memory and ##asdf refers to a symbol in hub memory? This would be part of the Spin2/PASM language standard then, is that the right vocabulary?
No, #asdf does not refer to cog memory. The "mov ijmp1, #_my_fancy_isr" instruction generates a single instruction that loads the 9 LSBs of _my_fancy_isr into the ijmp1 register. "mov ijmp1, ##_my_fancy_isr" actually generates 2 instructions that load the 32-bit value of _my_fancy_isr into ijmp1. The first instruction is an AUGS that gets the 23 MSBs of _my_fancy_isr, and the second instruction is essentially "mov ijmp1, #_my_fancy_isr & 511". These two instructions work together to load the full 32 bits of _my_fancy_isr into ijmp1.
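The 23/9 split described above is easy to check on a PC. This is only an arithmetic sketch in plain C; long_imm_t and the three function names are invented for the example and have nothing to do with the real assembler sources.

```c
#include <assert.h>
#include <stdint.h>

/* The two pieces that "mov reg, ##value" is split into. */
typedef struct {
    uint32_t augs_bits; /* upper 23 bits, carried by the AUGS prefix */
    uint32_t imm9;      /* lower 9 bits, carried by the mov itself */
} long_imm_t;

/* Split a 32-bit value the way the ## form does. */
long_imm_t split_immediate(uint32_t value)
{
    long_imm_t parts;
    parts.augs_bits = value >> 9;
    parts.imm9 = value & 511u;
    return parts;
}

/* Recombine the pieces, as the AUGS + mov pair does at run time. */
uint32_t join_immediate(long_imm_t parts)
{
    assert(parts.imm9 <= 511u);
    return (parts.augs_bits << 9) | parts.imm9;
}

/* What a plain "#value" can carry: only the low 9 bits. */
uint32_t short_immediate(uint32_t value)
{
    return value & 511u;
}
```

For an ISR sitting right at 0x400, the low 9 bits are all zero, so the single-instruction form would point ijmp1 at address 0 instead of the handler.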
My whole understanding of C/C++ compilers is getting thrown for a loop. I thought, upon entering a function, all the local registers were pushed to the top of the stack and before returning they were popped off again, allowing the calling function to have its state saved and restored. But I'm not seeing any of that in the code generated from PropGCC. How does PropGCC avoid clobbering local variables when it calls a function???
Dave Hein and David Zemon, thank you for addressing the subject of interrupts. This is currently appreciated and will be much more appreciated by me in the future.
The first 6 arguments are placed in registers r0 to r5. If there are more than 6 arguments the rest of the arguments are put on the stack. This makes function calling more efficient.
For functions with a variable number of arguments, the extra arguments plus the one previous to them are all put on the stack. As an example, for printf, all of the arguments are on the stack. For sprintf, the first argument is in r0, and the rest are on the stack.
r0 is used to return a single 32-bit value. r0 and r1 are used to return a 64-bit value, such as double.
If a function uses a pointer to an argument it must copy the argument to memory, and use the memory copy instead of the value in the register.
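The argument rules above can be written down as a small lookup. This is just my reading of the description, not code from PropGCC; arg_location, MAX_REG_ARGS and ON_STACK are names I made up.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_REG_ARGS 6 /* r0..r5 */
#define ON_STACK (-1)

/* Returns the register number (0..5) holding argument `index`, or ON_STACK.
 * `named` is the number of named parameters. For a variadic function the
 * last named parameter and everything after it go on the stack. */
int arg_location(int index, int named, bool variadic)
{
    assert(index >= 0 && named >= 0);
    assert(!variadic || named >= 1);

    /* Number of leading arguments that are eligible for registers. */
    int reg_args = variadic ? named - 1 : named;
    if (reg_args > MAX_REG_ARGS)
        reg_args = MAX_REG_ARGS;

    return (index < reg_args) ? index : ON_STACK;
}
```

So printf (one named parameter, variadic) gets nothing in registers, sprintf (two named parameters) gets only its buffer pointer in r0, and an ordinary 8-argument function gets r0 to r5 plus two stack slots.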
Also, PropGCC uses 15 general purpose registers called r0 to r14. There are 2 special purpose registers called sp and lr, which are the stack pointer and link register. The link register holds the return address, and a function saves it if it needs to call another function. Of the registers r0 through r14, some can be modified, and some of them must be preserved. I'm pretty sure r0 through r5 don't need to be preserved. I don't know if there are any others that can be changed without restoring their original values.
So we have the scenario of func1(...) calling func2(...). Am I correct in understanding that, prior to invoking func2(), PropGCC will ensure that none of r0-r14 are being used for anything other than arguments into func2? PropGCC is going to ensure any local variables that it created and stored in r0-r14 which were needed before func2 and will be needed again after func2 have been saved onto the stack for safe keeping?
And that does seem to confirm that we both need and should be able to push all the registers onto the stack at the beginning of an ISR and pop them off when we're done. Alternatively, we could "switch register banks." That might be an interesting thought exercise: is it feasible for p2gcc to implement switchable register banks on top of PropGCC and, for the future, how hard is it to implement switchable register banks in PropGCC itself? Seems much more complicated, but it would save a lot of CPU time not having to push/pop for each ISR.
Right, a quick run down on calling conventions in high level languages and compiling for them:
Typically the code generator for a particular target architecture will assign the machine registers into several
disjoint sets, caller-save, callee-save and scratch/temporary. When function a() calls function b(), a is the
caller, b is the callee.
The protocol is that a procedure/function's code has to save and restore the callee-save registers it touches,
they are strictly property of the calling function.
Caller-save registers are assumed to always be trashed by procedure/function calls (unless global dataflow
analysis indicates otherwise), so must be saved/restored by the caller across calls (if live at that point).
A live register is one whose value is needed later in at least one execution path.
Temporary/scratch registers are for use within a code fragment, they never need to be saved and are assumed
trashed by any callee. Special purpose machine registers and flags are typical examples, and sometimes a
couple of general purpose registers are set aside for complex fragments.
Interrupt handlers have a different calling convention in which all registers (and flags) are callee-save.
You could make all registers caller-save, or you could make all callee-save, but neither performs as well
as using some of each - what you are striving for is that the leaf functions and the next level up (which is
where code spends most of its time) are typically avoiding unnecessary stack-traffic - each caller-save
register is free for the leaf function to use, each callee-save is free for the next level up functions to use...
Higher up in the call levels you get less benefit, but leaf calls are usually where a program spends most time.
When one language can call into another (like C to assembler), the calling conventions may differ, and such
"foreign" calls require extra code fragments to convert between conventions.
There is of course more to it than this, as usually a set of the caller-save/temp registers are also the argument/return registers, and many architectures have complicated rules about which registers arguments go into
when there is a mix of types and sizes, and when the number of arguments means the stack is also involved
in passing arguments.
Heavily optimizing compilers can do more advanced things, like assign different calling conventions to
different functions, and inlining leaf functions automatically, based on analysis of the inner loops and
the call tree - however anything visible to the linker has to use the standard calling conventions.
Only r0-r5 are used for passing arguments, and as I said, I believe these registers do not need to be preserved. I know that the higher registers do need to be preserved, but I don't know which register this starts at. One way to determine this is to look at generated assembly code and see what it does. That's how I figured out the calling argument rules.
A C ISR needs to save all the registers on entry, and restore all the registers on exit. With the eggbeater architecture this can be done quite efficiently with the streaming instructions. It takes about the same number of cycles to store all 15 registers as it would to store 2 or 3 registers separately.
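As a way to picture "save everything on entry, restore everything on exit", here is a PC-side simulation. It is not P2 code and does not use the streaming instructions; regs, frames, isr_enter and isr_exit are hypothetical stand-ins, and the limit of 3 nesting levels is an assumption based on the three interrupts.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NUM_REGS 15   /* r0..r14 */
#define MAX_NESTING 3 /* assumed: one level per interrupt */

static uint32_t regs[NUM_REGS];                /* stand-in for r0..r14 */
static uint32_t frames[MAX_NESTING][NUM_REGS]; /* one save area per level */
static int depth = 0;                          /* frames currently in use */

/* Copy every register into the next free frame. */
void isr_enter(void)
{
    assert(depth < MAX_NESTING);
    memcpy(frames[depth], regs, sizeof regs);
    depth++;
}

/* Restore the registers saved by the matching isr_enter(). */
void isr_exit(void)
{
    assert(depth > 0);
    depth--;
    memcpy(regs, frames[depth], sizeof regs);
}
```

Giving each nesting level its own frame is what lets a second interrupt arrive while the first handler is still running without overwriting the first handler's saved copy.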
Makes sense that normal functions wouldn't need to save off r0-r5 - for my purposes, I only care about ISRs though. I trust PropGCC to do the right thing everywhere else and don't want to mess with it. But I do want to be able to use a high-level language for my ISRs.
In PropGCC, r0-r7 are scratch registers and are not saved across function calls. If a function only uses those registers then it won't have to do any pushes or pops. We also use r0-r5 for passing arguments and for returning values.
The definitions for all of these things are in the gcc sources, gcc/gcc/config/propeller.h.
Made some progress this evening. I figured out that I can't just put my __PUSH_FRAME and __POP_FRAME functions at the end of prefix.spin2 (right before the "orgh $400" line). I'm thinking that's because my application code is overwriting those functions at link time (or something???). So I stuck a "org 400" right above the "__PUSH_FRAME" label and at least now the behavior is consistent whether I put my ISR definition at the top or bottom of blinky.c. Unfortunately, that "consistent" behavior is not the correct behavior. All my application is running perfectly except the ISR... it's as if the ISR is being skipped entirely. It's almost as if... the "ret" instruction in __PUSH_FRAME is returning from the ISR rather than returning back to the ISR. So that led me to try and insert the push/pop functions inline in the ISR in blinky.c. Of course... that quickly failed because the mov instructions were being rewritten as wrlong/rdlong. Oh well.... to p2asm I go!
So I combined prefix.spin with some hand-written code and it's working just fine. One key difference is that I removed the "org $400" from between prefix.spin2 and my code. Without that, I was getting a bunch of "illegal literals," which I suppose makes sense.
But this was at least enough to confirm that call/ret is allowed in ISR blocks. I was starting to worry that I missed something, and that the "ret" instruction in __PUSH_FRAME was going all the way back to the main code rather than back to the ISR.
Anyway, my sadly broken code is pushed to git. Full commit here, blinky.c with the ISR here, working test_int.p2asm here, and I've attached my modified prefix.spin2 to the post.
Dave,
loadp2 is still not right for me. I've got it working reliably by outright replacing the call to findp2() with serial_init().
Testing has me a little confused for the moment. There was always an intermittent "Could not find a P2" error, even with the ES silicon, but when changing back to testing on the FPGA it has some sort of resetting after load problem again. But not consistently, sometimes it works fine.
That is with findp2(). With just serial_init() it seems to be 100% reliable on both chips.
Here's the actual change I've made:
/*
    // Find a P2 on one of the serial ports, or on the specified port
    if (!findp2(PORT_PREFIX, LOADER_BAUD, port))
    {
        printf("Could not find a P2\n");
        exit(1);
    }
*/
    if (1 != serial_init(port, LOADER_BAUD))
    {
        printf("Could not open port %s\n", argv[1]);
        exit(1);
    }
The change is just a copy'n'paste from an older version of loadp2.c. I know it's not suitable as a general solution but just to let you know the area of problem.
I have the DE2-115 and the P2 Eval board. My development machines are a Windows 10 box and a Linux box running Xubuntu.
I suppose I could add another flag that avoids findp2, and just does a serial_init without testing for a P2 chip. I'm away from home right now, so I won't be able to get to this until Saturday.
I'm flat out of ideas. I've spent another evening trying to find a way to push/pop all the SPRs while in an interrupt without using any of the SPRs and everything I've tried has failed. Every time I try to modify prefix.spin2, the app behaves "strangely." I'm guessing a hardcoded address somewhere and code is getting clobbered? And nothing I've tried in C has produced the desired .spin2 file. The closest I've come is this mess:
which produces an extra rdlong instruction for each mv and, even if I didn't care about the wasted instructions, it's using one of the registers that I'm trying to prevent from being clobbered
Umm, my advice is don't try to. ISRs in the propeller, like "objects" in the Obex, should be integral to the program running in that cog. Each use is custom to suit the job. Mostly that means everything is global within that cog.
Conventions for utilisation of hubRAM might be a good idea though.
I have a question: why doesn't the P2 use DFU mode to load code? This seems to be a standard used by a number of chip makers to load code into memory.
This is why the P2 doesn't use DFU mode to load code.
From Wikipedia:
DFU or Device Firmware Upgrade mode allows all devices to be restored from any state. It is essentially a mode where the BootROM can accept iBSS. DFU is part of the SecureROM which is burned into the hardware, so it cannot be removed. On A7+ devices, it generates an ApNonce and recognizes APTickets as well, so even in DFU, it can accept an APTicket.
Comments
This works great and blinks p58 at 10 Hz while 56 and 57 blink at 1 Hz.
CON
  LOOP_COUNT = 10

dat
            org
            hubset  #0
init        mov     ijmp1, #my_fancy_isr
blink       waitx   ##1_000_000
            drvnot  #58
            djnz    index, #blink
            trgint1
            drvnot  #57
            mov     index, #LOOP_COUNT
            jmp     #blink

my_fancy_isr
            drvnot  #56
            reti1

index       long    LOOP_COUNT
This is the code I would like to write in C
#include "common.h"
#include <propeller.h>
#include <stdio.h>

static const uint32_t XI = 20000000;
static const uint32_t INPUT_DIVIDER = 1;
static const uint32_t VCO_MULTIPLIER = 4;
static const uint32_t FINAL_DIVIDER = 1;

uint32_t CLOCK_FREQ;

ISR(my_fancy_isr) {
    io_asm(drvnot, 56);
    interrupt_return(1);
}

void do_magic () {
    int32_t i;
    while (1) {
        i = 20;
        do {
            drive_invert(58);
            waitx(CLOCK_FREQ / 20);
        } while (--i);
        drive_invert(57);
        trigger_interrupt(1);
    }
}

void main () {
    int i;

    // Wait for things to start
    set_clock_rcfast();
    waitx(RCFAST_FREQ);

    CLOCK_FREQ = compute_clock(XI, INPUT_DIVIDER, VCO_MULTIPLIER, FINAL_DIVIDER);
    const error_t err = set_clock_pll(INPUT_DIVIDER, VCO_MULTIPLIER, FINAL_DIVIDER);
    if (err) {
        printf("Error! %d\n", err);
    } else {
        printf("Running at %d Hz\n", CLOCK_FREQ);
        set_isr(1, my_fancy_isr);
        do_magic();
    }
}
But it's not working at all. For reference, here's the exact commit for both blinky.c as well as common.h, not included in this post.
At the bottom of the generated .spin2 file I see that my main code is getting launched into HUB RAM. Is that really the case?
'*******************************************************************************
' Program HUB Code
'*******************************************************************************
        orgh $400
        .text
        .balign 4
        .global _my_fancy_isr
_my_fancy_isr
' 6 "blinky.c" 1
        drvnot #56
' 0 "" 2
' 7 "blinky.c" 1
        reti1
' 0 "" 2
' Naked function: epilogue provided by programmer.
        .balign 4
        .global _main
_main
' 11 "blinky.c" 1
        mov ijmp1, #_my_fancy_isr
' 0 "" 2
        mov r7, #10
        mov r4, #58
        rdlong r5, ##.LC0
        mov r6, #57
.L7
' 241 "common.h" 1
        drvnot r4
' 0 "" 2
' 45 "common.h" 1
        waitx r5
' 0 "" 2
        djnz r7,#.L7
' 241 "common.h" 1
        drvnot r6
' 0 "" 2
' 20 "blinky.c" 1
        trgint1
' 0 "" 2
        mov r7, #10
        jmp #.L7
        .balign 4
.LC0    long 1000000
If that is truly the case, would that not prevent the ISR from working correctly? I do get this warning from p2asm:
0: ERROR: Immediate value must be between 0 and 511 mov ijmp1, #_my_fancy_isr
#include <propeller.h>

void __attribute__((noinline)) init_isr(void *ptr)
{
    __asm__("mov ijmp1, r0");
}

void __attribute__((noinline)) isr(void)
{
    __asm__("drvnot #57");
    __asm__("reti1");
}

void trigger_isr(void)
{
    __asm__("trgint1");
}

int main(void)
{
    init_isr(isr);
    while (1)
    {
        trigger_isr();
        waitcnt(40000000+CNT);
    }
    return 0;
}
I can only find V7
WOOHOO! Just had to change my set_isr macro and it's working
#define set_isr(interruptNumber, isr) __asm__ volatile ("mov ijmp" #interruptNumber ", ##_" #isr)
Wow. Impressive
For instance, here's a C function:
void uart_tx_isr2 (void) {
    if (uart2->head != uart2->tail) {
        set_smartpin_y_reg(uart2->pinNumber, *uart2->head);
        ++uart2->head;
        if (uart2->head == &uart2->buffer[UART_BUFFER_SIZE]) {
            uart2->head = &uart2->buffer[0];
        }
    }
    interrupt_return(2);
}
and here's the .s file from PropGCC
        .global _uart_tx_isr2
_uart_tx_isr2
        rdlong r7, .LC0
        mov r6, .LC1
        mov r5, r7
        mov r4, r7
        add r5, r6
        add r4, .LC2
        rdlong r5, r5
        rdlong r4, r4
        cmps r5, r4 wz,wc
  IF_E  jmp #.L2
        mov r4, r7
        add r4, #8
        rdlong r3, r5
        rdbyte r4, r4
' 384 "common.h" 1
        wypin r3, r4
' 0 "" 2
        add r5, #4
        add r6, r7
        cmps r5, r6 wz,wc
  IF_E  add r7, #12
        wrlong r5, r6
  IF_E  wrlong r7, r5
.L2
' 16 "uart.c" 1
        reti2
' 0 "" 2
        jmp lr
Nothing to do with pushing/popping the stack.
So my post is two questions:
1) How the heck does this work
and
2) Assuming it works by "magic" and said "magic" will break if an ISR is called in the middle of this function executing, can p2gcc wrap my ISR function logic with push/pop instructions? Maybe it could search for symbols starting with "_isr" and insert push instructions at the top. That seems "easy" enough. Then maybe p2gcc could find the "jmp lr" and replace it with pop instructions and a reti<X>? How to determine X though... maybe further restrictions on the function name, such as "function must start with isr_X" where "X" is a number 1-3?
One temporary option is that I write two macros which push & pop all the registers. This methodology would be a bit wasteful though, because in the C code I wouldn't know which registers actually need to be saved. p2gcc could, potentially, look at the function implementation and only save those registers which are going to be clobbered.
Upon return those registers are reloaded with new values.
No stack is required.
Mike
I haven't gotten far enough in my explorations to know how to do this yet. Right now, I'm trying to make the simple-but-ugly solution work, by sticking this at the end of prefix.spin2:
__PUSH_FRAME
        mov isr_bak_r0, r0
        mov isr_bak_r1, r1
        mov isr_bak_r2, r2
        mov isr_bak_r3, r3
        mov isr_bak_r4, r4
        mov isr_bak_r5, r5
        mov isr_bak_r6, r6
        mov isr_bak_r7, r7
        mov isr_bak_r8, r8
        mov isr_bak_r9, r9
        mov isr_bak_r10, r10
        mov isr_bak_r11, r11
        mov isr_bak_r12, r12
        mov isr_bak_r13, r13
        mov isr_bak_r14, r14
        mov isr_bak_sp, sp
        mov isr_bak_temp, temp
        mov isr_bak_temp1, temp1
        mov isr_bak_temp2, temp2
        ret
__POP_FRAME
        mov r0, isr_bak_r0
        mov r1, isr_bak_r1
        mov r2, isr_bak_r2
        mov r3, isr_bak_r3
        mov r4, isr_bak_r4
        mov r5, isr_bak_r5
        mov r6, isr_bak_r6
        mov r7, isr_bak_r7
        mov r8, isr_bak_r8
        mov r9, isr_bak_r9
        mov r10, isr_bak_r10
        mov r11, isr_bak_r11
        mov r12, isr_bak_r12
        mov r13, isr_bak_r13
        mov r14, isr_bak_r14
        mov sp, isr_bak_sp
        mov temp, isr_bak_temp
        mov temp1, isr_bak_temp1
        mov temp2, isr_bak_temp2
        ret
isr_bak_r0      long 0
isr_bak_r1      long 0
isr_bak_r2      long 0
isr_bak_r3      long 0
isr_bak_r4      long 0
isr_bak_r5      long 0
isr_bak_r6      long 0
isr_bak_r7      long 0
isr_bak_r8      long 0
isr_bak_r9      long 0
isr_bak_r10     long 0
isr_bak_r11     long 0
isr_bak_r12     long 0
isr_bak_r13     long 0
isr_bak_r14     long 0
isr_bak_sp      long 0
isr_bak_temp    long 0
isr_bak_temp1   long 0
isr_bak_temp2   long 0
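Of course... that's not gonna be good when the second interrupt fires in the middle of the first .... but that's a problem for another day (hopefully by the time I get that far, either I'll know how to use the eggbeater or you'll have come to my rescue and provided the necessary code).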
So I paired the above with a couple macros in my headers:
#define push_frame() __asm__ volatile ("call #__PUSH_FRAME")
#define pop_frame() __asm__ volatile ("call #__POP_FRAME")
and an ISR
ISR(my_fancy_isr) {
    push_frame();
    io_asm(drvnot, 56);
    pop_frame();
    interrupt_return(1);
}
Annnddd if I did everything right, that should work, right? Not that it's necessary for this ISR, but I'm starting here since I know it already works and then will move it to my serial object. But it's not working. I'll bang on this a bit more before I try to post a simple and complete example which demonstrates the problem.
So taking this and all the other comments above into consideration, is the following correct?
It is up to the calling function to ensure that r0-r7 are stashed away somewhere if it makes a call to another function but still needs those values. And it is up to the callee function to ensure r8-r14 are saved off and restored prior to modifying them, if it needs to modify them?
And, of course, an ISR needs to save off and restore any SPR that will be modified.
Yes, that is correct.
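For anyone who wants the confirmed split in one place, here it is as bitmasks, with bit n standing for rn. It is only a sketch built on the r0-r7 / r8-r14 division agreed above; the macro and function names are mine, and sp, lr and the temp registers are left out.

```c
#include <assert.h>
#include <stdint.h>

#define REG(n) (1u << (n))
#define CALLER_SAVED 0x00FFu /* r0..r7: scratch, may be trashed by any call */
#define CALLEE_SAVED 0x7F00u /* r8..r14: must be preserved by the callee */

/* Registers the caller has to stash before a call:
 * the scratch registers whose values it still needs afterwards. */
uint32_t caller_must_save(uint32_t live_across_call)
{
    return live_across_call & CALLER_SAVED;
}

/* Registers a normal function has to save on entry and restore on exit. */
uint32_t callee_must_save(uint32_t modified)
{
    return modified & CALLEE_SAVED;
}

/* An ISR interrupts arbitrary code, so everything it modifies must be
 * saved, whichever class the register belongs to. */
uint32_t isr_must_save(uint32_t modified)
{
    assert((modified & ~(CALLER_SAVED | CALLEE_SAVED)) == 0);
    return modified;
}
```

A handler that touches r3 and r9 would therefore need to save both, whereas an ordinary function touching the same two would only save r9.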
#define pop_frame() __asm__ volatile ( \
    " drvnot #60 \n\t" \
    " drvnot #61 \n\t" \
    " mv r0, %[_isr_bak_r0] \n\t" \
    " mv r1, %[_isr_bak_r1] \n\t" \
    " mv r2, %[_isr_bak_r2] \n\t" \
    " mv r3, %[_isr_bak_r3] \n\t" \
    " mv r4, %[_isr_bak_r4] \n\t" \
    " mv r5, %[_isr_bak_r5] \n\t" \
    " mv r6, %[_isr_bak_r6] \n\t" \
    " mv r7, %[_isr_bak_r7] \n\t" \
    " mv r8, %[_isr_bak_r8] \n\t" \
    " mv r9, %[_isr_bak_r9] \n\t" \
    " mv r10, %[_isr_bak_r10] \n\t" \
    " mv r11, %[_isr_bak_r11] \n\t" \
    " mv r12, %[_isr_bak_r12] \n\t" \
    " mv r13, %[_isr_bak_r13] \n\t" \
    " mv r14, %[_isr_bak_r14] \n\t" \
    " mv sp, %[_isr_bak_sp] \n\t" \
    " mv temp, %[_isr_bak_temp] \n\t" \
    " mv temp1, %[_isr_bak_temp1] \n\t" \
    " mv temp2, %[_isr_bak_temp2] \n\t" \
    : \
    :[_isr_bak_r0] "m" (isr_bak_r0), \
    [_isr_bak_r1] "m" (isr_bak_r1), \
    [_isr_bak_r2] "m" (isr_bak_r2), \
    [_isr_bak_r3] "m" (isr_bak_r3), \
    [_isr_bak_r4] "m" (isr_bak_r4), \
    [_isr_bak_r5] "m" (isr_bak_r5), \
    [_isr_bak_r6] "m" (isr_bak_r6), \
    [_isr_bak_r7] "m" (isr_bak_r7), \
    [_isr_bak_r8] "m" (isr_bak_r8), \
    [_isr_bak_r9] "m" (isr_bak_r9), \
    [_isr_bak_r10] "m" (isr_bak_r10), \
    [_isr_bak_r11] "m" (isr_bak_r11), \
    [_isr_bak_r12] "m" (isr_bak_r12), \
    [_isr_bak_r13] "m" (isr_bak_r13), \
    [_isr_bak_r14] "m" (isr_bak_r14), \
    [_isr_bak_sp] "m" (isr_bak_sp), \
    [_isr_bak_temp] "m" (isr_bak_temp), \
    [_isr_bak_temp1] "m" (isr_bak_temp1), \
    [_isr_bak_temp2] "m" (isr_bak_temp2) \
    )
which produces an extra rdlong instruction for each mv and, even if I didn't care about the wasted instructions, it's using one of the registers that I'm trying to prevent from being clobbered
        drvnot #60
        drvnot #61
        rdlong temp, ##.LC0
        mv r0, temp
        rdlong temp, ##.LC1
        mv r1, temp
        rdlong temp, ##.LC2
        mv r2, temp
        rdlong temp, ##.LC3
        mv r3, temp
        rdlong temp, ##.LC4
        mv r4, temp
        rdlong temp, ##.LC5
        mv r5, temp
        rdlong temp, ##.LC6
        mv r6, temp
        rdlong temp, ##.LC7
        mv r7, temp
        rdlong temp, ##.LC8
        mv r8, temp
        rdlong temp, ##.LC9
        mv r9, temp
        rdlong temp, ##.LC10
        mv r10, temp
        rdlong temp, ##.LC11
        mv r11, temp
        rdlong temp, ##.LC12
        mv r12, temp
        rdlong temp, ##.LC13
        mv r13, temp
        rdlong temp, ##.LC14
        mv r14, temp
        rdlong temp, ##.LC15
        mv sp, temp
        rdlong temp, ##.LC16
        mv temp, temp
        rdlong temp, ##.LC17
        mv temp1, temp
        rdlong temp, ##.LC18
        mv temp2, temp
' 0 "" 2
' 25 "blinky.c" 1
        reti1
' 0 "" 2
' Naked function: epilogue provided by programmer.
        .balign 4
.LC0    long _isr_bak_r0
        .balign 4
.LC1    long _isr_bak_r1
        .balign 4
.LC2    long _isr_bak_r2
        .balign 4
.LC3    long _isr_bak_r3
        .balign 4
.LC4    long _isr_bak_r4
        .balign 4
.LC5    long _isr_bak_r5
        .balign 4
.LC6    long _isr_bak_r6
        .balign 4
.LC7    long _isr_bak_r7
        .balign 4
.LC8    long _isr_bak_r8
        .balign 4
.LC9    long _isr_bak_r9
        .balign 4
.LC10   long _isr_bak_r10
        .balign 4
.LC11   long _isr_bak_r11
        .balign 4
.LC12   long _isr_bak_r12
        .balign 4
.LC13   long _isr_bak_r13
        .balign 4
.LC14   long _isr_bak_r14
        .balign 4
.LC15   long _isr_bak_sp
        .balign 4
.LC16   long _isr_bak_temp
        .balign 4
.LC17   long _isr_bak_temp1
        .balign 4
.LC18   long _isr_bak_temp2