Mixed mode programming tutorial
Daniel Nagy
Posts: 26
Hello,
My question is, do we have a good tutorial on mixed-mode programming, where one can run an LMM program and use some small and quick COGC modules for time critical parts?
Comments
Advanced Topic - Increase Function Execution Speed with _FCACHE
Here is an example with function code that gets copied into unused space in a cog. This reduces the time it takes to fetch each instruction, because the kernel executing the machine code looks inside its own Cog RAM for the next instruction instead of having to wait for its next access window to Main RAM, which is shared with 7 other cogs.
Run the program as-is, and note that P5 toggles at about 83 kHz. Then, un-comment the __attribute__((fcache)) line, and re-run the program. The toggle rate should increase to about 2.5 MHz. That's about a 30x speed increase, and it also lends itself to precisely timed loops. You can also un-comment n++ in the pinToggle function to share the count of repetitions with the main function. In that case, the frequency is about 48 kHz without fcache and about 828 kHz with it. The loop now communicates with Main RAM on every repetition, which slows it down, so there is only a 17x speed increase. ...but hey, that's still 17x!
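The figures quoted above can be sanity-checked with a little arithmetic. This is only a host-side sketch in plain C (no Propeller headers); it assumes the 80 MHz system clock used in this example, and the function names are made up for illustration.

```c
#include <assert.h>

/* Speed-up factor between two measured toggle frequencies (Hz),
   rounded down to a whole number. */
unsigned speedup(unsigned fast_hz, unsigned slow_hz)
{
    assert(slow_hz > 0);
    return fast_hz / slow_hz;
}

/* System clock ticks spent per loop repetition. One full cycle on the
   pin takes two repetitions: one toggles high, the next toggles low. */
unsigned ticks_per_rep(unsigned clk_hz, unsigned pin_hz)
{
    assert(pin_hz > 0);
    return clk_hz / (2u * pin_hz);
}
```

With the measurements above, speedup(2500000, 83000) works out to 30 and speedup(828000, 48000) to 17, while ticks_per_rep(80000000, 2500000) comes to 16 clock ticks per pass through the fcached loop.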
According to ersmith in the How can I get most out of Fcache thread, functions can have the fcache attribute applied to them for improved performance if the function is small enough to fit in a cog along with the kernel for the memory model. The allowable size of the code varies from one memory model to the next. If the function is too large, you will see a compiler error. fcached functions are also severely restricted on the functions they can call; they can only call NATIVE functions. You can add a native function by adding __attribute__((native)) or _NATIVE.
Both fcached and native functions have to reside in Cog RAM. This means you cannot use many cmm/lmm/xmm library functions. For example, simpletools library functions like high, low and pause are not native. The propeller.h library has macros like OUTA, DIRA, and waitcnt that can be used to get the same functionality. Note that OUTA and DIRA take a bit mask, not a pin number. So, instead of high(5), use OUTA |= (1 << 5) and DIRA |= (1 << 5). Instead of pause(100), use waitcnt(CNT + CLKFREQ/10).
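The two substitutions above come down to a bit mask and a tick count. Here is a host-side sketch of that arithmetic; ordinary variables stand in for the OUTA and DIRA registers, and the helper names pin_mask, ms_to_ticks and set_high are hypothetical, not part of propeller.h.

```c
#include <assert.h>
#include <stdint.h>

/* Bit mask for one I/O pin (P0..P31): bit number `pin` set, others clear. */
uint32_t pin_mask(int pin)
{
    assert(pin >= 0 && pin < 32);
    return (uint32_t)1 << pin;
}

/* Convert milliseconds to system clock ticks.
   With clkfreq = 80 MHz, 100 ms gives the same value as CLKFREQ/10. */
uint32_t ms_to_ticks(uint32_t clkfreq, uint32_t ms)
{
    return (uint32_t)(((uint64_t)clkfreq * ms) / 1000u);
}

/* Host-side stand-in for "drive the pin high": apply the mask to plain
   variables that play the role of the OUTA and DIRA registers. */
void set_high(uint32_t *outa, uint32_t *dira, int pin)
{
    *outa |= pin_mask(pin);
    *dira |= pin_mask(pin);
}
```

For P5 the mask is 32 (bit 5 set), and 100 ms at 80 MHz is 8,000,000 ticks.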
How it Works
This application measures the number of times the while(1) loop in the pinToggle function repeats by measuring the number of low-to-high transitions on P5. By commenting and un-commenting different parts of the example, you can measure different execution rates. Keep in mind that the actual repeat rate is twice the measured frequency because the pinToggle function's loop performs a low-to-high transition on one repetition, and a high-to-low transition on the next.
The simpletools library has some convenience functions used by main.
Each Propeller cog has CTRA and CTRB modules that can be configured to perform certain processes independently. One of the features is positive edge detection. In this mode, a counter module adds 1 to its corresponding PHSA/PHSB register every time a low-to-high transition is detected on a certain I/O pin. This macro definition is a value that can be ORed with an I/O pin number and then stored in the cog's CTRA or CTRB register to provide edge counting for measuring signal frequencies.
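The macro definition itself is not reproduced in this post. As a sketch of what such a value looks like, the block below assumes the positive edge detector is counter mode %01010 and that the mode field starts at bit 26 of CTRA/CTRB; check both against the Propeller datasheet before relying on them. The function name is made up.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed value: counter mode %01010 (positive edge detector) placed in
   the mode field, which is assumed to start at bit 26 of CTRA/CTRB. */
#define POS_EDGE_CTR ((uint32_t)0x0A << 26)

/* Value to store in CTRA (or CTRB) to count rising edges on `pin`.
   The pin number goes in the lowest bits of the register. */
uint32_t edge_counter_config(int pin)
{
    assert(pin >= 0 && pin < 32);
    return POS_EDGE_CTR | (uint32_t)pin;
}
```

Under those assumptions, the configuration value for P5 is 0x28000005.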
This is a function prototype for the pinToggle function. This function is designed to be launched into another cog. The actual function is below main.
These volatile variables are going to be modified/accessed by more than one function running in more than one cog. The volatile declaration ensures that the compiler doesn't optimize out code that re-checks a variable's value before performing each operation. Without it, that could happen when a function in another cog modifies the variable, because the compiler has no way to figure that out. Declaring the variable volatile prevents it from ever happening.
Compact memory model (CMM) code running in another cog needs 176 bytes (44 ints) of stack space for the C kernel, and often additional stack space for function call/return, local variables, etc. I'm not sure if an fcached cog really needs any stack space, so this line is just out of habit at this point.
This main function is running in cmm mode. In this mode, the C kernel runs in a cog and it fetches and executes machine code stored in the Propeller chip's Main RAM.
{
Pin is one of the volatile variables shared by main and pinToggle. Main has to set it before starting pinToggle in another cog because pinToggle uses pin to determine which pin to toggle.
The pinToggle function also has commented code to keep the I/O pin on/off for a certain number of clock ticks. If statements with dt are un-commented in pinToggle, this also needs to be un-commented.
This starts the pinToggle function in another cog. For more info on this, see Multicore Example.
  stack, sizeof(stack));
As mentioned earlier, POS_EDGE_CTR was defined so that it could be ORed with a pin number to configure a counter module. Here, the cog that's executing the main function gets its counter module A configured to count low-to-high transitions on P5.
The rule for counting positive transitions with counter module A is that the value in the FRQA register gets added to PHSA every time a positive transition is detected. So, we'll set FRQA to 1, so that PHSA increases by 1 with each transition.
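To make the counting rule concrete, here is a small host-side model, not Propeller code. It toggles a simulated P5 with XOR the way pinToggle does and adds FRQA to PHSA on each low-to-high transition. The struct and function names are hypothetical.

```c
#include <stdint.h>

/* Host-side model of the positive edge counter: every time the pin goes
   from low to high, FRQA is added to PHSA. */
typedef struct {
    uint32_t frqa;   /* amount added per rising edge */
    uint32_t phsa;   /* accumulated count */
    int last;        /* previous pin state, 0 or 1 */
} edge_counter_t;

void counter_sample(edge_counter_t *c, int pin_state)
{
    if (!c->last && pin_state) {
        c->phsa += c->frqa;
    }
    c->last = pin_state;
}

/* Model `reps` repetitions of a loop that toggles the pin with XOR,
   starting low, and return the resulting PHSA value. */
uint32_t count_edges(uint32_t reps, uint32_t frqa)
{
    edge_counter_t c = { frqa, 0, 0 };
    uint32_t outa = 0;
    const uint32_t pmask = (uint32_t)1 << 5;   /* P5 */
    for (uint32_t i = 0; i < reps; i++) {
        outa ^= pmask;                          /* same idea as OUTA ^= pmask */
        counter_sample(&c, (outa & pmask) != 0);
    }
    return c.phsa;
}
```

With FRQA set to 1, ten loop repetitions produce a count of 5, which is why the loop's repeat rate is twice the measured frequency.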
These two variables are created for setting up a loop that repeats at precise time intervals. The first one establishes the time interval by setting dtm (time increment for main) to CLKFREQ, the number of system clock ticks in a second. The second marks the current number of clock ticks that have elapsed (CNT) by storing a copy of that value in t. Every time the Propeller chip's system clock ticks, the CNT register increases by 1. In this application, the system clock is running at 80 MHz. So, the value of CNT increases by 80,000,000 each second.
  int t = CNT;
The main loop starts by setting PHSA to 0. Then, waitcnt(t += dtm) adds the number of clock ticks in 1 second to the system clock time that was previously stored in t. That sets a target CNT register value for the waitcnt function to wait for. It's more precise than the simpletools library's pause function. The waitcnt function allows the program to continue after 1 second, at which point int cycles = PHSA captures the number of low-to-high transitions that have occurred on P5. Then, two print function calls display that value along with the value of n. The value of n might or might not increase depending on whether or not n++ has been un-commented in pinToggle.
  {
    PHSA = 0;
    waitcnt(t += dtm);
    int cycles = PHSA;
    print("n = %d, ", n);
    print("cycles = %d\n", cycles);
  }
}
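One way to see why advancing the target with t += dtm keeps the interval precise is to compare it with a delay measured from "now" on each pass. The host-side sketch below models both; the function names and the per-pass work figure are made up for illustration, and unsigned 32-bit arithmetic is used so the values wrap around the way a 32-bit tick counter does.

```c
#include <stdint.h>

/* Wake-up time of the k-th pass when the target is advanced from the
   previous target (t += dtm). Time spent in the loop body does not
   accumulate, as long as the body finishes before the next target. */
uint32_t wake_from_target(uint32_t t0, uint32_t dtm, uint32_t k)
{
    uint32_t t = t0;
    for (uint32_t i = 0; i < k; i++) {
        t += dtm;
    }
    return t;
}

/* Wake-up time of the k-th pass when each delay is measured from "now",
   after the loop body has already used up `work` ticks. */
uint32_t wake_from_now(uint32_t t0, uint32_t dtm, uint32_t work, uint32_t k)
{
    uint32_t now = t0;
    for (uint32_t i = 0; i < k; i++) {
        now += work;   /* time spent printing, etc. */
        now += dtm;    /* then a relative delay */
    }
    return now;
}
```

With dtm = 80,000,000 ticks and a hypothetical 50,000 ticks of work per pass, three passes end at t0 + 240,000,000 in the first version and t0 + 240,150,000 in the second, so the relative version drifts by the work time on every pass.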
The pinToggle function uses only built-in propeller.h macros for I/O pin manipulations, which allows it to be labeled with the fcache attribute so that it can be copied into the portion of Cog RAM not used by the C kernel. This greatly increases execution speed because the program does not have to wait for Main RAM access, which is shared with 7 other cogs, to get the next instruction. Local variables also have faster access because they are stored in Cog RAM as well. Global variables are stored in Main RAM, and when fast execution speed is the goal, they should be used sparingly.
IMPORTANT: You won't see the speed increase until you un-comment the __attribute__((fcache)) statement.
void pinToggle(void *par) // pinToggle function
{
  int pmask = 1 << pin;
  DIRA |= pmask;
  //int t = CNT;
  while(1)
  {
    //n++;
    OUTA ^= pmask;
    //waitcnt(t+=dt);
  }
}
Here is a COGC application that does the same thing as the fcache application from the previous post. One important difference is that the COGC kernel is smaller than the kernel for CMM or the other memory models, which means your application can fit more code into the cog.
The shared variables from the previous post were modified and moved into a header file. The fcached function was also modified so that it could be run from a COGC file. So what's left in the main file is just code that launches the COGC cog and tests it.
Did You Know?
- In this activity, you will add files to your project. If you want to see them in a list or reopen them after closing, just use the Project Manager panel. You can open it by clicking the Show/Hide Project Manager button in SimpleIDE's bottom-left corner.
- If you want to copy your project, just use your file browser (Mac Finder, Windows Explorer) to copy the folder that contains all the files. Then, open the .side project in the folder copy you created.
Project Setup
First, follow the checklist instructions for creating a project and adding the three files below to it. Then continue to the Test Instructions.
A header file with shared variables provides a convenient place where code in both files can access them.
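The header itself is not reproduced in this post. As a rough idea of what it could contain, here is a hypothetical sketch: the field names n and dt match the share->n and share->dt expressions used in the COGC code, while the type name, the pin field and the shared_init helper are assumptions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout for the shared variables. The fields are volatile
   because code running in more than one cog reads and writes them. */
typedef struct {
    volatile int pin;   /* I/O pin the COGC code toggles (assumed field) */
    volatile int n;     /* repetitions counted by the COGC code */
    volatile int dt;    /* clock ticks to wait between toggles */
} shared_t;

/* Fill in the structure before handing its address to the new cog. */
void shared_init(shared_t *share, int pin, int dt)
{
    assert(share != NULL);
    share->pin = pin;
    share->n = 0;
    share->dt = dt;
}
```

Grouping the variables in one structure means a single pointer gives the COGC code access to all of them.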
If the COGC code is going to be part of an application running in a different mode, like CMM, LMM or XMM, it needs to live in its own COGC file that's part of the project.
Test Instructions
Now you are ready to run the application.
- Click SimpleIDE's Run with Terminal button.
- Verify that the application toggles the pin 5x per second.
- Click the pinToggle.cogc tab
- Comment out these two statements: (share->n)++; and waitcnt(t+=share->dt);
- Click Run with Terminal again, and you'll be back up to a signal of about 2.5 MHz, which means the loop is again repeating at about 5 MHz.
Try This
Let's try modifying the main file to set up two light blinking processes on P26 and P27.
Thanks for the really quick and detailed example on fcache.
A working code example has been added to the COGC project in post #3. Next step is to write a narrative of what's happening in the code.
Thanks for this second tutorial too.
They are very valuable to me and I think for others who are looking for advanced stuff as well.
Daniel
Attached here is the intro to the C section.
You can buy the book on leanpub (you can download a sample to give you an idea of the things I cover)
Propeller Programming
Leanpub has a 45 day return policy.
Best,
Sridhar