How can I get the most out of FCACHE?
Christof Eb.
Hi,
Is there any documentation about fcache yet?
I would like to get the most out of this mechanism, which is very powerful. Can I convince it to cache more than the innermost loop?
Thanks
Christof
Comments
It will automatically cache outer loops too, as long as the normal conditions for fitting in fcache are met: there can be no function calls (except calls to NATIVE functions) inside the loop, there can be no branches to points outside the loop, and the whole loop has to fit in the cache.
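As a rough illustration of those conditions, here is a minimal sketch of a nested loop that should be a candidate for caching as a whole. The function name sum_table and its parameters are hypothetical, not taken from this thread. Whether the outer loop really ends up in FCACHE still depends on the size of the compiled code and on the optimization level.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: adds up every byte of a rows x cols table.
   Neither loop calls a function or jumps to a point outside the loop,
   so both loops meet the conditions listed above. A running pointer is
   used instead of computing r * cols + c, so the loop body does not
   need a multiply. */
uint32_t sum_table(const uint8_t *table, size_t rows, size_t cols)
{
    uint32_t total = 0;
    const uint8_t *p = table;

    for (size_t r = 0; r < rows; r++) {        /* outer loop */
        for (size_t c = 0; c < cols; c++) {    /* innermost loop */
            total += *p++;
        }
    }
    return total;
}
```

Adding a call to an ordinary (non-NATIVE) function inside either loop would break the first condition for that loop.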
In the CMM preview you can also suggest to the compiler that it put a whole function into FCACHE, by putting __attribute__((fcache)) before the function declaration. Provided the function fits, and has no calls to other functions (actually NATIVE functions are allowed, but that's a special case), then the whole function will be placed into FCACHE. This is useful if you need to guarantee the timing of initialization code relative to some loops, or if the function has multiple independent loops and you want to keep them in the FCACHE together.
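A minimal sketch of the "multiple independent loops" case, assuming the attribute is spelled exactly as above. The function remove_offset and the FCACHE_FUNC macro are hypothetical names made up for this example. The check for __propeller__ / __PROPELLER__ is also an assumption about what the Propeller compiler predefines; on any other compiler the macro expands to nothing so the code still builds.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: the Propeller compiler predefines __propeller__ or
   __PROPELLER__. Elsewhere the attribute is simply left out. */
#if defined(__propeller__) || defined(__PROPELLER__)
#define FCACHE_FUNC __attribute__((fcache))
#else
#define FCACHE_FUNC
#endif

/* Hypothetical example: two independent loops in one function with no
   calls to other functions. The attribute asks the compiler to place the
   whole function in FCACHE, provided it fits. Returns the smallest value
   found (0xFF for an empty buffer) after subtracting it from every
   element. */
FCACHE_FUNC
uint8_t remove_offset(uint8_t *buf, size_t len)
{
    uint8_t lowest = 0xFF;

    /* First loop: find the smallest element. */
    for (size_t i = 0; i < len; i++) {
        if (buf[i] < lowest) {
            lowest = buf[i];
        }
    }

    /* Second loop: subtract it from every element. */
    for (size_t i = 0; i < len; i++) {
        buf[i] = (uint8_t)(buf[i] - lowest);
    }
    return lowest;
}
```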
Eric
This sounds exactly like what I was looking for: a way to always execute a short function from the cache. This makes it possible to have fast routines with tight timing without using a PASM cog. So I tried this test code: .. and I can output the hello world text at 115200 baud in all memory models (LMM, CMM, XMMC, SDXMMC) with no problems. The question is: is it safe to expect that the tx function is always executed from the cache, or can it be executed more slowly the first time, or when the cache is full of other cached routines?
Andy
A function declared __attribute__((fcache)) is always completely loaded into FCACHE before it is executed, so the relative timings of the instructions within the function will always stay the same. The time it takes to start the function will vary depending on whether or not it is already in the cache: if some other function was using the cache, then the first instruction of tx will not start until the cache has been reloaded.
I'll also note that there is an undocumented feature: a function declared with __attribute__((native)) is always placed in the kernel memory, and hence always runs at full speed. This feature is undocumented because the size of kernel memory is subject to change; in particular there is very little space left in the XMM and CMM kernels. It's probably OK to use native functions in pure LMM mode, but I wouldn't do it in other modes.
Eric
Thanks a lot, Christof
Try: the combination of the "-g" flag to gcc and the "-Wa,-ahl=foo.s" flags to the assembler causes the assembler to produce a listing with interleaved C and assembly.
Eric