P2 Task Performance Profiling tool
Tubular
Following on from OzPropDev's observations, here's a code snippet that lets you view/profile real cog performance by task. This tool lets us experiment with different loop sizes, task orders in the task register, RDLONG positioning etc, to see whether sweet spots exist (or not).
It captures the
- number of in-loop instructions, using an "ADD" instruction to substitute for real in-loop instructions
- number of loop iterations, using an IJNZ instruction in place of the usual JMP
- number of RDLONG hub accesses by tracking PTRA and PTRB increments (two counting buckets)
(substitute your own number of instructions to see how it performs)
Hopefully this can be used to shed real light on this very useful multitasking feature of the P2.
cheers
tubular
edit: newer version 0.2, see post below
Comments
Comparing the system counter against the tally of instructions executed and jumps performed would also help reveal how much "stall" is occurring.
Here's an updated v0.2 that includes the system timer, and estimates the waste (stalls and pipeline discards). I had kind of forgotten about the pipeline discard issues, though they are covered in Chip's documentation.
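For anyone who wants to play with the numbers off-chip, the waste estimate boils down to simple arithmetic: elapsed system counter clocks minus the clocks accounted for by the tallies. Below is a rough Python sketch of that idea, not the actual v0.2 code. The names (TaskTally, estimate_waste) are made up for illustration, and the one-clock-per-instruction default and the choice to count each RDLONG as one accounted instruction are assumptions you should adjust to match what you measure.

```python
from dataclasses import dataclass


@dataclass
class TaskTally:
    """Totals captured for one task (hypothetical layout, not the tool's own)."""
    adds: int        # ADD instructions executed, standing in for real in-loop work
    jumps: int       # loop iterations, counted via the IJNZ at the end of the loop
    hub_reads: int   # RDLONGs, inferred from PTRA/PTRB increments


def estimate_waste(tallies, elapsed_clocks, clocks_per_instruction=1):
    """Return (accounted, waste, waste_fraction) for a profiling run.

    ASSUMPTION: every tallied instruction costs clocks_per_instruction clocks,
    so whatever is left over is treated as stalls plus pipeline discards.
    """
    if elapsed_clocks <= 0:
        raise ValueError("elapsed_clocks must be positive")
    executed = sum(t.adds + t.jumps + t.hub_reads for t in tallies)
    accounted = executed * clocks_per_instruction
    waste = elapsed_clocks - accounted
    return accounted, waste, waste / elapsed_clocks


if __name__ == "__main__":
    # Made-up numbers for two tasks, purely to show the arithmetic.
    run = [TaskTally(adds=400, jumps=50, hub_reads=25),
           TaskTally(adds=200, jumps=50, hub_reads=25)]
    accounted, waste, fraction = estimate_waste(run, elapsed_clocks=1000)
    print(f"accounted={accounted} waste={waste} ({fraction:.0%})")
```

Feed it the per-task totals and the system counter delta from a run, and compare the waste fraction as you change loop size or RDLONG position.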