Seairth, that was easy, but how would it look with 3 tasks, using SETTASK #%%10233210?
The 3 delay slots after a delayed jump apply only in single-task mode. With multitasking it depends a lot on the task switcher settings. I don't think a compiler can incorporate all this.
Andy
Ariba
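To put rough numbers on the point above, here is a minimal Python sketch that counts how many of the 3 pipeline slots after an instruction belong to the same task. Everything about the model is an assumption for illustration: the quaternary digits of the SETTASK pattern are read right to left, the 8 digits simply repeat, and a "delay slot" is taken to be any same-task instruction issued in the next 3 slots. The function name `same_task_delay_slots` is made up; it is not part of any tool.

```python
def same_task_delay_slots(pattern, depth=3):
    """For each time slot, count same-task instructions in the next `depth` slots.

    pattern: string of quaternary digits, e.g. "10233210" (from SETTASK #%%10233210).
    Assumption: slots are consumed right to left and the pattern repeats.
    Returns a list of (task, same_task_slots) pairs, one per slot.
    """
    slots = [int(d) for d in reversed(pattern)]
    n = len(slots)
    result = []
    for i, task in enumerate(slots):
        # The next `depth` slots in issue order, wrapping around the pattern.
        following = [slots[(i + k) % n] for k in range(1, depth + 1)]
        result.append((task, following.count(task)))
    return result


if __name__ == "__main__":
    for task, count in same_task_delay_slots("10233210"):
        print(f"task {task}: {count} same-task slot(s) behind it")
```

Under these assumptions a single-task pattern such as "0000" gives 3 for every slot, while the pattern from the question gives 0 or 1 depending on the slot, which is the kind of variation a compiler would have to be told about.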
If it's impossible for the compiler to figure out on its own, there's no harm in telling it what you're doing: "Hey, compiler, I've got four tasks running here!" That's what pragmas (pragmata?) are for.
Comments
and a simulator that source-steps and does (exactly) what the core does is better again.
        SETSPA  #0
loop1:  DJNZD   ctr1, #:loop1
        ADD     data, addval
loop2:  DJNZD   ctr1, #:loop2
        PUSHA   data
        something
        something
Going to do? Blow my mind, that's what!
Actually, this one is very interesting. The "something" lines are never reached until ctr1 hits zero. Here's the pipeline for the above code:
PC  S1           S2           S3           S4
0   SETSPA
1   DJNZD loop1  SETSPA
2   ADD          DJNZD loop1  SETSPA
3   DJNZD loop2  ADD          DJNZD loop1  SETSPA
4   PUSHA        DJNZD loop2  ADD          DJNZD loop1
1   DJNZD loop1  PUSHA        DJNZD loop2  ADD
2   ADD          DJNZD loop1  PUSHA        DJNZD loop2
3   DJNZD loop2  ADD          DJNZD loop1  PUSHA
4   PUSHA        DJNZD loop2  ADD          DJNZD loop1
etc...
If ctr1 was odd, then it would end as follows:
PC  S1           S2           S3           S4
5   something    PUSHA        DJNZD loop2  ADD
6   something    something    PUSHA        DJNZD loop2
3   DJNZD loop2  something    something    PUSHA
If ctr1 was even, you'd end up with:
PC  S1           S2           S3           S4
4   PUSHA        ADD          DJNZD loop1  PUSHA
5   something    PUSHA        ADD          DJNZD loop1
1   DJNZD loop1  something    PUSHA        ADD
2   ADD          DJNZD loop1  something    PUSHA
3   DJNZD loop2  ADD          DJNZD loop1  something
4   PUSHA        DJNZD loop2  ADD          DJNZD loop1
etc...
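For anyone who wants to step through this, here is a small Python sketch of the pipeline walk. It is a simplified model, not the real core: it assumes an instruction takes effect when it reaches S4, that a taken DJNZD redirects the very next fetch, and that ctr1 is a 32-bit register that wraps when decremented past zero. The names `PROGRAM` and `fetch_trace` are made up for the example. It reproduces the steady-state rows and the odd-ctr1 ending shown above; it has not been checked against the even-ctr1 table.

```python
# Each entry is (mnemonic, branch target address or None).
# Addresses: 0 SETSPA, 1 loop1, 2 ADD, 3 loop2, 4 PUSHA, 5 and 6 "something".
PROGRAM = [
    ("SETSPA", None),
    ("DJNZD", 1),      # loop1: DJNZD ctr1, #:loop1
    ("ADD", None),
    ("DJNZD", 3),      # loop2: DJNZD ctr1, #:loop2
    ("PUSHA", None),
    ("something", None),
    ("something", None),
]


def fetch_trace(program, ctr1, cycles):
    """Return one (S1, S2, S3, S4) tuple of program addresses per cycle."""
    pc = 0
    stages = [None, None, None, None]
    trace = []
    for _ in range(cycles):
        # Fetch into S1 and shift everything else one stage along.
        stages = [pc] + stages[:3]
        trace.append(tuple(stages))
        next_pc = pc + 1
        done = stages[3]
        if done is not None and done < len(program):
            op, target = program[done]
            if op == "DJNZD":
                # Assumed 32-bit wrap-around on decrement.
                ctr1 = (ctr1 - 1) & 0xFFFFFFFF
                if ctr1 != 0:
                    next_pc = target
        pc = next_pc
    return trace


if __name__ == "__main__":
    for s1, s2, s3, s4 in fetch_trace(PROGRAM, ctr1=3, cycles=12):
        names = [PROGRAM[a][0] if a is not None else "" for a in (s1, s2, s3, s4)]
        print(s1, *names)
```

With ctr1 = 3 the fetched addresses come out as 0, 1, 2, 3, 4, 1, 2, 3, 4, 5, 6, 3, so the "something" lines are only reached once the loop1 DJNZD falls through, and the loop2 DJNZD still in the pipeline then pulls the PC back to 3.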
Compilers can be very tricky to create, but they can also be very smart in the way they get their jobs done.
Some years ago, my son was part of a university team that created a distributed-processing version, capable of spreading the work among a hundred or more processors and then gathering all the resulting pieces of code, ready for final assembly into one functional unit.
It was targeted at compiling astonishingly huge operating systems, but the team also had an eye on distributing the workload among thousands of almost idle Android cell phones, doing nothing but timekeeping in their owners' purses and on their tables.
So, as long as the rules of the target CPU can be properly crafted (and a four-level pipeline is not such a hairy monster at all), I believe we can expect plenty of good news from this side of the street.
Yanomani
Seems to me the compiler cannot know at compile time how many tasks are going to be running, so yes, I agree it looks impossible to optimise for that pipeline in general.
Does that mean we might see some "dangerous" compiler optimization flags that may break things if tasking is used?
Clearly we need a way to use tasking, and many other features, from C. Looks like a lot of builtins or intrinsics (whatever they are called) are going to be needed.
-Phil