P2 cogexec vs hubexec vs overlay timings
Cluso99
Curious to see just how cogexec vs hubexec vs overlay timings would stack up, I ran some tests...
Overlay loads the hub block into cog and executes from cog.
Here is the routine that I tested, with a small loop repeated 'loops' times where 1=no loop.
                getct   ctr1
                call    #olay                                   ' *** cogexec ***
                getct   ctr2
.............
                rdlong  y, olayptr                              ' set known hub rotation
                getct   ctr1
                call    #hub_test                               ' *** hubexec test ***
                getct   ctr2
.............
                rdlong  y, olayptr2                             ' set known hub rotation
                getct   ctr1
                setq    #hub_test2_end - hub_test2 - 1          ' *** overlay length to load
                rdlong  olay, olayptr
                call    #olay
                getct   ctr2a
.............
hub_test
                mov     z, loopctr      ' 1
.loop           add     x, #1           ' 2
                add     x, #1           ' 3
                add     x, #1           ' 4
                add     x, #1           ' 5
                add     x, #1           ' 6
                add     x, #1           ' 7
                add     x, #1           ' 8
                add     x, #1           ' 9
                add     x, #1           '10
                add     x, #1           ' 1
                add     x, #1           ' 2
                add     x, #1           ' 3
                add     x, #1           ' 4
                add     x, #1           ' 5
                add     x, #1           ' 6
                add     x, #1           ' 7
                add     x, #1           ' 8
                add     x, #1           ' 9
                add     x, #1           '10
                add     x, #1           ' 1
                add     x, #1           ' 2
                add     x, #1           ' 3
                add     x, #1           ' 4
                add     x, #1           ' 5
.loop2          add     x, #1           ' 6
                add     x, #1           ' 7
                add     x, #1           ' 8
                add     x, #1           ' 9
                djnz    z, #.loop2      '10
                ret
hub_test_end
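As a cross-check on the cogexec case, here is a rough cycle-count sketch of the routine above in Python. It assumes the usual P2 cog timings (2 clocks for a plain instruction, 4 clocks for call, ret and a taken djnz, 2 clocks for a djnz that falls through) and that the measured interval also includes the closing getct. Those costs and the name cogexec_clocks are assumptions for illustration, not something taken from the test code.

```python
# Assumed cogexec instruction costs in clocks (illustrative, not from the post).
PLAIN = 2             # mov, add, getct
BRANCH_TAKEN = 4      # call, ret, and djnz when it jumps back
DJNZ_FALLTHROUGH = 2  # djnz on the final pass, when z reaches zero


def cogexec_clocks(loops: int) -> int:
    """Estimate the clocks between the two getct instructions for hub_test run from cog."""
    if loops < 1:
        raise ValueError("loops must be at least 1")
    setup = PLAIN                      # mov z, loopctr
    straight_line = 24 * PLAIN         # the 24 adds ahead of .loop2
    loop_body = 4 * PLAIN              # the 4 adds inside .loop2
    repeated = (loops - 1) * (loop_body + BRANCH_TAKEN)  # passes that jump back
    final_pass = loop_body + DJNZ_FALLTHROUGH            # last pass falls through
    call_and_ret = 2 * BRANCH_TAKEN
    closing_getct = PLAIN
    return setup + straight_line + repeated + final_pass + call_and_ret + closing_getct


if __name__ == "__main__":
    for n in (1, 3, 5, 10):
        print(n, cogexec_clocks(n))
```

Under those assumptions each extra loop costs 12 clocks (4 adds plus a taken djnz), and the model gives 70 clocks for 1 loop and 178 for 10 loops.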
and here are the results (clocks in hex)...
loops:  cogexec   hubexec   overlay
0A:     000000B2  00000128  000000DE  (178 296 222)
09:     000000A6  00000110  000000D2
08:     0000009A  000000F8  000000C6
07:     0000008E  000000E0  000000BA
06:     00000082  000000C8  000000AE
05:     00000076  000000B0  000000A2  (118 176 162)
04:     0000006A  00000098  00000096
03:     0000005E  00000080  0000008A  ( 94 128 138)
02:     00000052  00000068  0000007E
01:     00000046  00000051  00000072  ( 70  81 114)
So roughly there is an 11 clock overhead for hubexec over cogexec for each load/loop.
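The per-loop figure can be checked straight from the table. The Python sketch below copies the posted clock counts and divides the hubexec-minus-cogexec difference by the loop count. It also reports how much the overlay run costs over plain cogexec. The names RESULTS, hubexec_penalty_per_loop and overlay_load_cost are made up for this illustration.

```python
# Clock counts copied from the results table: loops -> (cogexec, hubexec, overlay)
RESULTS = {
    1: (0x46, 0x51, 0x72),
    2: (0x52, 0x68, 0x7E),
    3: (0x5E, 0x80, 0x8A),
    4: (0x6A, 0x98, 0x96),
    5: (0x76, 0xB0, 0xA2),
    6: (0x82, 0xC8, 0xAE),
    7: (0x8E, 0xE0, 0xBA),
    8: (0x9A, 0xF8, 0xC6),
    9: (0xA6, 0x110, 0xD2),
    10: (0xB2, 0x128, 0xDE),
}


def hubexec_penalty_per_loop(loops: int) -> float:
    """Extra clocks hubexec needs over cogexec, averaged per loop."""
    cog, hub, _ = RESULTS[loops]
    return (hub - cog) / loops


def overlay_load_cost(loops: int) -> int:
    """Extra clocks the overlay run needs over plain cogexec."""
    cog, _, overlay = RESULTS[loops]
    return overlay - cog


if __name__ == "__main__":
    for loops in sorted(RESULTS):
        print(loops, round(hubexec_penalty_per_loop(loops), 2), overlay_load_cost(loops))
```

On the posted numbers the hubexec penalty comes out between 11 and 12 clocks per loop, while the overlay column sits a constant 44 clocks above cogexec at every loop count.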
And overlays don't seem worth the trouble unless there are a lot of loops: in this test the overlay only gets ahead of hubexec at 4 loops or more (0x96 vs 0x98 clocks).
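A rough way to locate the crossover for this routine is to fit straight lines to the hubexec and overlay columns and look for the first loop count where the overlay is cheaper. The constants below are read off the posted results (they reproduce every row of the table), and the function names are hypothetical. A different routine or overlay length would need its own measurements.

```python
def hubexec_clocks(loops: int) -> int:
    """Line fitted to the posted hubexec column: 81 for one pass, then 24 clocks per extra loop from 104."""
    return 81 if loops == 1 else 104 + 24 * (loops - 2)


def overlay_clocks(loops: int) -> int:
    """Line fitted to the posted overlay column: 114 for one pass, 12 clocks per extra loop."""
    return 114 + 12 * (loops - 1)


def break_even_loops(limit: int = 100):
    """Smallest loop count at which the overlay beats hubexec, or None if not found within limit."""
    for loops in range(1, limit + 1):
        if overlay_clocks(loops) < hubexec_clocks(loops):
            return loops
    return None


if __name__ == "__main__":
    print(break_even_loops())
```

The fixed cost of loading the overlay is paid once, while hubexec pays for every taken branch, so the gap keeps widening in the overlay's favour past the crossover.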
