@Wuerfel_21 said:
Why would there be a limit in the first place? There really shouldn't be with how the bytecodes stack up.
It's not a bytecode thing, it's an economy issue with RFVAR, which brings in the offset address plus some extra LSBs to handle index count (0..3) and byte/word/long size. There are two bits for index count and two for size.
I don't really know how you are allocating all these bits or how things work behind the scenes, but if you have two bits for size and only three unit sizes (byte/word/long), doesn't that leave one of the four combinations unused? Could that spare combination indicate an extended case, where you read in one more RFVAR to extend things further and access more information, i.e. make it variable-length capable/chainable? I realize this adds complications, but in most typical cases I suspect it wouldn't be needed, so hopefully it wouldn't slow the executed code down much at all.
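To put numbers on the spare-combination point, here is a minimal Python sketch of a plain two-bitfield decode. The bit positions (size in the low two bits, index count in the high two) are an assumption for illustration only; the thread just says there are two bits for each. `decode_bitfields` is a hypothetical helper name.

```python
SIZES = ("BYTE", "WORD", "LONG")

def decode_bitfields(nibble):
    """Decode a 4-bit value as two independent 2-bit fields.

    ASSUMPTION: size lives in bits 1..0 and index count in bits 3..2.
    Returns None for the size code that has no meaning.
    """
    size_code = nibble & 0b11
    index_count = (nibble >> 2) & 0b11
    if size_code >= len(SIZES):
        return None  # spare size code, unused by BYTE/WORD/LONG
    return index_count, SIZES[size_code]

used = [n for n in range(16) if decode_bitfields(n) is not None]
spare = [n for n in range(16) if decode_bitfields(n) is None]
print(len(used), "nibble values carry meaning;", len(spare), "are spare")
```

Under that assumed layout, a quarter of the nibble values go unused, which is the waste the question is pointing at.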
Good point about there being only three sizes (BYTE/WORD/LONG) and how we waste 25% of the two size bits' range. I think if we could get to four indexes, that would be very adequate. The problem is that we need to divide the nibble by 5 or 3, so that the quotient and the remainder will each supply a size or an index count. Dividing by five is faster, since it will get through 4 bits in only three steps.
That got me thinking about how it could be done and I came up with a 3-instruction solution:
PR0 is the input nibble
PR1 is the output index count (0..4)
PR2 is the output size (0/1/2 = BYTE/WORD/LONG)
This gets another index-count possibility out of the 4 bits. The code doing the work is the REP and the following two instructions.
So, is it worth burdening every structure access with ten more clocks of execution time, in order to accommodate the distal need for four nested live indexes, instead of three? Not sure about this part. I think three live indexes would be pretty unusual, while two would be pretty common.
EDIT: I just realized that "REP #2,#2" yields the same results, which makes this a keeper, taking 8 fewer clocks, or only two more than simple bitfields.
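The compare-and-subtract division can be modeled outside the interpreter. The Python sketch below is only a model, not the interpreter code: `cmpsub` and `decode_nibble` are hypothetical helper names, and it assumes the nibble is encoded as size * 5 + index count (the "0/1/2 * 5 + index count" wording used in the code comments in this thread).

```python
def cmpsub(d, s):
    """Model of a compare-and-subtract step: subtract s only if d >= s.

    Returns (new_d, carry), where carry is 1 if the subtraction happened.
    """
    if d >= s:
        return d - s, 1
    return d, 0

def decode_nibble(nibble, passes=2):
    """Split nibble = size * 5 + index_count by repeated compare-subtract."""
    count, size = nibble, 0
    for _ in range(passes):
        count, carry = cmpsub(count, 5)
        size += carry  # models adding the carry flag into the size selector
    return count, size

# Two passes and three passes agree for every valid encoding (0..14).
for n in range(15):
    assert decode_nibble(n, passes=2) == decode_nibble(n, passes=3) == (n % 5, n // 5)
```

Two passes are enough because a value of at most 14 can have 5 subtracted from it at most twice. In this model the value 15 is the one input where two and three passes differ, and it is not a valid size/index-count combination anyway.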
Here is the code that does structure access:
'
'
' Setup hub structured variable (19 longs)
'
hub_sv          rfvar   y                       'a b c d  get starting offset << 4 + %iiii
                ror     y,#4                    'a b c d  lsb-justify address, msb-justify %iiii
                pusha   x                       'a b c |  a: setup [pbase + rfvar {+pop*rfvar}]
                mov     x,pbase                 'a | | |  b: setup [vbase + rfvar {+pop*rfvar}]
                mov     x,vbase                 '| b | |  c: setup [dbase + rfvar {+pop*rfvar}]
                mov     x,dbase                 '| | c |  d: setup [pop + rfvar {+pop*rfvar}]
                add     x,y                     'a b c d  add starting offset
                shr     y,#32-4                 'lsb-justify %iiii (0/1/2 * 5 + index count)
                mov     z,#bc_setup_byte_pa & $1FF  'get setup byte[pop address]
                rep     #2,#2                   'convert %iiii into index count and byte/word/long
                cmpsub  y,#5            wcz     'y will equal index count, 0..4
                addx    z,#0                    'z will equal setup byte/word/long[pop address]
.loop   if_nz   popa    v                       'pop stack to get index
        if_nz   rfvar   w                       'get structure size
        if_nz   mul     v,w                     'multiply index by structure size
        if_nz   add     x,v                     'add into address
        if_nz   djnz    y,#.loop                'another index and size?
                rdlut   z,z                     'chain to setup byte/word/long[pop address]
                execf   z
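As a rough model of what the hub_sv listing computes, the Python sketch below packs an offset and the %iiii nibble into one value and then rebuilds the final address as base + offset + the sum of index * structure size. All function names are hypothetical. It assumes each popped index pairs with the structure size read in the same loop pass, and the final 20-bit address mask is an assumption added here so the nibble left in the top bits does not show up in the result.

```python
MASK32 = 0xFFFFFFFF

def pack(offset, index_count, size):
    """Build the value read by RFVAR: offset << 4 + %iiii (size * 5 + index count)."""
    assert 0 <= index_count <= 4 and 0 <= size <= 2
    return (offset << 4) + size * 5 + index_count

def ror32(value, bits):
    """Rotate a 32-bit value right by the given number of bits."""
    return ((value >> bits) | (value << (32 - bits))) & MASK32

def struct_address(base, rfvar_value, indexes, struct_sizes):
    """Return (address, size) for a structured-variable access."""
    y = ror32(rfvar_value, 4)           # offset in the low bits, %iiii in the top four
    x = (base + y) & MASK32             # add starting offset (nibble rides in the top bits)
    nibble = y >> (32 - 4)              # lsb-justify %iiii
    size, count = divmod(nibble, 5)     # same result as the two compare-subtract passes
    if len(indexes) < count or len(struct_sizes) < count:
        raise ValueError("not enough indexes or structure sizes supplied")
    for index, struct_size in zip(indexes[:count], struct_sizes[:count]):
        x = (x + index * struct_size) & MASK32
    return x & 0xFFFFF, size            # ASSUMPTION: only the low 20 address bits matter

value = pack(0x100, index_count=2, size=1)
address, size = struct_address(0x1000, value, indexes=[3, 1], struct_sizes=[8, 32])
print(hex(address), size)
```

With a base of $1000, an offset of $100, and indexes 3 and 1 into structures of 8 and 32 bytes, the model adds 24 and 32 to $1100.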
@cgracey said:
Here is the code that does structure access:
Neato. There is even one condition left over: when the nibble equals 15. That special value could be used down the track someday if you ever need expansion beyond the 4 nested indexes. The code could just branch off to a special-case handler that reads in more data in another extended format and interprets it differently.
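A small Python sketch of that idea follows. It is purely hypothetical: nothing in the thread defines an extended format, so `classify` and the "extended" tag are placeholders that only show where such a check would sit before the normal decode.

```python
ESCAPE = 15  # the one nibble value not produced by size * 5 + index_count

def classify(nibble):
    """Return ("normal", index_count, size) or ("extended", None, None)."""
    if nibble == ESCAPE:
        # Hypothetical: a special-case handler would read more data here.
        return ("extended", None, None)
    size, index_count = divmod(nibble, 5)
    return ("normal", index_count, size)

normal_codes = {classify(n) for n in range(15)}
print(len(normal_codes), "distinct normal encodings, plus one spare value")
```

The 15 normal values cover every pairing of 5 index counts (0..4) with 3 sizes, which is why exactly one value remains.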
Yeah, that one extra condition may be useful.
Chip,
My vote is for how @Wuerfel_21 has suggested a struct should be defined; it looks better.
HydraHacker