Basis for a Bytecode Interpreter?
lonesock
Posts: 917
Hi, All.
Well, this has probably been done before, but here goes [noparse][[/noparse]8^)
I noticed that while PASM instruction codes are 9 bit, they are always in the form xxxxxx00x. With only 7 input bits, you can recover the 9 bit instruction code with the following PASM statements:
Obviously you still have to do a movi and then some other instruction before the actual instruction is executed. However, this opens up the possibility of a bytecode interpreter where the low bit of each byte signals if it's a "semi-native" PASM instruction in the form:
or another instruction dealing with the overhead of getting/setting the values of the 2 registers (from Hub RAM, Cog RAM, or constants, or maybe from a stack?), accessed via a jump table...something like:
Does this seem worth developing?
thanks,
Jonathan
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
lonesock
Piranha are people too.
Well, this has probably been done before, but here goes [noparse][[/noparse]8^)
I noticed that while PASM instruction codes are 9 bit, they are always in the form xxxxxx00x. With only 7 input bits, you can recover the 9 bit instruction code with the following PASM statements:
ror inst, #30 wc ' like a shl by 2, but puts the low bit in C andn inst, #%110 ' zero out these 2 bits muxc inst, #1 ' recover the low bit
Obviously you still have to do a movi and then some other instruction before the actual instruction is executed. However, this opens up the possibility of a bytecode interpreter where the low bit of each byte signals if it's a "semi-native" PASM instruction in the form:
semi_native_op registerD, registerS
or another instruction dealing with the overhead of getting/setting the values of the 2 registers (from Hub RAM, Cog RAM, or constants, or maybe from a stack?), accessed via a jump table...something like:
' Fetch this byte, then do something! rdbyte inst, code_idx ' read the next byte add code_idx, #1 ' increment the code index shr inst, #1 wc ' check if it's a direct PASM or an Interpreter inst if_c jmp #PASM_instruction ' go to PASM if necessary add inst, #Interp_jump_table ' get the address out of a jump table jmp inst ' jump the the correct PMOD operator
Does this seem worth developing?
thanks,
Jonathan
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
lonesock
Piranha are people too.
Comments
-Phil
I think it is too slow. I have played with many different schemes, and your approach is 12 cycles slower than the fastest byte code interpreter I could code:
rdbyte vect,pc
add pc,#1
jmp vect
This ofcourse assumes a jump table starting at 0
In the end, I found that mixed byte code / LMM took too many cog resources (memory) to be efficient at either. Your mileage may vary.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
www.mikronauts.com E-mail: mikronauts _at_ gmail _dot_ com 5.0" VGA LCD in stock!
Morpheus dual Prop SBC w/ 512KB kit $119.95, Mem+2MB memory/IO kit $89.95, both kits $189.95 SerPlug $9.95
Propteus and Proteus for Propeller prototyping 6.250MHz custom Crystals run Propellers at 100MHz
Las - Large model assembler Largos - upcoming nano operating system
If this seemed at all feasible, then the next, and harder, step would be re-targeting a compiler to produce bytecode for this.
thanks for all the feedback,
Jonathan
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
lonesock
Piranha are people too.
Those 2 bits are used to determine if the instruction will set the z and c flags when it executes. So they are used!
e.g. cmp x,y wc, wz 'sets c & z flags (this instruction requires those 2 bits you are referring to on)
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Links to other interesting threads:
· Home of the MultiBladeProps: TriBlade,·RamBlade,·SixBlade, website
· Single Board Computer:·3 Propeller ICs·and a·TriBladeProp board (ZiCog Z80 Emulator)
· Prop Tools under Development or Completed (Index)
· Emulators: CPUs Z80 etc; Micros Altair etc;· Terminals·VT100 etc; (Index) ZiCog (Z80) , MoCog (6809)·
· Prop OS: SphinxOS·, PropDos , PropCmd··· Search the Propeller forums·(uses advanced Google search)
My cruising website is: ·www.bluemagic.biz·· MultiBlade Props: www.cluso.bluemagic.biz
* I do not want to make any kind of LMM kernel
* I do not want to be able to execute "cmp x,y wc, wz", or any other perfectly valid PASM instruction
* I want to make a code-dense virtual CPU (like the ZPU) that executes bytecode
* I want the bytecode to be generated from a standard compiler, e.g. LCC
* I would like there to be a simple mechanism for executing instructions that the prop can handle natively (like ADD, SUB, etc)
-- without writing a specific function and attendent jump-table entry
-- the mechanism would handle a subset of byte-codes as "do the PASM equivalent passthrough"
-- the mechanism would handle all other valid bytecodes using a jump table
* I would like to use the remainder of the Cog RAM as a scratch pad for keeping either a fast stack, fast registers, or both
Jonathan
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
lonesock
Piranha are people too.
If I understand correctly, you want to use the lower order bit to determine if the byte is a byte code (one of 127 such), or instead should be re-interpreted as the first byte of a PASM instruction. If you decide the byte is a PASM instruction, you will reconstruct the original LONG instruction from that byte plus the 3 succeeding bytes and execute it directly, otherwise you will use it as a bytecode and execute it via a jumptable.
Is that right?
Here are the problems I see:
- As others have said, the bits you are assuming are empty are not - they will be set in any PASM instructions that want to write the carry or zero bit. But this is ok if you never intend to execute such instructions.
- bytecodes will be long aligned only in 1 in 4 cases - and if they are not, but the byte code represents a PASM instruction then you need to either pad them till they are (so that you can re-read the whole thing as a long) or be prepared to reconstruct the PASM instruction by reading the next three consecutive bytes using byte reads. Either way is quite inefficient.
You may be better of having a bytecode that says something like "interpret the following long as a PASM instruction", and use the compiler to ensure that this special byte code always occurs just prior to 4 long-aligned bytes (so they can be fetched as a single LONG). Then you could execute any instruction, and wouldn't need to do any messing about shifting bits back into the correct position.
But I'm still not sure it's worth it - the overhead of executing each such PASM instruction is some bit test logic plus an additional hub fetch - and you still have the problem of wasted space since you have to worry about alignment. But it might be worth it if the gain you get in both speed and space for the byte codes outweighs what you lose because of the embedded PASM.
Ross.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Catalina - a FREE C compiler for the Propeller - see Catalina
I was never planning on reconstructing a full PASM instruction. I was planning on keeping a stack, with the head of stack always in a local register, call it registerS. I was planning on having the working register in another local register, call it registerD. There would be a single native PASM instruction, for example "mov registerD, registerS". Now, the mov would be swapped out for certain opcodes the prop can handle natively (ADD, SUB, etc.), but all the rest remains the same.
Well, the consensus seems to be "don't waste my time", so I'll go with that. [noparse][[/noparse]8^)
thanks,
Jonathan
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
lonesock
Piranha are people too.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Links to other interesting threads:
· Home of the MultiBladeProps: TriBlade,·RamBlade,·SixBlade, website
· Single Board Computer:·3 Propeller ICs·and a·TriBladeProp board (ZiCog Z80 Emulator)
· Prop Tools under Development or Completed (Index)
· Emulators: CPUs Z80 etc; Micros Altair etc;· Terminals·VT100 etc; (Index) ZiCog (Z80) , MoCog (6809)·
· Prop OS: SphinxOS·, PropDos , PropCmd··· Search the Propeller forums·(uses advanced Google search)
My cruising website is: ·www.bluemagic.biz·· MultiBlade Props: www.cluso.bluemagic.biz
This would be ideal for Prop I, I think. You will get about 1.5x-2x more memory than LMM C (it won't have the conditional bits which the ICC uses for LMM C) and still faster than Spin.
// richard
The trick in the Propeller would be to develop a compression method that would only take a few instructions to expand the compressed codes.· You also may need to enforce some rules on the assembly code so that it can be compressed.· One such rule could be that variable addresses must be between $100 and $1ff.· This would reduce the size of the source and destination fields from 9 bits to·8 bits.· That way you could use a single byte for each value.
The value of immediate constants could be restricted even further.· Typically, immediate constants are small values, such as 0, 1, 2, 4, etc.· You might be able to limit constants to 4 bits.· Larger constants would need to be stored in cog memory or hub memory.
Most instructions write a result, but do not write the carry or zero flag.· Most instructions are performed unconditionally, so this would be the normal case also.· So for the normal case you could compress the opcode and flags down to a single byte.· This would give you 1 byte for the opcode, and 1 byte for each of the operands.· That's an amazing compression of 4 bytes down to 3 bytes!·· That's not very impressive.· However, you might be able to further restrict the instruction set and variable range to get this down to 2 bytes, which is starting to get interesting.· That would be a 2:1 compression.
It really starts getting interesting if you can combine multiple instructions into a single psuedo-code.· This will get you get you a lot more compression.· The trick is to make it·quick and easy·to decode.
Dave
Post Edited (Dave Hein) : 2/25/2010 2:20:19 PM GMT
Maybe I don't get it exactly either, but here goes:
So basically, a 3 byte instruction could be used where 6 bits of the 8 bit opcode reference a PASM instruction? Byte 2 and 3 are the destination and source registers as R0...R255 (or a subset) with the balance of COG registers for stack/interpreter.
It seems that ZC should always be affected and RI can be selected. Of course, one loses IF_Z etc... conditionals, but is that really important? One could use TJZ and TJNZ for branching BRZ/BRNZ.
Another approach would be a 2 byte instruction with R0..15 (mapped in COG) selected for destination and source registers. That might be slower, but it would be more compact.
How easy is it to build a prototype? Getting this group to "bless" even the best ideas and intentions is difficult. Do what you like with your time ... you don't need anyone here to help you explore this.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
For me, the past is not over yet.