*******************************************************************************
*                                                                             *
*                          Spin Bytecode Definition                           *
*                                                                             *
*******************************************************************************

This is a very rough first draft of the description of the Spin bytecode. It
is simple in essence but not quite so simple to explain in words.

The opcodes reflect those generated by the AiChip Disassembler for the
Propeller Chip ( PropView.exe ), where they are presented using mnemonics and
in a format which will be familiar to users of other processor architectures:
an opcode optionally followed by one or more operands.

The opcodes have been determined by studying the contents of various images of
bytecode, and the effort has built upon the preliminary work done by Robert
Bryon Vandiver ( Gear ) and Cliff L Biffle. In the process the naming of
opcodes has diverged from their schemes and some missing information has been
filled in.

This document describes the basic opcodes on a one-to-one basis. The AiChip
Disassembler can fold opcodes and operands during its analysis to create even
more user-friendly higher-level opcodes ( CALLSUB and CALLFUN etc ) which can
be expanded by a suitable bytecode assembler to produce the more complex basic
bytecode structure. These higher-level opcodes are not currently discussed in
this document.

The document has been developed without guidance from Parallax Inc, is
incomplete, and may even be in error in some respects.

*******************************************************************************
*                                                                             *
*                                Top overview                                 *
*                                                                             *
*******************************************************************************

All opcodes are a single byte -

    00-3F   Special purpose opcodes
    40-7F   Fast access VAR and LOC
    80-DF   Access MEM, OBJ, VAR and LOC
    E0-FF   Unary and Binary operators

*******************************************************************************
*                                                                             *
*                                More detailed                                *
*                                                                             *
*******************************************************************************

There are 64 special purpose opcodes, with the bytecode instructions in the
following format -

    .---.---.---.---.---.---.---.---.
    | 0 | 0 | o   o   o   o   o   o |
    `---^---^---^---^---^---^---^---'

    00 FRAME   | 10 LOOKUP  | 20 CLKSET      | 30 ABORT
    01 FRAME   | 11 LOOKDN  | 21 COGSTOP     | 31 ABOVAL
    02 FRAME   | 12 LOOKUPR | 22 LRETSUB     | 32 RETURN
    03 FRAME   | 13 LOOKDNR | 23 WAITCNT     | 33 RETVAL
    04 GOTO    | 14 QUIT ?  | 24 PUSH SPR[]  | 34 PUSH #-1
    05 CALL    | 15 MARK    | 25 POP SPR[]   | 35 PUSH #0
    06 CALLOBJ | 16 STRSIZE | 26 USING SPR[] | 36 PUSH #1
    07 CALLOBJ | 17 STRCOMP | 27 WAITVID     | 37 PUSH #kp
    08 LOOPJPF | 18 BYTEFIL | 28 COGIFUN     | 38 PUSH #k1
    09 LOOPRPT | 19 WORDFIL | 29 LNEWFUN     | 39 PUSH #k2
    0A JPF     | 1A LONGFIL | 2A LSETFUN     | 3A PUSH #k3
    0B JPT     | 1B WAITPEQ | 2B LCLRFUN     | 3B PUSH #k4
    0C GOTO[]  | 1C BYTEMOV | 2C COGISUB     | 3C ?
    0D CASE    | 1D WORDMOV | 2D LNEWSUB     | 3D REG r[]
    0E CASER   | 1E LONGMOV | 2E LSETSUB     | 3E REG r[..]
    0F LOOKEND | 1F WAITPNE | 2F LCLRSUB     | 3F REG r

There are 64 fast access stack load / save opcodes, with the bytecode
instructions in the following format -

              .------------------------ 0 = VAR
              |                         1 = LOC
              |
              |       .---------------- Address ( addr = v * 4 )
              |       |
              |       |         .------ 0 0 = PUSH
              |       |         |       0 1 = POP
              |       |         |       1 0 = USING
              |       |         |       1 1 = PUSH #
              |       |         |
    .---.---.---.---.---.---.---.---.
    | 0   1 | w | v   v   v | o   o |
    `---^---^---^---^---^---^---^---'

There are 128 stack load / save opcodes, with the bytecode instructions in the
following format -

            .-------------------------- 0 0 = Byte
            |                           0 1 = Word
            |                           1 0 = Long
            |                           1 1 = Unused - Unary / Binary operators
            |
            |     .-------------------- 1 = []
            |     |
            |     |     .-------------- 0 0 = MEM
            |     |     |               0 1 = OBJ
            |     |     |               1 0 = VAR
            |     |     |               1 1 = LOC
            |     |     |
            |     |     |       .------ 0 0 = PUSH
            |     |     |       |       0 1 = POP
            |     |     |       |       1 0 = USING
            |     |     |       |       1 1 = PUSH #
            |     |     |       |
    .---.---.---.---.---.---.---.---.
    | 1 | s   s | i | w   w | o   o |
    `---^---^---^---^---^---^---^---'

There are 32 binary and unary operator opcodes, with the bytecode instructions
in the following format -

    .---.---.---.---.---.---.---.---.
    | 1   1   1 | o   o   o   o   o |
    `---^---^---^---^---^---^---^---'

    E0 ROR     | E8 BIT_AND | F0 LOG_AND | F8 SQRT
    E1 ROL     | E9 ABS     | F1 ENCODE  | F9 LT
    E2 SHR     | EA BIT_OR  | F2 LOG_OR  | FA GT
    E3 SHL     | EB BIT_XOR | F3 DECODE  | FB NE
    E4 MIN     | EC ADD     | F4 MPY     | FC EQ
    E5 MAX     | ED SUB     | F5 MPY_MSW | FD LE
    E6 NEG     | EE SAR     | F6 DIV     | FE GE
    E7 BIT_NOT | EF BIT_REV | F7 MOD     | FF LOG_NOT

*******************************************************************************
*                                                                             *
*                                Very Detailed                                *
*                                                                             *
*******************************************************************************

There are 64 special purpose opcodes, with the bytecode instructions in the
following format -

    .---.---.---.---.---.---.---.---.
    | 0 | 0 | o   o   o   o   o   o |
    `---^---^---^---^---^---^---^---'

    00 FRAME        CALL WITH RETURN VALUE
    01 FRAME        CALL WITHOUT RETURN VALUE
    02 FRAME        CALL WITH ABORT TRAP
    03 FRAME        CALL IGNORING ABORT TRAP

        The FRAME opcodes indicate how the stack should be adjusted after a
        subsequent call is made. The bottom two bits indicate the type of
        adjustment to be made. The exact mechanism used for stack framing is
        not known by the author and has not been fully analysed, but it
        appears that CALL and CALLOBJ will always return two long values on
        the stack, a return value and a trap value.
        Bit 0 of the opcode, when set, indicates that the return value is to
        be popped and discarded after the call; bit 1, when set, indicates
        that the abort value is to be popped and discarded after the call.

    04 GOTO

        The GOTO opcode will always branch to an address specified.

        The GOTO opcode is followed by an Address Offset, indicating how many
        bytes to branch ( plus or minus ) from the address of the Spin opcode
        which follows the GOTO opcode.

    05 CALL c

        The CALL opcode will temporarily branch to an address specified before
        execution continues after the CALL.

        The CALL opcode is followed by a Call Offset. The call to an actual
        routine is done via a vector table at the start of the object which
        points to the location of the routine to be called. The CALL opcode
        causes a branch through that vector table.

    06 CALLOBJ pu
    07 CALLOBJ p[]u

        .. more ...

    08 LOOPJPF a

        .. more ...

    09 LOOPRPT a

        .. more ...

    0A JPF a

        The JPF opcode will pop the top of stack and branch to the address
        specified when the value popped is zero, otherwise execution will
        continue at the next Spin opcode.

        The JPF opcode is followed by an Address Offset, indicating how many
        bytes to branch ( plus or minus ) from the address of the Spin opcode
        which follows the JPF opcode.

    0B JPT a

        The JPT opcode will pop the top of stack and branch to the address
        specified when the value popped is non-zero, otherwise execution will
        continue at the next Spin opcode.

        The JPT opcode is followed by an Address Offset, indicating how many
        bytes to branch ( plus or minus ) from the address of the Spin opcode
        which follows the JPT opcode.

    0C GOTO[]

        .. more ...

    0D CASE a

        .. more ...

    0E CASER a

        .. more ...

    0F LOOKEND

        .. more ...

    10 LOOKUP

        .. more ...

    11 LOOKDN

        .. more ...

    12 LOOKUPR

        .. more ...

    13 LOOKDNR

        .. more ...

    14 QUIT ?

        .. more ...

    15 MARK

        .. more ...

    16 STRSIZE

        .. more ...

    17 STRCOMP

        .. more ...

    18 BYTEFIL
    19 WORDFIL
    1A LONGFIL

        .. more ...

    1B WAITPEQ

        .. more ...

    1C BYTEMOV
    1D WORDMOV
    1E LONGMOV

        .. more ...

    1F WAITPNE

        .. more ...

    20 CLKSET

        .. more ...

    21 COGSTOP

        .. more ...

    22 LRETSUB

        .. more ...

    23 WAITCNT

        .. more ...

    24 PUSH SPR[]

        .. more ...

    25 POP SPR[]

        .. more ...

    26 USING SPR[]

        .. more ...

    27 WAITVID

        .. more ...

    28 COGIFUN

        .. more ...

    29 LNEWFUN

        .. more ...

    2A LSETFUN

        .. more ...

    2B LCLRFUN

        .. more ...

    2C COGISUB

        .. more ...

    2D LNEWSUB

        .. more ...

    2E LSETSUB

        .. more ...

    2F LCLRSUB

        .. more ...

    30 ABORT

        .. more ...

    31 ABOVAL

        On return from a CALL, an abort value will always be returned. The
        ABOVAL opcode indicates that the value on top of the stack is the
        value which should be returned.

    32 RETURN

        On return from a CALL, a return value will always be returned. The
        RETURN opcode indicates that the value which was placed at the bottom
        of the stack when called ( LOC+0.LONG ) should be returned.

    33 RETVAL

        On return from a CALL, a return value will always be returned. The
        RETVAL opcode indicates that the value on top of the stack is the
        value which should be returned.

    34 PUSH #-1

        The long word $FFFFFFFF is pushed to the stack.

    35 PUSH #0

        The long word $00000000 is pushed to the stack.

    36 PUSH #1

        The long word $00000001 is pushed to the stack.

    37 PUSH #kp

        The single byte following the PUSH opcode indicates the value of a
        32-bit number to push to the stack. The lower five bits ( bit 0 to
        bit 4 ) of the operand byte are used to specify a power-of-two number
        ( 0 = $00000001, 1 = $00000002 through to $1F = $80000000 ). If bit 5
        is set, that number is decremented. If bit 6 is set, that number is
        inverted. The purpose of bit 7 is not clear; the author has not found
        it set in any testing so far. It may have some significance for
        negative numbers where the msb's may all be 1's.

    38 PUSH #k1

        The byte following the PUSH opcode is taken as the 8-lsb's of a long
        number to push to the stack. The top 24-bits are zeroed.

    39 PUSH #k2

        The two bytes following the PUSH opcode are taken as the 16-lsb's of a
        long number to push to the stack. The top 16-bits are zeroed.
        The 16-bit number is stored after the PUSH opcode MSB first.

    3A PUSH #k3

        The three bytes following the PUSH opcode are taken as the 24-lsb's of
        a long number to push to the stack. The top 8-bits are zeroed. The
        24-bit number is stored after the PUSH opcode MSB first.

    3B PUSH #k4

        The four bytes following the PUSH opcode are taken as a 32-bit long
        number to push to the stack. The 32-bit number is stored after the
        PUSH opcode MSB first.

    3C ?

        The purpose of this opcode has not yet been identified. It has not
        been present in any Spin bytecode analysed by the author.

    3D REG r[]

        .. more ...

    3E REG r[..]

        .. more ...

    3F REG r

        .. more ...

There are 64 fast access stack load / save opcodes, with the bytecode
instructions in the following format -

              .------------------------ 0 = VAR
              |                         1 = LOC
              |
              |       .---------------- Address ( addr = v * 4 )
              |       |
              |       |         .------ 0 0 = PUSH
              |       |         |       0 1 = POP
              |       |         |       1 0 = USING
              |       |         |       1 1 = PUSH #
              |       |         |
    .---.---.---.---.---.---.---.---.
    | 0   1 | w | v   v   v | o   o |
    `---^---^---^---^---^---^---^---'

These opcodes give fast access to the first few long entries in the variable
space or stack by encoding a long access as a single byte opcode. The single
byte opcodes are effectively expanded to the following within the interpreter.
The size bits of the expanded opcode are shown as 1 0 ( Long ) to agree with
the long access described here -

    .---.---.---.---.---.---.---.---.
    | 0   1 | w | v   v   v | o   o |
    `---^---^---^---^---^---^---^---'
              |       |         |
              |       `---------|------------------------.
              |                 |                        |
              `-----------.     |                        |
                         \|/   \|/                      \|/
    .---.---.---.---.---.---.---.---.  .---.---.---.---.---.---.---.---.
    | 1 | 1   0 | 0 | 1   w | o   o |  | 0   0   0 | v   v   v | 0   0 |
    `---^---^---^---^---^---^---^---'  `---^---^---^---^---^---^---^---'

The consequential operation of those expanded bytecodes is described below.

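The expansion described above can be sketched in a few lines of Python. This
is only an illustration, not the interpreter's actual code: the function name
expand_fast_opcode is hypothetical, and the size bits of the expanded opcode
are assumed to be 1 0 ( Long ) because the text describes these opcodes as
long accesses.

```python
def expand_fast_opcode(opcode):
    """Expand a fast access opcode ($40-$7F) into (general opcode, offset).

    Hypothetical helper; the Long size bits are an assumption based on the
    text describing these opcodes as long accesses.
    """
    if not 0x40 <= opcode <= 0x7F:
        raise ValueError("not a fast access opcode: $%02X" % opcode)

    w = (opcode >> 5) & 1   # 0 = VAR, 1 = LOC
    v = (opcode >> 2) & 7   # index of the long being accessed
    o = opcode & 3          # 00 PUSH, 01 POP, 10 USING, 11 PUSH #

    size_long = 0b10        # assumed: s s = 1 0 ( Long )
    base = 0b10 | w         # w w = 1 0 ( VAR ) or 1 1 ( LOC )

    # 1 | s s | i | w w | o o   with i = 0 ( not indexed )
    general = 0x80 | (size_long << 5) | (base << 2) | o
    offset = v << 2         # 0 0 0 | v v v | 0 0, that is v * 4
    return general, offset


if __name__ == "__main__":
    for op in (0x40, 0x65, 0x7F):
        general, offset = expand_fast_opcode(op)
        print("$%02X -> $%02X $%02X" % (op, general, offset))
```

For example, $65 has w = 1, v = 1 and o = 0 1, so under these assumptions it
expands to a POP of the long at LOC+4.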
There are 128 stack load / save opcodes, with the bytecode instructions in the
following format -

            .-------------------------- 0 0 = Byte
            |                           0 1 = Word
            |                           1 0 = Long
            |                           1 1 = Unused - Unary / Binary operators
            |
            |     .-------------------- 1 = []
            |     |
            |     |     .-------------- 0 0 = MEM
            |     |     |               0 1 = OBJ
            |     |     |               1 0 = VAR
            |     |     |               1 1 = LOC
            |     |     |
            |     |     |       .------ 0 0 = PUSH
            |     |     |       |       0 1 = POP
            |     |     |       |       1 0 = USING
            |     |     |       |       1 1 = PUSH #
            |     |     |       |
    .---.---.---.---.---.---.---.---.
    | 1 | s   s | i | w   w | o   o |
    `---^---^---^---^---^---^---^---'

These opcodes are always followed by a Variable Address Offset which indicates
the variable's offset from a particular base, as identified by bits 2 and 3 in
the opcode.

MEM is the global memory, with offset zero being the first entry in the 32KB
Ram.

OBJ is an offset from where the main program or a sub-object has been placed
within Ram.

VAR is an offset from where the global memory from the main program or a
sub-object has been placed in Ram.

LOC is an offset from the base of the stack as it was when the routine was
entered by CALL or CALLOBJ.

The size of the data entity being dealt with ( byte, word or long ) is
indicated by bits 5 and 6 of the opcode.

Bit 4 is an 'indexed' indicator. When set, a long value is popped from the
stack and added to the variable offset specified to give an effective address
for the variable to be used. This is used for array indexing and access.

PUSH pushes the contents of the variable indicated by the effective address to
the stack as a long.

POP pops a long from the stack and stores it as an appropriately sized data
entity in the variable at the effective address.

PUSH # pushes the effective address to the stack as a long.

USING is the most complex form of the opcode range. The effective address of
the variable data to be operated upon is determined, and a subsequent Using
Modifier byte determines what operation(s) should be applied to the variable.
Using modifiers are described later in this document.
They notably include pre and post increment and decrement capabilities.

There are 32 binary and unary operator opcodes, with the bytecode instructions
in the following format -

    .---.---.---.---.---.---.---.---.
    | 1   1   1 | o   o   o   o   o |
    `---^---^---^---^---^---^---^---'

Operators are single byte opcodes without any operand. They pop one ( unary
operators ) or two ( binary operators ) longs from the stack, apply their
function and push the resultant long back to the stack.

In the following descriptions, 1st is the first or only long pushed, and 2nd
is the second long pushed before the opcode is executed.

    E0 ROR          1st -> 2nd
    E1 ROL          1st <- 2nd
    E2 SHR          1st >> 2nd
    E3 SHL          1st << 2nd
    E4 MIN          1st #> 2nd
    E5 MAX          1st <# 2nd
    E6 NEG          - 1st           Unary
    E7 BIT_NOT      ! 1st           Unary
    E8 BIT_AND      1st & 2nd
    E9 ABS          ABS( 1st )      Unary
    EA BIT_OR       1st | 2nd
    EB BIT_XOR      1st ^ 2nd
    EC ADD          1st + 2nd
    ED SUB          1st - 2nd
    EE SAR          1st ~> 2nd
    EF BIT_REV      1st >< 2nd
    F0 LOG_AND      1st AND 2nd
    F1 ENCODE       >| 1st          Unary
    F2 LOG_OR       1st OR 2nd
    F3 DECODE       |< 1st          Unary
    F4 MPY          1st * 2nd
    F5 MPY_MSW      1st ** 2nd
    F6 DIV          1st / 2nd
    F7 MOD          1st // 2nd
    F8 SQRT         ^^ 1st          Unary
    F9 LT           1st < 2nd
    FA GT           1st > 2nd
    FB NE           1st <> 2nd
    FC EQ           1st == 2nd
    FD LE           1st =< 2nd
    FE GE           1st => 2nd
    FF LOG_NOT      NOT 1st         Unary

*******************************************************************************
*                                                                             *
*                                  Operands                                   *
*                                                                             *
*******************************************************************************

Variable Address Offset
-----------------------

A Variable Address Offset is one or two bytes forming a positive offset from
the base of the memory area specified by the opcode it is used with. The
offset can be determined as follows -

    If ( first & $80 ) == 0 Then
      offset = first                              ' One byte
    Else
      offset = ( first & $7F ) << 8 | second      ' Two byte
    End If

Address Offset
--------------

An Address Offset is one or two bytes forming a positive or negative offset
from the location which follows the Address Offset.
The offset can be determined as follows -

    If ( first & $80 ) == 0 Then
      If ( first & $40 ) == 0 Then
        offset = first                                     ' One byte, +Ve offset
      Else
        offset = first - $80                               ' One byte, -Ve offset
      End If
    Else
      If ( first & $40 ) == 0 Then
        offset = ( first & $7F ) << 8 | second             ' Two byte, +Ve offset
      Else
        offset = (( first & $7F ) << 8 | second ) - $8000  ' Two byte, -Ve offset
      End If
    End If

There are undoubtedly other mathematical operations on the bytes which can be
used to determine the offset.

Call Offset
-----------

.. more ...

Call Object Offset
------------------

.. more ...

Using Modifiers
---------------

The USING opcodes simply specify the address of a variable to be processed and
its data size. Following a USING opcode and any addressing information, a
single byte modifier is given. If the msb of that byte is set then the
variable's value will be pushed after applying the modifier. The lower 7 bits
determine the operation to be performed.

00-07 Special operations

    00 COPY
    01 ?
    02 RPTINCJ
    03 ?
    04 ?
    05 ?
    06 RPTADDJ
    07 COPY

    .. more ...

    RPTINCJ / RPTADDJ

        These are followed by an Address Offset. They appear as modifiers
        within a Repeat-From-To construct and are used to either continue the
        repeat loop or to exit from it. RPTINCJ increments the variable by
        one; RPTADDJ pops a value from the stack and adds it to the variable.
        The variable's value is then compared with the start and end values
        popped from the stack and, if in range, a jump is taken to the address
        specified ( that is, the Repeat-From-To continues ), otherwise
        execution continues at the next opcode.

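RPTINCJ and RPTADDJ, like GOTO, JPF and JPT, carry an Address Offset operand.
The decoding rule given for Address Offsets can be sketched in Python as
follows. The function names and the use of a bytes object to hold the bytecode
image are assumptions made for illustration only.

```python
def decode_address_offset(data, pos):
    """Decode the Address Offset starting at data[pos].

    Returns (offset, size), where size is the number of operand bytes used.
    Hypothetical helper following the rule described in this document.
    """
    first = data[pos]
    if not first & 0x80:
        # One byte: a 7-bit value, with bit 6 acting as the sign bit.
        offset = first - 0x80 if first & 0x40 else first
        return offset, 1

    # Two bytes: a 15-bit value, with bit 6 of the first byte as the sign bit.
    second = data[pos + 1]
    offset = ((first & 0x7F) << 8) | second
    if first & 0x40:
        offset -= 0x8000
    return offset, 2


def branch_target(data, pos):
    """Return the address branched to, for an Address Offset at data[pos].

    The offset is taken relative to the location following the operand.
    """
    offset, size = decode_address_offset(data, pos)
    return pos + size + offset


if __name__ == "__main__":
    image = bytes([0x35, 0x35, 0x35, 0x04, 0x7C])  # ... GOTO with offset -4
    print(decode_address_offset(image, 4))
    print(branch_target(image, 4))
```

Returning the operand size alongside the offset lets a disassembler step over
the operand without decoding it a second time.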
08-3F Unary operations ( bottom two lsb's ignored )

    08 FWDRAND
    0C REVRAND
    10 SEXBYTE      Sign extend byte
    14 SEXWORD      Sign extend word
    18 POSTCLR
    1C POSTSET
    20 PREINC
    24 PREINC
    28 POSTINC
    2C POSTINC
    30 PREDEC
    34 PREDEC
    38 POSTDEC
    3C POSTDEC

40-5F Binary or Unary operation

    40 ROR     | 48 BIT_AND | 50 LOG_AND | 58 SQRT
    41 ROL     | 49 ABS     | 51 ENCODE  | 59 LT
    42 SHR     | 4A BIT_OR  | 52 LOG_OR  | 5A GT
    43 SHL     | 4B BIT_XOR | 53 DECODE  | 5B NE
    44 MIN     | 4C ADD     | 54 MPY     | 5C EQ
    45 MAX     | 4D SUB     | 55 MPY_MSW | 5D LE
    46 NEG     | 4E SAR     | 56 DIV     | 5E GE
    47 BIT_NOT | 4F BIT_REV | 57 MOD     | 5F LOG_NOT

60-7F Unknown - Not encountered by the author
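
As an illustration of how the modifier ranges above fit together, the
following Python sketch classifies a Using Modifier byte. The function name
and the table names are hypothetical; modifiers this document marks as unknown
are returned as "?".

```python
# Names for the 40-5F range, in the same order as the E0-FF operator opcodes.
OPERATORS = [
    "ROR", "ROL", "SHR", "SHL", "MIN", "MAX", "NEG", "BIT_NOT",
    "BIT_AND", "ABS", "BIT_OR", "BIT_XOR", "ADD", "SUB", "SAR", "BIT_REV",
    "LOG_AND", "ENCODE", "LOG_OR", "DECODE", "MPY", "MPY_MSW", "DIV", "MOD",
    "SQRT", "LT", "GT", "NE", "EQ", "LE", "GE", "LOG_NOT",
]

# 08-3F range, keyed with the bottom two bits cleared.
UNARY = {
    0x08: "FWDRAND", 0x0C: "REVRAND", 0x10: "SEXBYTE", 0x14: "SEXWORD",
    0x18: "POSTCLR", 0x1C: "POSTSET", 0x20: "PREINC", 0x24: "PREINC",
    0x28: "POSTINC", 0x2C: "POSTINC", 0x30: "PREDEC", 0x34: "PREDEC",
    0x38: "POSTDEC", 0x3C: "POSTDEC",
}

# 00-07 range; entries not identified in this document are omitted.
SPECIAL = {0x00: "COPY", 0x02: "RPTINCJ", 0x06: "RPTADDJ", 0x07: "COPY"}


def decode_using_modifier(modifier):
    """Return (name, push) for a Using Modifier byte.

    push is True when the msb is set, meaning the variable's value is pushed
    after the modifier has been applied.
    """
    push = bool(modifier & 0x80)
    op = modifier & 0x7F
    if op <= 0x07:
        name = SPECIAL.get(op, "?")
    elif op <= 0x3F:
        name = UNARY[op & 0x3C]        # bottom two lsb's ignored
    elif op <= 0x5F:
        name = OPERATORS[op - 0x40]
    else:
        name = "?"                     # 60-7F not encountered by the author
    return name, push


if __name__ == "__main__":
    for byte in (0x4C, 0xA8, 0x02, 0x60):
        print("$%02X" % byte, decode_using_modifier(byte))
```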