The JavaProp project is a collaboration of engineers designing a Java Virtual Machine (JVM) for the Propeller multi-core CPU.
Java is one of the best software languages available. It is like industry standard C/C++ in many ways, but offers features which ease design and maintenance. Java is the introductory language of choice in university settings today for learning computer programming and object oriented design.
The Javelin Stamp by Parallax enjoys a modest following in its original form. Users of the product today face a potential product end of life, as it is not being actively marketed. A compatible product with potentially higher performance and a lower price would be attractive for these users.
A fully Java compliant product would also be highly desirable. Some drawbacks of the Javelin solution are its lack of dynamic memory management and its relatively low performance. Memory management can be implemented, and performance can be improved.
Two Phase Design
Phase 1 Javelin Compatible Feature Goals
Phase 2 Java Fully Compliant Feature Goals
An unfortunate reality is that the current Propeller based product does not have enough built-in memory to allow loading a 32K Java bytecode image as was available in Javelin (an external memory was used for this). This memory shortage would limit programs being used on the JVM to fairly small Java programs even with the proposed smaller footprint Phase 1 design. Perhaps the next generation Propeller will have more memory.
Common Requirements
These design requirements apply to both Phase 1 and 2.
Phase 1 Javelin Compatible Requirements
Phase 2 Full Java 2 Compliant Requirements
Javelin IDE or JIDE Requirements
The Javelin IDE will be able to control JavaProp in a way similar to, if not the same as, the way it controls the JavelinStamp. This places certain requirements on the JVM design that are described here.
X.0 Introduction
The Parallax, Inc. Propeller micro-controller is an eight-core CPU with up to 32K bytes of shared memory and 2K bytes of individually accessible memory per core, for a full memory contingent of 48K bytes. The Propeller, otherwise known as the Prop, has 32 I/O pins and runs at up to 80MHz in a pipelined 32-bit architecture.
The initial software tool set includes an IDE that provides editing of .spin programs. Spin programs can contain both code for the proprietary Spin language interpreter and assembly in one file. Spin is a pseudo object oriented language where modules are included and public functions are accessed using “dot” method calls and other accesses.
As of this writing (Feb 2008) a C-compiler is in progress, but not yet available. Peter Vekaik suggested adapting the Javelin Stamp Java tool-set and JVM on Propeller. This specification details the design of this JavaProp product.
Various resources were consulted for researching the JVM.
Bill Venners offers some great JVM implementation theory advice in his book and web pages “Inside the Java Virtual Machine”[2]. Other references include Sun’s “Java VM Spec”[3]. Much of the original bytecode description text comes from a Peter Norton JVM page[4], but it appears to be missing for now. Wikipedia[5] also has much information on the JVM, including a list of bytecodes and descriptions.
X.0 Scope
This design document specifies details of the JVM for Propeller. The separate Java IDE and linker from the Javelin Stamp are also being modified for use with the product and are discussed briefly below, but this document focuses primarily on the JVM.
X.0 Language Overview
Propeller SPIN Overview
SPIN is a programming language designed by Parallax, Inc. for its Propeller multicore embedded micro-controller. SPIN files can contain SPIN language and PASM assembly in one file for programming the Propeller.
The SPIN language is pseudo-object oriented in that objects are defined in separate files which contain public and private methods/functions and data. SPIN is object-oriented except for a short, fairly annoying list of missing features that would be found in most object-oriented languages.
Things you cannot do with SPIN:
Propeller PASM Overview
Java Overview
Java is a fully object oriented language that is the choice of many developers and educators. A Java compiler creates class files which contain data structures and bytecodes that can be executed on a virtual machine. Bytecodes are similar to machine language for CPU hardware.
JVM Overview
The JVM in an embedded system executes bytecode from a single class file. In an application like Internet Explorer or Mozilla (Netscape Navigator), one or more JVMs can run applications and/or applets using multiple class files.
The embedded system JVM has an easier task loading class files compared to a PC’s JVM. Many shortcuts can also be taken to give an embedded JVM a smaller footprint than a more generic PC JVM; this makes products like JavelinStamp, Tini, and JavaProp, among others, possible. Several small footprint JVM sources are available on the net including TinyVM, NanoVM, and SimpleRTJ. Many hands have carved the path before us.
X.0 Phase 1 JVM Design
JIDE Interaction
The JVM front-end interface module to the JIDE program will allow JIDE to control single-step debugging of the JVM controller at Java source level. The JVM front-end interface is essentially a command-line interpreter and dispatcher. To do its job, the front-end interface needs access to JVM information. The JVM public API below will be made available for the front-end interface module to use for JVM control/status.
JIDE API
PUB jvmStepInitialization
The “main” .spin routine initializes debugging preferences and JVM mode and calls JvmSpinLoop, which initializes the JVM via InvokeProgram and runs the embedded Java class represented by jem.bin in the “jem” spin object file.
InvokeProgram loads key parameters from the jem.bin header area, including the bytecodeStart address, which comes from the “jem” spin object file. Key class file header fields are numStaticFields, constantTable, mainMethodBase, and the string table, which is loaded separately in the InitConstantTable call. The JVM attempts to use the architecture set up by the chip version combined with a type override.
Before calling the main method, the JVM calls InitConstantTable. This function builds an array of “String” objects using the constant table in the bytecode. The array of string objects is kept globally in variable gConstTable to save space rather than having a copy in each object instance; this is possible because the table is read only.
The global constant table array of string objects, like all arrays in the JVM, is allocated from the heap using the GetFromHeap method. The constant table size is automatically derived relying on a bytecode table containing descriptor offsets and the first table entry pointing to the first descriptor. This of course assumes that the last table entry and first descriptor address are contiguous.
After getting storage for the constant table, each entry is built by getting a “new” class for each string, getting string pointer storage space, and initializing the string object fields. The string data is copied from the constant table descriptor, and each object is finally set in its position in the global constant table array.
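The size derivation and table construction described above can be modeled as follows. This is a minimal Python sketch, not the Spin implementation. The 2-byte big-endian absolute offsets, the length-prefixed ASCII descriptor layout, and the names `build_const_table` and `ENTRY_SIZE` are assumptions made for illustration only; the real jem.bin layout may differ.

```python
import struct

ENTRY_SIZE = 2  # assumed: each table entry is a 2-byte big-endian offset


def build_const_table(image: bytes, table_start: int) -> list:
    """Build the list of constant strings from a bytecode image.

    Assumed layout: a table of offsets at table_start, immediately
    followed by the descriptors it points to. Each descriptor is a
    1-byte length followed by that many ASCII bytes.
    """
    # The first entry points at the first descriptor. Because the
    # descriptors are assumed to start right after the last table
    # entry, that offset also marks the end of the table.
    (first_desc,) = struct.unpack_from(">H", image, table_start)
    count = (first_desc - table_start) // ENTRY_SIZE

    table = []
    for i in range(count):
        (offset,) = struct.unpack_from(">H", image, table_start + i * ENTRY_SIZE)
        length = image[offset]
        table.append(image[offset + 1 : offset + 1 + length].decode("ascii"))
    return table


# Two strings, "hi" and "jvm", with the offset table starting at 0.
image = bytes([0, 4, 0, 7]) + bytes([2]) + b"hi" + bytes([3]) + b"jvm"
const_table = build_const_table(image, 0)
```

If the table and the first descriptor were not contiguous, the computed count would be too large, which is why the contiguity assumption matters.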
The main method is the first method called. The embedded system does not allow for passing arguments to main, so one step normally performed is eliminated. Calling main is mostly like calling any other method and is managed by the CallMethod function. Before any method can be called, the “method table” in the bytecode is consulted for key items such as the method bytecode address, the number of parameters being passed to the method, the number of local variables the method will use, and the address of the default exception handler, which appears to be always zero.
Before setting the PC to run a method, a stack “frame” is created for parameters and local variables. A key difference in running a method is whether it is a native or a JVM method. If the native bit is not set, the PC is set to the method address (+bytecode start), and the JVM processes the method’s bytecode. If the native bit is set, JVM methods representing the native machine functions will be called (the JVM does not own stack responsibility for native methods).
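The frame setup and PC selection can be sketched roughly as below. This is a Python model under stated assumptions, not the CallMethod code itself: the `MethodEntry` and `Frame` types, the `call_method` name, and the choice to count locals separately from parameters are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class MethodEntry:
    address: int          # method bytecode offset relative to bytecode start
    num_params: int
    num_locals: int       # assumed to exclude the parameters
    native: bool = False  # models the "native bit"


@dataclass
class Frame:
    return_pc: int
    locals: list = field(default_factory=list)


def call_method(entry, stack, frames, pc, bytecode_start):
    """Create a frame for the callee and return the new PC."""
    if entry.native:
        # Native methods manage the stack themselves, so no frame is
        # pushed here and the PC is left where it was.
        return pc
    # Pop parameters off the operand stack, restoring call order.
    params = [stack.pop() for _ in range(entry.num_params)][::-1]
    frames.append(Frame(return_pc=pc, locals=params + [0] * entry.num_locals))
    return bytecode_start + entry.address


stack, frames = [10, 20], []
entry = MethodEntry(address=0x40, num_params=2, num_locals=1)
new_pc = call_method(entry, stack, frames, pc=5, bytecode_start=0x100)
```

Keeping the return PC in the frame lets a later return simply pop the frame and resume the caller.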
During a method call, any valid bytecode operator and associated operands must be correctly handled by the JVM. Today if something bad happens, the JVM will call the Crashed method with a string parameter indicating the error. This is fine for early development, but normal exception handling must be applied. Exception table handling for methods needs some study.
Once a method call completes, the return value if used will be placed on the stack. This and the “void” return case are handled by ReturnFromMethodWithValue and ReturnFromMethod.
Heap
The heap is designed with simplicity rather than compactness in mind. This is a fair trade-off for a first cut until development matures. At some point, heap management and APIs could be changed to make memory use somewhat more efficient if necessary.
The heap allocator function requires a type and length, which makes heap management relatively easy. It allocates pointers from the heap in a downward fashion (each newly allocated pointer value will be less than the last). The type and length variables define how much of the heap is used and can be combined with the type address to get the next heap entry. An entry is allocated if the MSB of type is clear. An entry is free if the MSB of type is set. It would be possible to use first-fit allocation if the heap was scrubbed.
Users allocate memory with the GetFromHeap method. Users get pointers to data using the GetHeapInfo method. GetHeapInfo requires a previously allocated pointer as an input, and delivers type, length, and data as outputs via variable references. The GetHeapInfo type and length parameters are optional, and passing address 0 for them is harmless.
Some convenience methods are provided for accessing heap data, since it is easy for the casual code reader to misconstrue these details. GetHeapValue and GetHeapDataValue return data values based on the heap entry’s accounting. SetHeapValue and SetHeapDataValue set values through the pointer data.
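The downward allocation and the type/length accounting can be illustrated with a small Python model. It is a sketch only: the two-word header, the 16-bit type word with `FREE_BIT` as its MSB, and the `Heap` class with its method names are assumptions chosen to mirror GetFromHeap and GetHeapInfo, not the actual Spin code.

```python
FREE_BIT = 0x8000   # assumed 16-bit type word; MSB set marks a free entry
HEADER_WORDS = 2    # assumed header layout: [type, length]


class Heap:
    def __init__(self, size):
        self.mem = [0] * size
        self.top = size  # allocation grows downward from the end

    def get_from_heap(self, type_id, length):
        """Allocate length data words; each new pointer is lower than the last."""
        ptr = self.top - (HEADER_WORDS + length)
        if ptr < 0:
            raise MemoryError("heap exhausted")
        self.mem[ptr] = type_id & ~FREE_BIT  # MSB clear: allocated
        self.mem[ptr + 1] = length
        self.top = ptr
        return ptr

    def get_heap_info(self, ptr):
        """Return (type, length, index of the first data word)."""
        return self.mem[ptr] & ~FREE_BIT, self.mem[ptr + 1], ptr + HEADER_WORDS

    def release(self, ptr):
        self.mem[ptr] |= FREE_BIT  # MSB set: free

    def is_free(self, ptr):
        return bool(self.mem[ptr] & FREE_BIT)

    def entries(self):
        """Walk entries from the newest (lowest address) to the oldest."""
        ptr = self.top
        while ptr < len(self.mem):
            yield ptr
            # Header size plus length gives the address of the next entry.
            ptr += HEADER_WORDS + self.mem[ptr + 1]


heap = Heap(32)
a = heap.get_from_heap(1, 4)
b = heap.get_from_heap(2, 3)
heap.release(a)
```

A first-fit allocator could reuse the `entries` walk to look for a free entry of sufficient length before lowering `top`.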
The heap is used for storing object and array references. The stack is used for stack frames, bytecode manipulations, and method variables.
Stack
Data Structures
The main data structures used in the JVM are defined by the bytecode class file. The class header defines a data structure that could be “framed” in a C-style struct if such was available. The class and method table defines descriptors for all classes and their methods. The constant table has a table index of descriptor offsets and descriptors that are used to populate an array of String objects.
One data structure we have some flexibility in defining is the JVM Object. Pointers are provided in the object for classid (the class descriptor definition bytecode offset), the superclass id (the descriptor definition bytecode offset of the parent of the object), and field array pointer. These members are defined at object “new” time. The field array size is defined by the number of fields in the class descriptor. This data structure is built with the GetNewClass method.
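A rough Python model of this object layout is shown below. The `JvmObject` type, the `get_new_class` signature, and especially the class descriptor layout (a 2-byte big-endian superclass offset followed by a 1-byte field count) are assumptions for illustration; the real descriptor format is defined by the bytecode class file.

```python
from dataclasses import dataclass, field


@dataclass
class JvmObject:
    class_id: int   # bytecode offset of the class descriptor
    super_id: int   # bytecode offset of the parent class descriptor
    fields: list = field(default_factory=list)


def get_new_class(image: bytes, class_id: int) -> JvmObject:
    """Build an object whose field array is sized from its class descriptor.

    Assumed descriptor layout at class_id:
      bytes 0-1: superclass descriptor offset (big-endian)
      byte  2:   number of fields
    """
    super_id = int.from_bytes(image[class_id:class_id + 2], "big")
    num_fields = image[class_id + 2]
    return JvmObject(class_id, super_id, [0] * num_fields)


# Two padding bytes, a base class at offset 2 (1 field), and a derived
# class at offset 5 (parent at offset 2, 3 fields).
image = bytes([0, 0, 0, 0, 1, 0, 2, 3])
obj = get_new_class(image, 5)
```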
Methods are not kept in an object data structure. When a method is needed, its method id is extracted from the class’s method table. Method ids are, however, associated with class ids in a data structure for quick reference by the InvokeVirtual and instanceof code (a method id is the bytecode offset of the class’s method descriptor definition). This data structure is built with the LoadClassInfo method.
Bytecode Processing
Decoding what the JVM expects can be quite tricky. Some apparent solutions are addressed here:
INVOKEVIRTUAL - This one made no sense until 3AM … The Class offset appears to be an index into the list of objects on the stack. Most incoming offset operands are twice as big as necessary. That is true for this offset; additionally, however, one must be subtracted to get the correct object position. So stack object index := offset/2-1. The method descriptor in a class is the sum of the class start address (derived from the indexed object) and the method offset. Fingers crossed :)
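The index arithmetic can be written out as a short Python sketch. It rests on assumptions not settled by the text above: that the index counts down from the top of the stack, that each object carries its class start address, and the names `StackObject` and `resolve_virtual`, which are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class StackObject:
    class_start: int  # bytecode address where the object's class begins


def resolve_virtual(stack, class_offset, method_offset):
    """Return the method descriptor address for an INVOKEVIRTUAL.

    stack is a list with the top of stack at the end. The object index
    is assumed to count down from the top of the stack.
    """
    index = class_offset // 2 - 1  # offset is doubled, then off by one
    obj = stack[-1 - index]
    return obj.class_start + method_offset


# An object reference below two arguments; a class offset of 6 gives index 2.
stack = [StackObject(class_start=0x20), 7, 9]
descriptor = resolve_virtual(stack, class_offset=6, method_offset=0x0A)
```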
Exception Processing
PASM LMM
PASM Bytecode Table
PASM Debugger Features
PASM API
SPIN Bytecode Functions
SPIN Debug Features
SPIN API
Source File Organization
X.0 Source Repository
A source repository is not currently in use, but can be easily created. Current file sharing is being done via Propeller IDE archive packaging.
Rules of engagement for archive sharing to reduce change conflict:
Rules of engagement for CVS or other repositories:
X.0 Test
Having unit tests of individual features that may be reproduced by others is desirable. The project is not big enough to justify a requirement for unit tests, but code quality will improve if they are used.
Whitebox tests are functional tests that are written with knowledge of the internal function of the code being tested. It is not clear at this point whether whitebox testing will yield higher quality code than blackbox functional testing of hardware devices or common software classes.
Blackbox tests are usually functional tests written to verify features of the product without knowing the internal working of the code. Blackbox tests will be devised for all known classes in use to ensure that any maintenance activity does not break the code. Blackbox tests will also be created for download monitor and user debugger features.
X.0 Glossary