Forth's may or may not include an assembler. I have no idea. It's certainly not mentioned in Leo Brady's book which is the most reading I have done on the subject. But even if they do that is no longer the Forth language now is it?
My puzzle is: Say I have a Forth system on my PC here, which I do. Now I want the same Forth system on some other target architecture X. Say a Propeller or whatever.
How do I go about achieving that?
Presumably I can write, in Forth, an assembler for my target X. Why not?
Presumably I can then write a minimal Forth kernel in that X assembly language, assemble it and put it on the target. Then I have Forth running there on X (Of course I will have written a loader in Forth to do that download to X)
Presumably I can then use that minimal Forth kernel on my new target X to build the rest of the Forth system like the one I had on my PC.
This whole bootstrapping process seems to have happened in C many times over the years. Do people ever actually do that in Forth? Or Lisp or whatever?
How did Forth systems get onto the Propeller? Was it done only using Forth or was it done with help of a proper language like C or Spin?
The only Forth that I've seriously dug into the internals of is FigForth for the 6502. FigForth contains a small kernel written in assembler that provides the basic semantics of the VM. However, much of what users consider the Forth language is not part of the kernel. Those are written in Forth and included with the kernel source. As part of the bootstrapping process they need to be compiled by a Forth kernel and added to the Forth dictionary. This is possible because word definitions only use words that are already defined. At this point the dictionary can be dumped as hex output and embedded within the kernel sources as data statements. This last part isn't technically needed, but it makes bringing up the Forth faster because those words are already in the dictionary.
FigForth also includes a 6502 assembler written in FigForth which can compile the kernel and the hex output to produce a ROM image. This ROM image can then be brought up on another 6502 system. It's a mystery where the first FigForth system came from I figure elves.
Lisp can be a very small language and a tiny implementation would almost certainly fit on the Propeller. In theory, the only functions that are needed are CAR, CDR, CONS, ATOM?, EQ?, and COND along with function definitions and calls. However, practical Lisp systems tend to be much larger and would likely require extended memory to run any useful program.
Almost everything else can be implemented with an asm function, like this:
Multiply and friends should probably be defined as their own PASM functions, though.
The first parameter is the source of movi, the second is the destination register, and the third is the source. It should probably also be able to allow pre-setting and post-capturing of flags.
By the way, I've made good progress on my Lisp GC, but I got sidetracked after getting mad at the Spin compiler while trying to debug it with a debugger kernel I wrote. The only significant part that's left (other than debugging) is making it generate multiple free lists for multiple Lisp VM cogs.
Spin really needs namespaces and an import statement to import symbols from another file into a namespace. The lack of this makes PASM debugging, LMM kernels, and everything else that involves linking PASM across different Spin files virtually impractical. I'll start a thread about my new compiler soon, after I finish a smaller non-Propeller-related project I'm doing now to teach myself more about parsers and compilers and such.
EDIT: The person who wrote Scheme in Forth also wrote Scheme in OCaml. OCaml looks like a very nice language, and I might write my compiler in it.
By the way, I've made good progress on my Lisp GC, but I got sidetracked after getting mad at the Spin compiler while trying to debug it with a debugger kernel I wrote. The only significant part that's left (other than debugging) is making it generate multiple free lists for multiple Lisp VM cogs.
Spin really needs namespaces and an import statement to import symbols from another file into a namespace. The lack of this makes PASM debugging, LMM kernels, and everything else that involves linking PASM across different Spin files virtually impractical. I'll start a thread about my new compiler soon, after I finish a smaller non-Propeller-related project I'm doing now to teach myself more about parsers and compilers and such.
I'm afraid I've forgotten exactly what you're building. Are you building a Lisp system that compiles to LMM PASM or are you building a Lisp interpreter or something in between like a Lisp to byte code compiler and a VM?
I'm afraid I've forgotten exactly what you're building. Are you building a Lisp system that compiles to LMM PASM or are you building a Lisp interpreter or something in between like a Lisp to byte code compiler and a VM?
All I have so far is a half-operational GC. If I do actually make an operational Lisp machine, the compiler would do no more than convert the text to a tree, run macros on it, and then feed the resulting tree to a simple interpreter. I'm much more interested right now in writing my Spin/PASM/anything else compiler. After I finish that, however, I might write a full Lisp to LMM PASM compiler. My Spin compiler will make writing LMM PASM very easy without involving any magic numbers.
How did Forth systems get onto the Propeller? Was it done only using Forth or was it done with help of a proper language like C or Spin?
Pfth is written in PASM and Forth, with a small amount of Spin to get it started. The kernel is written in PASM, and implements the inner interpreter. The inner interpreter is basically a VM that fetches opcodes, which happen to be the addresses of PASM routines that implement basic Forth words. The basic Forth words are things like +, -, *, /, @, ! and so on.
Additional Forth words are handcoded using LONG, WORD and BYTE in a DAT section. These Forth words make up the initial dictionary. One of the handcoded Forth words is an outer interpreter that contains a simple Forth compiler. This is used to compile additional words that are stored as Forth source in the DAT section. Eventually there are enough words to create a more complete outer interpreter, and once that is compiled pfth executes the new outer interpreter. And then it prints "Ok", and waits for user input.
All I have so far is a half-operational GC. If I do actually make an operational Lisp machine, the compiler would do no more than convert the text to a tree, run macros on it, and then feed the resulting tree to a simple interpreter. I'm much more interested right now in writing my Spin/PASM/anything else compiler. After I finish that, however, I might write a full Lisp to LMM PASM compiler. My Spin compiler will make writing LMM PASM very easy without involving any magic numbers.
Dave, it sounds like pfth and FigForth have a similar structure. I've seen similar DAT sections in FigForth that must form the outer interpreter and load the higher level Forth words.
When you have two pieces of PASM that should be in different files but will eventually end up both in the same cog at the same time, like an LMM blockloader kernel in one file and a LMM program in another or a PASM debugger in one file and the PASM code to be debugged in the other, there's no way for the spin compiler to export symbols from one file into the other. Instead of being able to do call #subroutine_in_other_file, you have to do jmpret $123, #$127, which isn't exactly easily understandable. If you change one file, you have to manually update the addresses in the other by looking at the compiler output of a Spin compiler that generates compiler listings. You can use CON constants or "DAT org $123 \n label", but there will still be at least one constant you have to manually update. The compiler should know how to do this itself.
With my compiler, you can do this:
lmm.spin:
DAT kernel ' DAT blocks can now be named
org wherever
' blockloader call subroutine
bcall call #pushret ' pushes currently loaded block's address and pc and maybe flags
jmp #bjmp ' load block # bdst and then jump to it
bcall_ret ' not an actual return - just an alias for pc
pc long 0
' which block we want to jump to - used as index in lookup table so block IDs fit in src literals
bdst long 0
lmm_code_start ' where LMM code should get loaded into the cog
main.spin:
OBJ
lmm : "lmm.spin"
kernel : lmm.kernel ' kernel refers to DAT kernel in lmm.spin
DAT pasm ' has our LMM code
org kernel.lmm_code_start
loop call #stuff
mov kernel.bdst, #3 ' blockload call subroutine #3
call #kernel.bcall
jmp #loop
This page http://www.forth.com/resources/evolution/evolve_3.html describes the origins of FigForth. Basically they wrote a 6502 assembler on a 6800 based Forth and compiled a kernel for 6502 machines. Once they had a 6502 kernel they could use it to build itself in the manner I described above.
@Heater
I don't see how you can exclude Assembler from any language ported over to another platform.
Regarding Forth, eForth set out to create as much as possible of Forth in Forth; but it certain requires the inner and outer interpreter to be in Assembler and to accomodate the memory architecture and i/o of the target platform.
In other words, you just can't get away from writing some low level code in Assembler.
Forth's may or may not include an assembler. I have no idea. It's certainly not mentioned in Leo Brady's book which is the most reading I have done on the subject. But even if they do that is no longer the Forth language now is it?
There are many "data processing" languages that have no need of assembler but C has inline assembly because it is essentially a low level language which depends upon it's libraries which also include assembly. This is probably even more important for most Forths to have an inline assembler to access the bare metal in an efficient and sometimes the only manner possible. However just like C you end up with a function like any other function, except it may be written wholly or partly in assembler.
My puzzle is: Say I have a Forth system on my PC here, which I do. Now I want the same Forth system on some other target architecture X. Say a Propeller or whatever.
How do I go about achieving that?
Presumably I can write, in Forth, an assembler for my target X. Why not?
Presumably I can then write a minimal Forth kernel in that X assembly language, assemble it and put it on the target. Then I have Forth running there on X (Of course I will have written a loader in Forth to do that download to X)
Presumably I can then use that minimal Forth kernel on my new target X to build the rest of the Forth system like the one I had on my PC.
This whole bootstrapping process seems to have happened in C many times over the years. Do people ever actually do that in Forth? Or Lisp or whatever?
How did Forth systems get onto the Propeller? Was it done only using Forth or was it done with help of a proper language like C or Spin?
I think just because we might use the Spin compiler it doesn't mean that the language was done with the help of a "proper" language, (ahem, cough). When I developed Tachyon all I really wanted was a macroassembler but I had to make do with the Spin compiler which wasn't really all that suitable but Brad C's BST made all the difference as it also generated a listing so I could position sections properly.
In eForth the philosophy is to code as little as possible in assembler as to make it easy to bring up on a new CPU, so just the very basic and necessary core words after which the high level kernel words are created. In most Forths including Tachyon the high level kernel words are defined using data directives such as byte or word etc. So in the Tachyon kernel it is mostly made up of DAT sections with byte symbol,symbol structures where symbol is usually one of the Tachyon VM opcodes. But once enough of this kernel is defined and compiled then the runtime kernel can be used to compile in the traditional Forth fashion so rather than all those awkward but necessary data directives. Take this simple kernel word as it is defined in the "Spin" kernel (no Spin is used).
' ( n lo hi -- flg ) true if n is within range of low and high inclusive
WITHIN byte INC,OVER,MINUS,PUSHR
byte MINUS,RPOP,XCALL,xULT
WT1 byte ZEQ,ZEQ,EXIT
All these byte symbols are opcodes except for the xULT which refers to an 8-bit vector pointing to more bytecodes.
It was necessary to code this in the kernel because other words which are part of the text interpreter/compiler need this but otherwise it would have been compiled by the Forth runtime kernel as it normally would:
: WITHIN ( n lo hi -- flg )
1+ OVER - >R - R> U< <>
;
So since we really only need to code the basic and necessary words in assembler and build the high level kernel words using data directives then Forth itself can be brought up quite fast on new CPUs. How well it's brought up though depends a lot upon the thought that goes into it though, which is why I decided upon direct execution bytecodes for the Propeller.
I believe MASM could be used for this purpose of bringing up a kernel just as it is used for this purpose with many other CPUs.
The Prop is a little different from other processors in that most languages are implemented using a VM. Spin, C, PropBASIC and the various Forths all work by using a VM that executes code stored in hub RAM. The exceptions are the cog modes of PropBASIC and C that execute code from cog memory. The other Forth implementations may also have this feature, but I'm not aware of it.
So Prop Forths don't generate cog machine instructions, but they generate VM instructions instead. These instructions are executed by the VM and either call a kernel function or call another Forth word. I've seen Forths for other processors that work this way also, either because the processor doesn't support stacks, or the person writing the Forth just preferred using a VM.
It is also possible to completely dispense with the VM and generate machine instructions. An assembly language isn't needed, you just write machine instructions in binary or hex. That's the way eForth is done. When the Forth interpreter compiles words it will generate machine instructions. It's convenient to have an assembler to get started, but it's not necessary.
Just thought I would mention here that there's now another (very tiny) version of scheme for the Propeller, in the form of the Lazier Scheme to Lazy K converter. It compiles some basic words like the ones David mentioned above (CONS, CAR, CDR, '(), +, *, ISZERO?, etc.) to SKI combinator form, which can then be compiled by the proplazyk compiler to a Propeller binary -- actually a small embedded interpreter, garbage collector, and expression tree wrapped up into a .binary file.
I mentioned Lazy K in another thread, but that was focused on the Lazy K language itself (which is an obfuscated programming language of sorts). But the backend is pretty general, and the SKI combinator calculus is logically equivalent to the lambda calculus that Scheme is built upon.
Comments
Forth's may or may not include an assembler. I have no idea. It's certainly not mentioned in Leo Brady's book which is the most reading I have done on the subject. But even if they do that is no longer the Forth language now is it?
My puzzle is: Say I have a Forth system on my PC here, which I do. Now I want the same Forth system on some other target architecture X. Say a Propeller or whatever.
How do I go about achieving that?
Presumably I can write, in Forth, an assembler for my target X. Why not?
Presumably I can then write a minimal Forth kernel in that X assembly language, assemble it and put it on the target. Then I have Forth running there on X (Of course I will have written a loader in Forth to do that download to X)
Presumably I can then use that minimal Forth kernel on my new target X to build the rest of the Forth system like the one I had on my PC.
This whole bootstrapping process seems to have happened in C many times over the years. Do people ever actually do that in Forth? Or Lisp or whatever?
How did Forth systems get onto the Propeller? Was it done only using Forth or was it done with help of a proper language like C or Spin?
FigForth also includes a 6502 assembler written in FigForth which can compile the kernel and the hex output to produce a ROM image. This ROM image can then be brought up on another 6502 system. It's a mystery where the first FigForth system came from I figure elves.
Almost everything else can be implemented with an asm function, like this:
Multiply and friends should probably be defined as their own PASM functions, though.
The first parameter is the source of movi, the second is the destination register, and the third is the source. It should probably also be able to allow pre-setting and post-capturing of flags.
By the way, I've made good progress on my Lisp GC, but I got sidetracked after getting mad at the Spin compiler while trying to debug it with a debugger kernel I wrote. The only significant part that's left (other than debugging) is making it generate multiple free lists for multiple Lisp VM cogs.
Spin really needs namespaces and an import statement to import symbols from another file into a namespace. The lack of this makes PASM debugging, LMM kernels, and everything else that involves linking PASM across different Spin files virtually impractical. I'll start a thread about my new compiler soon, after I finish a smaller non-Propeller-related project I'm doing now to teach myself more about parsers and compilers and such.
EDIT: The person who wrote Scheme in Forth also wrote Scheme in OCaml. OCaml looks like a very nice language, and I might write my compiler in it.
All I have so far is a half-operational GC. If I do actually make an operational Lisp machine, the compiler would do no more than convert the text to a tree, run macros on it, and then feed the resulting tree to a simple interpreter. I'm much more interested right now in writing my Spin/PASM/anything else compiler. After I finish that, however, I might write a full Lisp to LMM PASM compiler. My Spin compiler will make writing LMM PASM very easy without involving any magic numbers.
Additional Forth words are handcoded using LONG, WORD and BYTE in a DAT section. These Forth words make up the initial dictionary. One of the handcoded Forth words is an outer interpreter that contains a simple Forth compiler. This is used to compile additional words that are stored as Forth source in the DAT section. Eventually there are enough words to create a more complete outer interpreter, and once that is compiled pfth executes the new outer interpreter. And then it prints "Ok", and waits for user input.
When you have two pieces of PASM that should be in different files but will eventually end up both in the same cog at the same time, like an LMM blockloader kernel in one file and a LMM program in another or a PASM debugger in one file and the PASM code to be debugged in the other, there's no way for the spin compiler to export symbols from one file into the other. Instead of being able to do call #subroutine_in_other_file, you have to do jmpret $123, #$127, which isn't exactly easily understandable. If you change one file, you have to manually update the addresses in the other by looking at the compiler output of a Spin compiler that generates compiler listings. You can use CON constants or "DAT org $123 \n label", but there will still be at least one constant you have to manually update. The compiler should know how to do this itself.
With my compiler, you can do this:
lmm.spin:
main.spin:
I don't see how you can exclude Assembler from any language ported over to another platform.
Regarding Forth, eForth set out to create as much as possible of Forth in Forth; but it certain requires the inner and outer interpreter to be in Assembler and to accomodate the memory architecture and i/o of the target platform.
In other words, you just can't get away from writing some low level code in Assembler.
I think just because we might use the Spin compiler it doesn't mean that the language was done with the help of a "proper" language, (ahem, cough). When I developed Tachyon all I really wanted was a macroassembler but I had to make do with the Spin compiler which wasn't really all that suitable but Brad C's BST made all the difference as it also generated a listing so I could position sections properly.
In eForth the philosophy is to code as little as possible in assembler as to make it easy to bring up on a new CPU, so just the very basic and necessary core words after which the high level kernel words are created. In most Forths including Tachyon the high level kernel words are defined using data directives such as byte or word etc. So in the Tachyon kernel it is mostly made up of DAT sections with byte symbol,symbol structures where symbol is usually one of the Tachyon VM opcodes. But once enough of this kernel is defined and compiled then the runtime kernel can be used to compile in the traditional Forth fashion so rather than all those awkward but necessary data directives. Take this simple kernel word as it is defined in the "Spin" kernel (no Spin is used).
All these byte symbols are opcodes except for the xULT which refers to an 8-bit vector pointing to more bytecodes.
It was necessary to code this in the kernel because other words which are part of the text interpreter/compiler need this but otherwise it would have been compiled by the Forth runtime kernel as it normally would:
So since we really only need to code the basic and necessary words in assembler and build the high level kernel words using data directives then Forth itself can be brought up quite fast on new CPUs. How well it's brought up though depends a lot upon the thought that goes into it though, which is why I decided upon direct execution bytecodes for the Propeller.
I believe MASM could be used for this purpose of bringing up a kernel just as it is used for this purpose with many other CPUs.
So Prop Forths don't generate cog machine instructions, but they generate VM instructions instead. These instructions are executed by the VM and either call a kernel function or call another Forth word. I've seen Forths for other processors that work this way also, either because the processor doesn't support stacks, or the person writing the Forth just preferred using a VM.
It is also possible to completely dispense with the VM and generate machine instructions. An assembly language isn't needed, you just write machine instructions in binary or hex. That's the way eForth is done. When the Forth interpreter compiles words it will generate machine instructions. It's convenient to have an assembler to get started, but it's not necessary.
The source code and a binary release are at https://github.com/totalspectrum/proplazyk.
I mentioned Lazy K in another thread, but that was focused on the Lazy K language itself (which is an obfuscated programming language of sorts). But the backend is pretty general, and the SKI combinator calculus is logically equivalent to the lambda calculus that Scheme is built upon.