And for the RasPi - there is some Windows 10 out for the RasPi, for free as I remember. Convert your 25 used opcodes to ARM and run Windows 10 on the RasPis.
That's a good point - I'd forgotten about that Win10 - it will be a Windows subset, but it sounds like this only needs a subset anyway, and that would be an easy? way to port & test the Code generator, on the way to a Bare Metal solution.
A Win10 Pi3 version would be quite reasonable to most users, but a bare metal one would be more Embedded interesting, a la Project Oberon.
There is a thing called Wine in Linux that allows one to run Windows programs. It's far from complete and perfect but there are many Windows Programs that will run under Wine. Including my favourite circuit emulator LTSpice.
Now, a lot of Wine is built into a library libwine. What this means is that, if you have the source code, you can compile programs that use the Windows API on Linux and produce Linux binary executables. You just have to use libwine instead of the usual Linux libraries. The resulting executables can be run directly, like any other Linux program, without having to use the wine program.
I now recall that I actually did this about ten years ago. It was only a simple experiment for giggles but I did get to open a Window and have some dialogues in it.
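To give a flavour of it, here is a minimal sketch of a Winelib program. It is ordinary Win32 API code; only the build step is Linux-specific, something like "winegcc -mwindows -o hello hello.c" (assuming the Wine development tools are installed). The message text is just an example, of course:

#include <windows.h>

/* Plain Win32 code; winegcc links it against Wine's implementation
   of the Windows API and produces a Linux-runnable program. */
int WINAPI WinMain(HINSTANCE hInst, HINSTANCE hPrev, LPSTR cmdLine, int nShow)
{
    MessageBoxA(NULL, "Hello from the Windows API on Linux!",
                "Winelib test", MB_OK);
    return 0;
}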
This conversation has also dragged up an old memory of programming for the Windows API in assembler!
Basically, way back in the day I was wanting to program for this new Windows thing. So I did the usual thing of acquiring some hefty books on Windows and getting to grips with the API. One of those books was actually about Windows programming in assembler. That really floated my boat as I found it was easier to create Windows programs in assembler than C++ ! (I was far more proficient in assembler than C++ at the time)
So, there is the crazy idea. It might be possible to get Plain English to run under Linux with no changes to its current code generator, just link its output with libwine.
Note: Wine is not an emulator, so this only flies on x86 machines.
So building Plain English on Linux against libwine, and hence having it build itself on Linux, seems quite doable.
I'm thinking of creating a version that will work with the Propeller microcontroller and I'm wondering (a) if there's any interest in such a thing, and (b) if anyone here would like to help with the Propeller side of the project (I'm familiar with microprocessors, but have not yet worked with a Propeller).
Yes there is interest. And yes, there are people here who would love to help, so long as you can get them to focus on the job at hand and not go off on a tangent (myself very much included!).
Practical stuff. The native language of the Propeller is Spin. If you can translate Plain English to Spin, it will run on the Propeller.
Typical objects are a mixture of Spin and assembly. Many people don't worry about assembly - they just drop the object into their code and use it. An object might be something that runs a display or a keyboard or sound.
Spin is slower than assembly. For typical projects I write, I start with Spin, get it debugged, then gradually move it over to assembly.
Languages like C and Basic were added later to the Propeller.
Later again, boffins came up with cached memory drivers so you could have megabytes of ram.
There is also a very clever assembly emulator that grabs code out of the propeller hub - this runs almost as fast as native assembly but gives you 32k of code space instead of the 2k cog limitation.
So it is worth knowing that you can have lots of memory, and you can code in multiple languages.
So for a "Hello World", behind the scenes it is going to be a video driver written in Spin/Assembly putting that text on a screen.
And then you need some tools - some way to convert Plain English into Spin to start with. Then Plain English into assembly. And for the purists, some way of converting Spin/Assembly to Plain English, so you can take existing object code and convert it back into Plain English.
That latter bit is a big task, but there are examples that have been done over the years converting C to Spin and Spin to C.
This avoids having to reinvent so many things - e.g. you could write a FAT SD driver from scratch. Or you could take an existing one and convert it to Plain English.
Start simple. Maybe take all the bit-shift operators in Spin and all the parts of Spin that have strange symbols (@@@ for instance) and think about what those would be in Plain English.
But with an emulator running your existing opcodes and some COG driver doing keyboard, mouse, and VGA, you might be able to build a minimal runtime to support what is needed by your current code generator. Not to run the IDE (yet) but to run the created programs.
Or perhaps a very good tool for education. Bear with me...
Yes, I really like the idea of COGS replacing keyboard, mouse and other standard desktop-computer interrupts.
A Win10 Pi3 version would be quite reasonable to most users, but a bare metal one would be more Embedded interesting, a la Project Oberon.
Agreed. Way more work as well. But your mention of Oberon, I think, is pertinent here. It's one of two systems -- the other being the original Pascal Macintosh -- that have really impressed and inspired me. Both had the same kind of minimalist elegance that made them useful for both work and education. Like Plain English, even if I do say so myself. And like the Propeller. Still bearing with me? Good. Now Mike tells me...
There's a Propeller 2 in development with lots of new features including more (16) and faster processors and more shared memory (512K bytes) along with the ability to execute directly from the shared memory. There are other features to help with I/O and signal processing (analog and digital ... better video too). There's an FPGA version of the Propeller 2 available here ... a work in progress ... with silicon expected sometime soon. The native Propeller instruction set is not suited for larger memories, but an interpreter for a suitable instruction set should be easy enough and could support software demand paging into shared memory for a 4GB external address space with good performance.
And it seems to me that a Propeller 2, as described above, even without the 4GB memory extension, is more machine than my 128K Macintosh ever was. So here's what I'm thinking. Why not develop a Plain English Oberon/Macintosh-inspired desktop computer using a Propeller 2? With the friendly face and fundamental capabilities of our Plain English system so it will be useful for both work and education. Specifically, it should be system enough so it can be used to:
1. Describe itself in a wysiwyg document of about 100 pages; and
2. Re-create itself by recompiling its own source.
That's the "useful for work part". The virtues of such a system for education should be obvious, but I'll list some of them here anyway:
1. The system will be educationally complete: the student, studying this system alone, will learn "all about" hardware, firmware, and software;
2. The system will be immaculately pristine: top-to-bottom, it will include only those things that are absolutely necessary and nothing else -- there will be no awkward compromises for compatibility's sake, and no distracting features with zero educational value.
3. The system will be remarkably friendly. Plain English through and through: code, documentation, the whole shebang.
Over the years I found out that it is quite fun to read [Heater's] posts and (for a short time) to work together with him on some insane stupid project.
So how about it, Heater? Is it feasible? I'll be Jobs if you'll be Woz and we can all play at being Jef Raskin and Andy Hertzfeld and Bill Atkinson. We'll shoot for something like this...
...end up with something like this...
...and offer it to the public, in pieces, as an educational kit.
Yes there is interest. And yes, there are people here who would love to help, so long as you can get them to focus on the job at hand and not go off on a tangent (myself very much included!).
I wrote my previous post before seeing this latest from you (don't want you to feel left out).
This avoids having to reinvent so many things - e.g. you could write a FAT SD driver from scratch. Or you could take an existing one and convert it to Plain English.
Or, given the goal described above, code up a simpler but sufficient file system of our own. My 128K Mac knew nothing of FATs and NTFSs and yet I was still able to get a lot of real work done on the thing. Oberon's file system is another example.
I'm all for simple -- at the start and all the way through! But I'm the kind of guy who needs a "lofty goal" to get motivated. And to keep myself focused -- "Does this thing here contribute to the lofty goal? No? Get back to work!"
The idea is threefold:
1. It's well-nigh impossible for "the little guy" to produce anything that combines beauty and function in a harmonious whole while trying to be compatible with all the stuff the world throws at us;
2. Educational products -- especially those that focus on fundamentals -- don't have to be compatible with other stuff, they only have to be self-consistent; and therefore
3. We can "do it up right" if we restrict ourselves to an "educational prototype kit" built on a reasonably powerful and elegant piece of hardware. I've looked around a lot and have seen very little in the way of hardware that fits that bill better than the Propeller.
Ok, keep it really simple. Take just a few Plain English instructions and convert them line by line to Spin, e.g. a loop, 'if' statements, addition, subtraction.
Then make it slightly more complex. In Spin, the indentation is part of the language.
What does the Plain English compiler look like? How is it parsing commands? Is it a one or two pass compiler? Is it a compiler based on lookup tables of words?
Sure. But you're already way ahead of me. I think I should actually buy a Propeller and play with it a bit to get some hands-on experience, don't you think? I'm thinking the "Propeller Education Kit -- 40-Pin DIP Version" might be good for me. Ya think? I've built circuits with 6502s and 8088s and Z8s, and I know how to program in everything from assembler to English, so it shouldn't take long to get through it. But I do want the actual experience.
Take just a few Plain English instructions and convert them... Then make it slightly more complex.
Normally when porting the system we start with something that gives feedback, like making a built-in speaker beep. But now my inexperience shows; I suspect it's easier to hook an LED to a Propeller than a speaker. Anyway, once it beeps (or blinks) then we typically move on to a conditional beep (or blink), then some basic comparison stuff, then loops, calls, global variables, local variables, parameters, etc. I know the drill for that part of the problem.
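For the record, that first blink step might look something like this in C under prop-gcc (a sketch only; propeller.h provides the DIRA/OUTA registers and waitcnt, and the LED on P16 is my assumption):

#include <propeller.h>

int main(void)
{
    int pin = 16;                       /* assume an LED wired to P16 */
    DIRA |= 1 << pin;                   /* make the pin an output */
    for (;;) {
        OUTA ^= 1 << pin;               /* toggle the LED */
        waitcnt(CNT + CLKFREQ / 2);     /* wait about half a second */
    }
}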
... line by line to Spin, e.g. a loop, 'if' statements, addition, subtraction. In Spin, the indentation is part of the language.
Why Spin? Surely Spin (especially with that indentation stuff) doesn't execute directly on the thing -- it must be reduced to assembler (or at least byte-code) to actually run. But again my ignorance is showing. I was thinking the compiler would put out Propeller machine code -- for speed, for compactness, and so the student who studies the system doesn't have to learn anything but Plain English and the machine code it turns into.
The whole shebang is available here:
osmosian.com/cal-3040.zip
The source code is in six files; these are their actual names:
1. The Desktop
2. The Finder
3. The Editor
4. The Writer
5. The Compiler
6. The Noodle
They're all ASCII text files even though they don't have the usual .TXT extension. You should be able to open them up even if you don't have a Windows system.
It takes 17 steps to fully compile a directory (all of the source code for a project must be in a single directory). Here's the top-level code:
To compile a directory:
Compile the directory (start).
Compile the directory (read the source files).
Compile the directory (scan the source files).
Compile the directory (resolve the types).
Compile the directory (resolve the globals).
Compile the directory (compile the headers of the routines).
Compile the directory (calculate lengths and offsets of types).
Compile the directory (add the built-in memory routines).
Compile the directory (index the routines for utility use).
Compile the directory (compile the bodies of the routines).
Compile the directory (add and compile the built-in startup routine).
Compile the directory (offset parameters and variables).
Compile the directory (address).
Compile the directory (transmogrify).
Compile the directory (link).
Compile the directory (write the exe).
Compile the directory (stop).
And here's what it does:
1. We start up, which resets some timers so we can see how long each step takes.
2. We read all the source files into memory.
3. We scan the source files looking for "sentence starters" and "sentence enders" and build three linked lists of the TYPE, the GLOBAL, and the ROUTINE statements we find. The ROUTINE records contain routine header information plus a list of the sentences found in each routine's body.
4. We "resolve" the TYPES by recursively examining them all, filling in "parent" TYPES as we bubble back out of the recursion.
5. We whip through the GLOBALs and point them to the appropriate TYPE records.
6. We compile the headers of the ROUTINES, hanging a list of PARAMETERS and a list of MONIKERS (routine-name fragments) on each ROUTINE record.
7. We whip through the types recursively and calculate their lengths -- primitive types first, then user-defined simple types, then user-defined record types including the offsets of fields in those records (see the sketch after this list).
8. We add some "built-in" routines to the ROUTINE list for memory management.
9. We index the ROUTINES for "utility" use. This index is used later as a last resort when a matching routine cannot be found for a call. In essence, it reduces all of the routines to their bare-minimum characteristics.
10. We compile the bodies of the ROUTINES, sentence by sentence, searching for a matching routine header for each. If we find a match, we point that sentence to the ROUTINE record it should call. If we don't find a match -- even in the "utility" index -- then we report the error: "I don't know how to..."
11. We add in and compile the built-in startup routine which calls a specially-named library function to get things rolling.
12. We whip through all the parameters and the local variables for each routine and fill in their offsets on the stack based on their TYPES.
13. We calculate absolute addresses for all the GLOBALS (including literals that we discovered as we were going along), and for all the ROUTINES.
14. We "transmogrify" everything that needs transmogrification. That is, we attach tiny little assembler code fragments to each record that needs them. A little snippet of assembler code, for example, to initialize a global variable, or to push a parameter onto the stack.
15. We "link" the whole shebang together into a PE-style executable in a big memory buffer. That is, we copy all those little fragments of assembler into the proper spots in the file buffer. We ignore anything that isn't used to keep the executable as small as possible. So everything in the Noodle (our standard library) gets compiled every time, but only the parts that are actually used are included in the executable.
16. We write the executable buffer to disk with an appropriate name derived from the directory name where the source files are stored.
17. We stop the last of the timers.
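To make step 7 concrete, here is a sketch of that recursive length calculation, in C with invented names (the real compiler is written in Plain English, and user-defined simple types that alias a parent type are omitted here):

typedef struct Field {
    struct Type  *type;    /* the type of this field */
    int           offset;  /* filled in as we recurse */
    struct Field *next;
} Field;

typedef struct Type {
    int    length;  /* 0 until resolved; primitives start non-zero */
    Field *fields;  /* NULL for non-record types */
} Type;

int resolve_length(Type *t)
{
    if (t->length)                          /* primitive, or already done */
        return t->length;
    int offset = 0;
    for (Field *f = t->fields; f; f = f->next) {
        f->offset = offset;                 /* offset of this field in the record */
        offset += resolve_length(f->type);  /* recurse into the field's type */
    }
    return t->length = offset;
}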
If the programmer selects the "List" command from the menu (instead of "Compile" or "Run") we produce a detailed listing of everything the compiler was thinking about (and generating) as it went along. You can see a piece of such a listing on page 88 of the manual, including a bunch of those assembler code fragments (in big-endian hex; or is it little-endian? I can never remember).
Hard to say, as you can see from the above. It makes only one pass at all the code, but it makes several passes at different parts of the code at different times. But it's very, very fast because everything is done with pointers to the source code: for example, a ROUTINE record contains only pointers to the start and end of the routine header, not the routine header itself; ditto for the statements in the body of the routine.
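In C terms, the records are views into the source text rather than copies of it; a sketch (names invented):

typedef struct Sentence {
    const char      *start;   /* first byte of the sentence in the source */
    const char      *end;     /* one past the last byte */
    struct Sentence *next;
} Sentence;

typedef struct Routine {
    const char     *header_start;   /* the header: again, just pointers */
    const char     *header_end;
    Sentence       *body;           /* linked list of body sentences */
    struct Routine *next;
} Routine;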
Yes, but not as you might expect. There are a number of "decider" routines that tell us whether a particular string is an article, or a preposition, or a conjunction, for example. Routines like this:
To decide if a string is any conjunction:
If the string is "and", say yes.
If the string is "both", say yes.
If the string is "but", say yes.
If the string is "either", say yes.
If the string is "neither", say yes.
If the string is "nor", say yes.
If the string is "or", say yes.
Say no.
But there are only a handful of those. The bulk of a program's vocabulary (and grammar, for that matter) is inherent in the TYPES, GLOBALS, and ROUTINE HEADERS that the programmer has coded. And those are indexed using hash tables (with linked lists for overflow) so we can find the things we need quickly.
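For anyone curious, such an index amounts to something like this in C (a sketch with invented names, not the actual Plain English code):

#include <stdlib.h>
#include <string.h>

#define BUCKETS 1024

typedef struct Entry {
    const char   *key;      /* e.g. a moniker or a global's name */
    void         *record;   /* the TYPE, GLOBAL, or ROUTINE record */
    struct Entry *next;     /* the overflow linked list */
} Entry;

static Entry *table[BUCKETS];

static unsigned hash(const char *s)
{
    unsigned h = 5381;
    while (*s)
        h = h * 33 + (unsigned char)*s++;
    return h % BUCKETS;
}

void index_put(const char *key, void *record)
{
    Entry *e = malloc(sizeof *e);
    unsigned h = hash(key);
    e->key = key;
    e->record = record;
    e->next = table[h];    /* push onto the overflow chain */
    table[h] = e;
}

void *index_get(const char *key)
{
    for (Entry *e = table[hash(key)]; e; e = e->next)
        if (strcmp(e->key, key) == 0)
            return e->record;
    return NULL;
}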
And that, I'm sure, is way more than you bargained for.
The Plain English compiler looks like this: https://github.com/Folds/osmosian You can see the compiler source in the file "the compiler". 4300 lines of, well, English-looking text. It seems that is not useful without "the noodle", about 11000 lines, which has functions for code generation and seems to be a kitchen sink full of other unrelated stuff like GUI drawing primitives, number formatting functions, and things to do with PDF rendering.
The other files, "the *", are the IDE and the rest of the system. For a total of about 23000 lines.
My take on this is that Gerry wants to get this running on some simple platform, in the most minimalist, simplest way possible, in a self-contained manner with no supporting junk, such that the entire thing is understandable in some reasonable time by a student. Never mind issues like portability and so on. (Sorry Gerry if I'm a bit slow on getting that point). Does this sound like Chip's philosophy in designing the Propeller and Spin in the first place? I think so.
Can the Propeller run Plain English? I don't think so...
The Plain English Windows executable is nearly a megabyte. Running on my PC here I'm seeing it using about 40MByte (although a lot of that can be because of the Wine library and other junk, so let's say it needs 10 or 20MB to run in). Clearly this won't fly on the Propeller without bolting on a lot of external RAM.
Adding RAM like that tends to eat all your pins so you have probably lost VGA display and/or keyboard and/or mouse and/or SD card storage. One could of course use more than one Propeller and add peripherals to a second Prop.
As for code generation, Spin as a target is out of the question. It's limited to the 32K HUB RAM space. It's slow as hell.
Edit: I think perhaps Dr_A meant the Spin byte code, not the actual Spin source code. The Spin byte code interpreter is built into the ROM of the Propeller. In theory one could compile Plain English or any language to Spin byte codes and have the Prop run it without ever using the actual Spin language or compiler.
So you would need to compile to that almost-native instruction set of the Propeller called Large Memory Model (LMM) that can be executed from external RAM with the help of a tiny fetch-execute interpreter loop. This can be done, as prop-gcc demonstrates. Or one could design one's own minimalist byte code and create a little interpreter for it in assembler. This can be done easily, as my Zog project demonstrated by running C compiled to the ZPU byte code. There are other examples.
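To show how little such an interpreter needs to be, here is a toy byte-code machine in C (opcodes invented; on a real Prop the equivalent loop would be PASM sitting in a COG):

#include <stdint.h>
#include <string.h>

enum { OP_HALT, OP_PUSH, OP_ADD };

/* Run a byte-code program; returns the top of the stack at OP_HALT. */
int32_t run(const uint8_t *code)
{
    int32_t stack[64];
    int sp = 0, pc = 0;
    for (;;) {
        switch (code[pc++]) {
        case OP_HALT:
            return stack[sp - 1];
        case OP_PUSH: {                    /* next 4 bytes: a literal */
            int32_t lit;
            memcpy(&lit, code + pc, 4);
            stack[sp++] = lit;
            pc += 4;
            break;
        }
        case OP_ADD:                       /* pop two, push the sum */
            sp--;
            stack[sp - 1] += stack[sp];
            break;
        }
    }
}

Feed it PUSH 2, PUSH 3, ADD, HALT and it returns 5. A real design would add loads, stores, branches and calls, but the dispatch loop stays about this small.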
This can all be done. As it happens I have a Gadget Gangster Propeller board here with 32MByte of SDRAM attached to prove the point. But I'm guessing it would be so slow as to be unusable. And the complexity of a multi-prop system adds up.
So what about the Propeller 2?
That would be much nicer. More speed, more COGS, more pins. The external RAM interface would be trivial. As would on chip video and so on.
Problem is the Prop 2 does not exist. I've resolved not to think about it until it does.
It requires the purchase of an FPGA board, which is quite expensive if you want a "top of the range" P2 installation, which I think may be required here.
Then there is the whole hassle of getting the FPGA programmed with the P2 configuration. Not such a big deal but it's a big step away from no dependency on tools Gerry is after.
Then there is still the fact that the P2 does not exist and there is no firm date set for when it might. And, God forbid, it might never happen.
My take on this is that Gerry wants to get this running on some simple platform, in the most minimalist, simplest way possible, in a self-contained manner with no supporting junk, such that the entire thing is understandable in some reasonable time by a student. Never mind issues like portability and so on. (Sorry Gerry if I'm a bit slow on getting that point). Does this sound like Chip's philosophy in designing the Propeller and Spin in the first place? I think so.
Can the Propeller run Plain English? I don't think so... The Plain English Windows executable is nearly a megabyte. Running on my PC here I'm seeing it using about 40MByte (although a lot of that can be because of the Wine library and other junk, so let's say it needs 10 or 20MB to run in). Clearly this won't fly on the Propeller without bolting on a lot of external RAM.
It takes about 30MB to re-compile itself on Windows. On the one hand, that doesn't seem like a lot to ask from a 32-bit machine, especially these days when a kid's watch might have a gigabyte in it. On the other hand, the 512KB Macintosh ran quite well with, well, 512KB of memory (plus 64KB or so of ROM). Plain English eats up memory mainly because we knew it was there for the taking; we thus opted for simplicity in lieu of memory efficiency. I don't know, at this point, if a scaled-back version could fit in, say, 512KB or not. I'm positive it won't fit in 32KB.
Adding RAM like that tends to eat all your pins so you have probably lost VGA display and/or keyboard and/or mouse and/or SD card storage. One could of course use more than one Propeller and add peripherals to a second Prop.
Seems to me that more than one Propeller (or other single-processor chip) is like a discrete (as opposed to integrated) version of the Propeller itself.
Edit: I think perhaps Dr_A meant the Spin byte code, not the actual Spin source code. The Spin byte code interpreter is built into the ROM of the Propeller. In theory one could compile Plain English or any language to Spin byte codes and have the Prop run it without ever using the actual Spin language or compiler.
It was his remark about the indenting that made me think he was thinking of Spin source rather than Spin byte-code.
So you would need to compile to that almost-native instruction set of the Propeller called Large Memory Model (LMM) that can be executed from external RAM with the help of a tiny fetch-execute interpreter loop. This can be done, as prop-gcc demonstrates. Or one could design one's own minimalist byte code and create a little interpreter for it in assembler. This can be done easily, as my Zog project demonstrated by running C compiled to the ZPU byte code. There are other examples.
This can all be done. As it happens I have a Gadget Gangster Propeller board here with 32MByte of SDRAM attached to prove the point. But I'm guessing it would be so slow as to be unusable. And the complexity of a multi-prop system adds up.
Again, on the one hand, we live in a world of gigahertz processors in freaking toys. While on the other hand, we remember our Mac that used to run at a single megahertz. And I think there's educational value in remembering that -- I'm hoping our students will someday wonder why the other guy's machines, which clock, say, 1000 times faster than ours, don't perform 1000 times better.
So what about the Propeller 2? That would be much nicer. More speed, more COGS, more pins. The external RAM interface would be trivial. As would on chip video and so on. Problem is the Prop 2 does not exist. I've resolved not to think about it until it does.
I was assuming Prop 2 when I posted above (with the pictures of the Macs). I think I said as much there too.
That's not quite true, P2 can certainly run code now, on an FPGA, which is fine for software development of this type.
I'm thinking this project, if it's at all feasible, will take about nine months. It would therefore be nice to have a production P2 in about six months at the latest. I searched around a little but couldn't find an estimated target date.
As far as I can follow, GnuCOBOL as it is called now (OpenCOBOL was renamed after joining the GNU project) is based on dynamic linking, and PropGCC does not support this.
So how did you get around this problem?
VERY interested,
OK, you caught me out. I did not actually run that on a Propeller. Only on my PC.
But I think it could be done. We just need a "static" build, all libs built into the executable, rather than a "dynamic" one. As it happens the cobol library, libcob, is available on my machine as a static library libcob.a. So this is possible.
First compile the COBOL to C:
$ cobc --free -C fibo.cob
Then compile the C code statically:
$ gcc -o fibo --static main.c fibo.c -lcob -lgmp -lm -lncurses -ltinfo -ldb -lpthread
/usr/lib/gcc/x86_64-linux-gnu/4.9/../../../x86_64-linux-gnu/libdb.a(os_addrinfo.o): In function `__os_getaddrinfo':
(.text+0x20): warning: Using 'getaddrinfo' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
But...there is a tiny bit of magic in the home-made main() I used. The dl* stubs satisfy the linker for a static build, and the entry point Fibonacci() is the function cobc generated from the PROGRAM-ID in fibo.cob:

#include <stdio.h>

/* Dummy dynamic loader routines so that cobc's output links statically. */
void *dlopen(const char *file, int mode) { return 0; }
void *dlsym(void *handle, const char *name) { return 0; }
char *dlerror(void) { return 0; }

/* These come from the cobc-generated fibo.c and the libcob runtime. */
extern void cob_init(int argc, char **argv);
extern int Fibonacci(void);

int main(int argc, char **argv)
{
    cob_init(argc, argv);
    printf("Fibonacci:\n");
    return Fibonacci();
}
I'm sure you could get this running on a Prop with prop-gcc and external memory. Firstly you need to be able to build libcob with prop-gcc, which may or may not be easy.
Err, sorry, this is all off topic for this thread. But I just had to answer.
It requires the purchase of an FPGA board, which is quite expensive if you want a "top of the range" P2 installation, which I think may be required here.
Then there is the whole hassle of getting the FPGA programmed with the P2 configuration. Not such a big deal but it's a big step away from no dependency on tools Gerry is after.
Then there is still the fact that the P2 does not exist and there is no firm date set for when it might. And, God forbid, it might never happen.
You sound more like a wet blanket than a heater. But candor is appreciated in these circumstances. Back to the drawing board?
Plain English eats up memory mainly because we knew it was there for the taking; we thus opted for simplicity in lieu of memory efficiency.
I totally understand. I hear people complaining about "bloat" in modern software. Which may well be true. It looked true to me for a long time, especially since the first machine I was actually paid to program had 4K of RAM and 2K of ROM (it was a Motorola 6809 board that we had to design and build first) and we used to dream of having a full 64K of RAM in our CP/M computers!
So things like Javascript always seemed absurdly wasteful to me. Then one day I realized: I have a machine with 4 Gigs of RAM, and my Javascript creations take a percent or two of that. RAM is crazy big and cheap today, even on little machines like the RasPi, so why do I even think that is wasteful?
It does make us marvel at the fact that back in the day they squeezed things like compilers for C, Pascal, PL/M and so on into a mere 32K or 64K of RAM. It also reminds us why those languages are not "Plain English"; they are as much "English/maths-like" as could be handled by such cramped machines and run efficiently.
Why not compile directly to Propeller assembler?
A Propeller has 8 processors, "COGs". Each COG has 512 32-bit words of memory space to itself. They share 32KBytes of "HUB" RAM.
A COG can only execute its 32-bit native machine instructions at full speed from within its own 512-word COG space.
This is a crazy small limit but it's big enough to fit an entire byte code interpreter in. The Spin byte code interpreter. That interpreter, in turn, fetches and executes byte code from HUB memory.
Now, some time after the Prop was on the market it was discovered that a very tight native loop could fetch COG instructions from HUB, and execute them in the COG. This actually runs at about one quarter native COG speed and allows one to create PASM programs that fill the entire HUB RAM. This is the so called "Large Memory Model" invented by Bill Henning.
LMM is a target architecture of the Propeller GCC compiler (and Catalina C compiler, and the old ImageCraft compiler). Which means we can put big C programs into HUB.
It is also possible to run LMM code from external memory with prop-gcc. This all sounds awfully slow to me.
You sound more like a wet blanket than a heater
Yeah, I do.
I do actually have an FPGA board here that someone kindly donated to the cause (sorry, I forget who it was; thanks again anyway). I did have a Prop 2 design installed and running on it. Then that design was scrapped and a total redesign started (it's a long story). After some long time the new P2 design is available as an FPGA configuration. But then I found I don't have much time to explore it nowadays, and the delays kind of dampened my enthusiasm anyway. I'm starting to think I might not live long enough to get an actual P2 into my hands.
Still, I might chirp up one day, dig out that FPGA board and dive in there.
A Propeller has 8 processors, "COGs". Each COG has 512 32-bit words of memory space to itself. They share 32KBytes of "HUB" RAM.
A COG can only execute its 32-bit native machine instructions at full speed from within its own 512-word COG space.
This is a crazy small limit but it's big enough to fit an entire byte code interpreter in. The Spin byte code interpreter. That interpreter, in turn, fetches and executes byte code from HUB memory.
Now, some time after the Prop was on the market it was discovered that a very tight native loop could fetch COG instructions from HUB, and execute them in the COG. This actually runs at about one quarter native COG speed and allows one to create PASM programs that fill the entire HUB RAM. This is the so called "Large Memory Model" invented by Bill Henning.
LMM is a target architecture of the Propeller GCC compiler (and Catalina C compiler, and the old ImageCraft compiler). Which means we can put big C programs into HUB.
It is also possible to run LMM code from external memory with prop-gcc. This all sounds awfully slow to me.
Let me see if I've got the picture. Eliminating all the various "work arounds," the original vision for how the Propeller was intended to be used is:
1. The main routine(s) of a program would be written in Spin and would reside in the 32KB HUB RAM. This Spin code would be executed by an interpreter that is built-in to the Propeller's ROM running on one of the COGS.
2. These main routines would delegate various small tasks to the remaining COGS. The code for these delegated tasks would ideally be written in Propeller assembler and their assembled machine code would reside in the other COG's 512-instruction memory spaces and would execute from there.
Yes? Is that how the Propeller was intended to work?
If so, what's still not clear is:
a. Where (and in what form) is all this code (Spin source, Spin byte-code, Assembler, and/or machine-code) before start-up? The essential bits, I presume, would be in the Propeller's ROM in one form or another. Yes?
b. How does the code get from the ROM to the HUB RAM and the COG spaces? Some kind of automatic loader at start-up, I presume. Yes?
c. Is Spin code in the HUB RAM intended to be in source code or byte-code format?
d. Can the COG spaces contain data as well as instructions? Why are the COG spaces typically referred to as "512 32-bit words" instead of just "2K of memory"? Are there restrictions on how it can be divided up, or is it that the space was intended for 32-bit machine code instructions only?
a&b) Spin bytecodes, machine code, and initialized data normally reside in an (up to) 32K binary RAM image that is downloaded from an attached PC using a ROM resident loader started when the chip restarts. This loader can also copy the RAM image to an attached EEPROM (32K or more) or, if there's no attached PC, will copy the RAM image from EEPROM to RAM for execution.
c) bytecodes. There is a native Spin compiler/linker that requires an attached micro-SD card, part of a native "operating system" that provides an SD card loader, basic I/O library, and various utility programs. It has a low memory footprint in the 32K RAM, mostly for communication with the I/O drivers running in several cogs and buffers for the SD card I/O.
d) yes. The 512 word address space contains code, local data, and 16 control registers for the I/O pins, timer/counters, a system-wide clock. This space is addressed as 32-bit words only. The shared 32K address space uses special instructions for access and can be addressed as 32K 8-bit bytes, 16K 16-bit words, or 8K 32-bit long words with indivisible accesses occurring from each processor in a 200ns round-robin cycle. In other words, a read or write cycle for byte, word, or long word access can occur in a time slot assigned to a processor once each 200ns cycle.
A processor's memory is normally initialized from a 2K-byte area of shared memory in 512 such 200ns cycles (~100us), then the processor is started at its location 0. This is initiated using a special instruction. Anything beyond that point depends on the program loaded. The Spin interpreter is loaded from on-chip ROM as a special case.
Most programs consist of a main section, written in Spin or C, running interpretively under control of one processor plus several assembly I/O drivers (sometimes a floating point interpreter in one processor), each in one processor. Some I/O might be written in Spin or C under control of other copies of the appropriate interpreter, each in a processor. For example, serial I/O up to 9600 Baud can easily be done in Spin. Moderate speed I2C or SPI I/O devices can be done in Spin.
Some other programming systems work differently. There's a high speed Forth interpreter (Tachyon Forth) that uses one processor for a high speed receive UART, but otherwise loads a full Forth interpreter into the other processors. One of these is used to implement time-based multithreading.
Mike is always so quick on the draw. Here is what I wrote:
Gerry,
1. The main routine(s) of a program would be written in Spin and would reside in the 32KB HUB RAM. This Spin code would be executed by an interpreter that is built-in to the Propeller's ROM running on one of the COGS.
Yes.
At startup COG 0 gets loaded with the Spin byte code interpreter from ROM and starts to run. The interpreter then starts executing the byte codes of the "main" routine from HUB.
2. These main routines would delegate various small tasks to the remaining COGS. The code for these delegated tasks would ideally be written in Propeller assembler and their assembled machine code would reside in the other COG's 512-instruction memory spaces and would execute from there.
Yes, mostly sort of.
That "main" method is actually in a Spin object. The top level object.
Other objects can be used from the top level, and sub objects can use sub objects...
So the first thing is that you include a sub object into your top level object and then you can call methods of that sub object. e.g. subObject.calculateSomething(x, y)
Often a sub object is a hardware driver, like the FullDuplexSerial object. So you might call fds.init(9600, 8, 1) or whatever the API is, I forget.
The Spin code of fds.init may well start another COG running some assembler code to do the high speed serial bit banging.
Then you will call fds.tx(char) and so on to send and receive data from the fds driver. The tx method will be written in Spin and take care of passing data to the PASM driver COG.
All of this is harder to describe than to do. The best thing is to find a classic example like FullDuplexSerial in OBEX or in the examples included in the PropTool download and see how it all hangs together.
In short:
a) A Spin object can call methods in other Spin objects. Which all runs in the same COG.
b) A Spin object can start one or more new COGs running one of its Spin methods or some PASM.
a. Where (and in what form) is all this code (Spin source, Spin byte-code, Assembler, and/or machine-code) before start-up? The essential bits, I presume, would be in the Propeller's ROM in one form or another. Yes?
No. There is a 32K I2C EEPROM attached to the Prop. When a Prop is powered up it listens for a download on a serial link. If a download comes, the received binary is written to the EEPROM. If no download is detected, it loads the HUB memory from the EEPROM and then runs the code as described above. The ROM on board the Prop chip is not writable.
b. How does the code get from the ROM to the HUB RAM and the COG spaces? Some kind of automatic loader at start-up, I presume. Yes?
Yes, as above.
c. Is Spin code in the HUB RAM intended to be in source code or byte-code format?
Byte-code.
d. Can the COG spaces contain data as well as instructions?
Yes.
Actually that is an interesting question. In the COG architecture those 512 word-wide memory locations can be seen as code space, where the native code is executed from, or as register space. Or we could say that the COG is executing code from its own registers. It is a 512-register machine!
Why are the COG spaces typically referred to as "512 32-bit words" instead of just "2K of memory"?
COG instructions all work on 32-bit-wide data within the COG. You can of course pack byte or 16-bit data into those words, but then you have to program all the masking and shifting to access them yourself. The code gets messy and big. I don't think anyone ever does that.
Accessing HUB RAM is different in that there are wrbyte, wrword, etc. instructions to access bytes and 16-bit words. Keeping byte data in HUB is, I think, easier and quicker than keeping it in COG.
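For anyone who hasn't had the pleasure, that masking and shifting looks like this in C:

#include <stdint.h>

/* Extract byte n (0..3) from a 32-bit COG word. */
uint8_t get_byte(uint32_t word, int n)
{
    return (word >> (8 * n)) & 0xFF;
}

/* Replace byte n of a 32-bit COG word with b. */
uint32_t set_byte(uint32_t word, int n, uint8_t b)
{
    uint32_t mask = 0xFFu << (8 * n);
    return (word & ~mask) | ((uint32_t)b << (8 * n));
}

Every packed access costs a couple of instructions like these, which is why byte data usually stays in HUB.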
So I set about adapting my rude and crude C/Pascal-like language compiler to generate PASM. That was a variant of Jack Crenshaw's "Tiny" language. There is a thread here about it somewhere.
But then came Brad Campbell and his BST Spin IDE that ran on Linux, and Mike Park's Homespun Spin compiler in C#. So my humble compiler efforts soon got dropped. Now we have Propeller IDE and SimpleIDE and prop-gcc.
...and all such questions are now answered. What a remarkable little chip! The way the built-in font is stored is a little goofy, but it seems that everyone gets goofy here and there for one reason or another.
Incidentally, has anyone else discovered that it's often easier to find something on a particular site by using Google rather than that site's own set of menus and/or search mechanism? It happens to me all the time, say, with the Windows API reference -- and here, where Googling "parallax propeller datasheet" took me directly to the PDF.
...and all such questions are now answered. What a remarkable little chip!
Can you list the 25 Intel opcodes your code generator currently uses?
Then, those used to doing code-engines, can map those to P1 and P2 opcodes, to get an idea of size and speed of a COG-Engine that directly uses those opcodes.
Incidentally, has anyone else discovered that it's often easier to find something on a particular site by using Google rather than that site's own set of menus and/or search mechanism? It happens to me all the time, say, with the Windows API reference -- and here, where Googling "parallax propeller datasheet" took me directly to the PDF.
As one of the younger folks around here, I truly have no idea how any of you functioned before Google. I can only imagine everyone standing on their rooftops yelling "WHOEVER HAS THE CAT PICTURES BOOK, CAN I BORROW IT FOR AN HOUR!?"
Yes. Search on forums and other such sites has been terrible ever since the beginning of the web.
Mostly it's because such sites use an SQL database, MySQL, which is great for banks storing relational data but terrible for free-text searching of your content.
This was a problem I hit in 1999 when working on a web site. We ended up investing in a rather expensive free-text database that was mostly used to index all the articles published by newspapers.
Nowadays I think sites leverage Google to do the searching behind their search boxes.
"WHOEVER HAS THE CAT PICTURES BOOK, CAN I BORROW IT FOR AN HOUR!?"
Yep, just like that.
I lost count of the number of books I loaned out over the years that never came back.
Knowledge and know-how was like illicit drugs. It was shared in pubs and bars, or exchanged at secret meetings, like the monthly radio amateur club meetings. You could buy watered-down versions of this knowledge substance in magazines about electronics, computing and such.
Then, those used to doing code-engines, can map those to P1 and P2 opcodes, to get an idea of size and speed of a COG-Engine that directly uses those opcodes.
Assuming P1 and P2 analogs for all those opcodes exist. I'm not sure how this helps, though. The big problem is not enough memory. Our compiler assumes there will be a nearly bottomless stack and gigabyte heap. Getting that to work in 512KB on a P2 may be possible, but it will require a major re-thinking of the whole thing.
That's not quite true, P2 can certainly run code now, on an FPGA, which is fine for software development of this type.
I'm thinking this project, if it's at all feasible, will take about nine months. It would therefore be nice to have a production P2 in about six months at the latest. I searched around a little but couldn't find an estimated target date.
A real P2 might not quite make those time-lines, but an FPGA-testable version exists now, and that is more on your critical path.
This is also why I suggested doing both Pi-3 Bare Metal and P2 - there will be many similarities in those flows, and you may find a common grouping of P2 & ARM opcodes that are quite similar to your Intel ones.
You may even revisit and re-mix those intel ones, once you choose P2 & Pi3 opcodes, to keep things 'more similar', and ARM you can field test any time you are ready.
You may even revisit and re-mix those intel ones, once you choose P2 & Pi3 opcodes, to keep things 'more similar', and ARM you can field test any time you are ready.
I don't think the problem is with the "opcodes". It would be easy if the overall architecture of the system could be transported to another platform and all we had to do was adjust the hard-coded machine instructions, even if we had to do them all by hand (there's not that many; given exact analogs of each instruction, I could do it in a day). The problem is architectural in nature...
This is also why I suggested doing both Pi-3 Bare Metal and P2 - there will be many similarities in those flows, and you may find a common grouping of P2 & ARM opcodes that are quite similar to your Intel ones.
The Pi can offer me a bottomless stack and a gigabyte heap; the P2 cannot. This kind of difference makes the opcodes a trivial matter.
I understand (I think) what you're trying to do -- make the thing more portable. What I don't think you realize is that the barriers to portability are primarily the API facilities (or lack thereof) on each platform. Windows, for example, gives me guaranteed access to a reasonably complete and device-independent set of graphics routines on any machine running Windows XP thru 10: I draw a line and it appears exactly where I thought it would on any screen and any printer. The P1, future P2, and even the Pi have nothing similar to offer. So I have to figure out how to beg, borrow, or steal that functionality from elsewhere, and how to guarantee that it will be there on all the target machines. I could translate the opcodes a dozen times, by hand, in the time it will take to resolve that problem!
So you would need to compile to that almost native instruction set of the Propeller called Large Memory Model (LMM) that can be executed from external RAM with the help of a tiny fetch, execute interpreter loop. This can be done as prop-gcc demonstrates. Or one could design ones own minimalist byte code and create a little interpreter for it in assembler. This can be done easily as my Zog project demonstrated for running C compiled to the ZPU byte code demonstrated. There are other examples.
This can all be done. As it happens I have a Gadget Gangster Propeller board here with 32MByte of SDRAM attached to prove the point. But I'm guessing it would be so slow as to be unusable. And the complexity of a multi-prop system adds up.
So what about the Propeller 2?
That would be much nicer. More speed, more COGS, more pins. The external RAM interface would be trivial. As would on chip video and so on.
Problem is the Prop 2 does not exist. I've resolved not to think about it until it does.
That's not quite true, P2 can certainly run code now, on a fpga, which is fine for software development of this type.
It requires the purchase of an FPGA board, which is quite expensive if you want a "top of the range" P2 installation, which I think may be required here.
Then there is the whole hassle of getting the FPGA programmed with the P2 configuration. Not such a big deal but it's a big step away from no dependency on tools Gerry is after.
Then there is still the fact that the P2 does not exist and there is no firm date set for when it might. And, God forbid, it might never happen.
It takes about 30MB to re-compile itself on Windows. On the one hand, that doesn't seem like a lot to ask from a 32-bit machine, especially these days when a kid's watch might have a gigabyte in it. On the other hand, the 512KB Macintosh ran quite well with, well, 512KB of memory (plus 64KB or so of ROM). Plain English eats up memory mainly because we knew it was there for the taking; we thus opted for simplicity in lieu of memory efficiency. I don't know, at this point, if a scaled-back version could fit in, say, 512KB or not. I'm positive it won't fit in 32KB.
Seems to me that more than one Propeller (or other single-processor chip) is like a discrete (as opposed to integrated) version of the Propeller itself.
Agreed.
It was his remark about the indenting that made me think he was thinking of Spin source rather than Spin byte-code.
Why not compile directly to Propeller assembler?
Again, on the one hand, we live in a world of gigahertz processors in freaking toys. While on the other hand, we remember our Mac that used to run at a single megahertz. And I think there's educational value in remembering that -- I'm hoping our students will someday wonder why the other guy's machines, which clock, say, 1000 times faster than ours, don't perform 1000 times better.
I was assuming Prop 2 when I posted above (with the pictures of the Macs). I think I said as much there too.
But I think it could be done. We just need a "static" build, all libs built into the executable, rather than a "dynamic" one. As it happens the cobol library, libcob, is available on my machine as a static library libcob.a. So this is possible.
First compile the COBOL to C: Then compile the C code statically: Ignore the warning. Then it runs:
$ ./fibo But...there is a tiny bit of magic in the home made main() I used: I'm sure you could get this running on a Prop with prop-gcc and external memory. Firstly you need to be able to build libcob with prop-gcc, which may or may not be easy.
Err, sorry, this is all off topic for this thread. But I just had to answer.
So things like Javascript always seemed absurdly wasteful to me. Then one day I realized, I have a machine with 4 Gigs of RAM, my Javascript creations take a percent or two of that. RAM is crazy big and cheap today, even on little machines like the Raspi, why do I even think that is wasteful?
It does makes us marvel at the fact that back in the day they squeezed things like compilers for C, Pascal, PL/M and so on into a mere 32K or 64K of RAM. Also reminds us why those languages are not "Plain English", they are as much "English/maths like" as could be handled by such cramped machines and run efficiently.
A Propeller has 8 processors, "COG"s. Each COG has 512 32 bit words of memory space to itself. They share 32KBytes of "HUB" RAM.
A COG can only execute it's 32 bit native machine instructions at full speed from within it's own 512 word COG space.
This is a crazy small limit but it's big enough to fit an entire byte code interpreter in. The Spin byte code interpreter. That interpreter, in turn, fetches and executes byte code from HUB memory.
Now, some time after the Prop was on the market it was discovered that a very tight native loop could fetch COG instructions from HUB, and execute them in the COG. This actually runs at about one quarter native COG speed and allows one to create PASM programs that fill the entire HUB RAM. This is the so called "Large Memory Model" invented by Bill Henning.
LMM is a target architecture of the Propeller GCC compiler (and Catalina C compiler, and the old ImageCraft compiler). Which means we can put big C programs into HUB.
It is also possible to run LMM code from external memory with prop-gcc. This all sounds awful slow to me. Yeah, I do.
I do actually have an FPGA board here that someone kindly donated to the cause (Sorry I forget who it was, thanks again anyway). I did have a Prop 2 design installed and running on it. Then that design was scrapped and a total redesign started (It's a long story). After some long time the new P2 design is available an FPGA configuration. But then I found I don't have much time to explore it now a days, the delays kind of dampened my enthusiasm anyway, I'm starting to think I might not live long enough to get an actual P2 into my hands.
Still, I might chirp up one day, dig out that FPGA board and dive in there.
1. The main routine(s) of a program would be written in Spin and would reside in the 32KB HUB RAM. This Spin code would be executed by an interpreter that is built-in to the Propeller's ROM running on one of the COGS.
2. These main routines would delegate various small tasks to the remaining COGS. The code for these delegated tasks would ideally be written in Propeller assembler and their assembled machine code would reside in the other COG's 512-instruction memory spaces and would execute from there.
Yes? Is that how the Propeller was intended to work?
If so, what's still not clear is:
a. Where (and in what form) is all this code (Spin source, Spin byte-code, Assembler, and/or machine-code) before start-up? The essential bits, I presume, would be in the Propeller's ROM in one form or another. Yes?
b. How does the code get from the ROM to the HUB RAM and the COG spaces? Some kind of automatic loader at start-up, I presume. Yes?
c. Is Spin code in the HUB RAM intended to be in source code or byte-code format?
d. Can the COG spaces contain data as well as instructions? Why are the COG spaces typically referred to as "512 32-bit words" instead of just "2K of memory"? Are there restrictions on how it can be divided up, or is it that the space was intended for 32-bit machine code instructions only?
c) bytecodes. There is a native Spin compiler/linker that requires an attached micro-SD card, part of a native "operating system" that provides an SD card loader, basic I/O library, and various utility programs. It has a low memory footprint in the 32K RAM, mostly for communication with the I/O drivers running in several cogs and buffers for the SD card I/O.
d) yes. The 512 word address space contains code, local data, and 16 control registers for the I/O pins, timer/counters, a system-wide clock. This space is addressed as 32-bit words only. The shared 32K address space uses special instructions for access and can be addressed as 32K 8-bit bytes, 16K 16-bit words, or 8K 32-bit long words with indivisible accesses occurring from each processor in a 200ns round-robin cycle. In other words, a read or write cycle for byte, word, or long word access can occur in a time slot assigned to a processor once each 200ns cycle.
A processor's memory is normally initialized from a 2K byte area of shared memory in 512 x 200ns such cycles (= ~100us) then the processor is started at its location 0. This is initiated using a special instruction. Anything beyond that point depends on the program loaded. The Spin interpreter is loaded from on-chip ROM as a special case.
Most programs consist of a main section, written in Spin or C, running interpretively under control of one processor plus several assembly I/O drivers (sometimes a floating point interpreter in one processor), each in one processor. Some I/O might be written in Spin or C under control of other copies of the appropriate interpreter, each in a processor. For example, serial I/O up to 9600 Baud can easily be done in Spin. Moderate speed I2C or SPI I/O devices can be done in Spin.
Some other programming systems work differently. There's a high speed Forth interpreter (Tachyon Forth) that uses one processor for a high speed receive UART, but otherwise loads a full Forth interpreter into the other processors. One of these is used to implement time-based multithreading.
Gerry, Yes.
At startup COG 0 gets loaded with the Spin byte code interpreter from ROM and starts to run. The interpreter then starts executing byte codes, of the "main" routine from HUB. Yes, mostly sort of.
That "main" method is actually in a Spin object. The top level object.
Other objects can be used from the top level, and sub objects can use sub objects...
So the first thing is that you include a sub object into your top level object and then you can call methods of that sub object. e.g. subObject.calculateSomething(x, y)
Often a sub object is a hardware driver, like the FullDuplexSerial object. So you might call fds.init(9600, 8 , 1) or whatever the API is, I forget.
The Spin code of fds.init may well start another COG running some assembler code to do the high speed serial bit banging.
Then you will call fds.tx(char) and so on to send and receive data from the fds driver. The tx method will be written in Spin and take care of passing data to the PASM driver COG.
All of this is harder to describe than do. Best thing is find a classic example like the FullDuplexSerial in OBEX or in the examples included in the PropTool download and see how it all hangs together.
In short:
a) A Spin object can call methods in other Spin objects. Which all runs in the same COG.
b) A Spin object can start one or more new COG running one of it's Spin methods or some PASM. No. There is a 32K I2C EEPROM attached to the Prop. When a Prop is powered up it listens for a download on a serial link. If a download comes the received binary is written to the EEPROM. If no download is detected it loads the HUB memory from the EEPROM and then runs the code as described above. The ROM on board the Prop chip is not writable. Yes, as above. Byte-code. Yes.
Actually that is an interesting question. In the COG architecture those 512 word memory locations can be seen as code space, were the native code is executed from, or register space. Or we could say that the COG is executing code from it's own registers. It is a 512 register machine! COG instructions all work on 32 bit wide data within the COG. You can of course pack byte or 16 bit data into those words but then you have to program all the masking and shifting to access them yourself. The code gets messy and big. I don't think anyone ever does that.
Accessing HUB RAM is different in that there are wrbyte, wrword, etc instructions to access bytes and 16 bit words. Keeping byte data in HUB is, I think, easier and quicker than keeping it in COG.
Or, you can skip all that Spin nonsense
Back in the day there was no Spin compiler or IDE that would run on Linux. Which almost had me give up on the Prop idea.
Luckily I found Cliff Biffle had written an assembler for the Prop in Java. http://cliffhacks.blogspot.fi/2006/10/propasm-propeller-assembler.html Others had reverse engineered the loader protocol. So one could write entirely in PASM using tools on Linux.
So I set about adapting my rude and crude C/Pascal like language compiler to generate PASM. That was a variant of Jack Crenshaw's "Tiny" language. There is a thread here about it somewhere.
But then came Brad Cambel and his BST Spin IDE that ran on on Linux, and Mike Parks HomeSpun Spin compiler in C#. So my humble compiler efforts soon got dropped. Now we have Propeller IDE and SimpleIDE and prop-gcc.
https://www.parallax.com/sites/default/files/downloads/P8X32A-Propeller-Datasheet-v1.4.0_0.pdf
...and all such questions are now answered. What a remarkable little chip! The way the built-in font is stored is a little goofy, but it seems that everyone gets goofy here and there for one reason or another.
Incidently, has anyone else discovered that it's often easier to find something on a particular site by using Google rather than that site's own set of menus and/or search mechanism? Happens to me all the time, say, with the Windows API reference -- and here, where Googling "parallax propeller datasheet" took me directly to the PDF.
Can you list the 25 intel opcodes your code generator currently uses ?
Then, those used to doing code-engines, can map those to P1 and P2 opcodes, to get an idea of size and speed of a COG-Engine that directly uses those opcodes.
As one of the younger folks around here, I truly have no idea how any of you functioned before Google. I can only imagine everyone standing on their rooftops yelling "WHOEVER HAS THE CAT PICTURES BOOK, CAN I BORROW IT FOR AN HOUR!?"
Mostly it's because such sites use an SQL data base, MySql, which is great for banks storing relational data but terrible for free text searching of your content.
This was a problem I hit in 1999 when working on a web site. We ended up investing in a rather expensive free text database that was mostly used to index all the articles published by news papers.
Now a days I think sites leaverage Google to do the searching behind their search boxes.
Ray
I lost count of the number of books I loaned out over the years that never came back.
Knowledge and know-how was like illicit drugs. It was shared in pubs and bars, or exchanged at secret meeting, like the monthly radio amateur club meetings. You could buy watered down versions of this knowledge substance in magazines about electronics, computing and such.
(1) Throughout the compiler in the "transmogrification" routines, as they're needed, like so:
To transmogrify a fragment (call external):
Attach $FF15 and the fragment's entry's address to the fragment's code.
To transmogrify a fragment (call internal):
Get an address given the fragment's routine.
Attach $E8 and the address to the fragment.
(2) Throughout the Noodle (our standard library of generally-useful routines) where speed was deemed essential:
To uppercase a byte:
Intel $8B8508000000. \ mov eax,[ebp+8] \ the byte
Intel $803861. \ cmp byte ptr [eax],'a'
Intel $0F820C000000. \ jb END
Intel $80387A. \ cmp byte ptr [eax],'z'
Intel $0F8703000000. \ ja END
Intel $802820. \ sub byte ptr [eax],$20
Note that Intel opcodes are sometimes 8 bits, sometimes 16 bits.
Assuming P1 and P2 analogs for all those opcodes exist. I'm not sure how this helps, though. The big problem is not enough memory. Our compiler assumes there will be a nearly bottomless stack and gigabyte heap. Getting that to work in 512KB on a P2 may be possible, but it will require a major re-thinking of the whole thing.
This is also why I suggested doing both Pi-3 Bare Metal and P2 - there will be many similarities in those flows, and you may find a common grouping of P2 & ARM opcodes that are quite similar to your Intel ones.
You may even revisit and re-mix those intel ones, once you choose P2 & Pi3 opcodes, to keep things 'more similar', and ARM you can field test any time you are ready.
The Pi can offer me a bottomless stack and a gigabyte heap; the P2 cannot. This kind of difference makes the opcodes a trivial matter.
I understand (I think) what you're trying to do -- make the thing more portable. What I don't think you realize is that the barriers to portability are primarily the API facilities (or lack thereof) on each platform. Windows, for example, gives me guaranteed access to a reasonably complete and device-independent and set of graphics routines on any machine running Windows XP thru 10: I draw a line and it appears exactly where I thought it would on any screen and any printer. The P1, future P2, and even the Pi have nothing similar to offer. So I have to figure out how to beg, borrow, or steal that functionality from elsewhere, and how to guarantee that it will be there on all the target machines. I could translate the opcodes a dozen times, by hand, in the time it will take to resolve that problem!