Plain English Programming
GerryRzeppa
Hello! I hope this is the right place to post this.
I'm the co-author of the Plain English programming language for Windows which you can read about here:
http://www.osmosian.com/instructions.pdf
I'm thinking of creating a version that will work with the Propeller microcontroller and I'm wondering (a) if there's any interest in such a thing, and (b) if anyone here would like to help with the Propeller side of the project (I'm familiar with microprocessors, but have not yet worked with a Propeller).
In brief, using Plain English, code that used to look like this...
OBJ
  pin : "Input Output Pins"
  time : "Timing"
PUB Blink
  repeat
    pin.High(9)
    time.Pause(1000)
    pin.Low(9)
    time.Pause(1000)
...would look like this:
The red light is on pin 9.
To run:
Turn on the red light.
Wait for 1 second.
Turn the red light off.
Wait for 1 second.
Repeat.
Note that any reasonable English-language variant of the above would also work. For example, our compiler understands that the following statements, and a wide variety of similar ones...
Turn on the red light.
Turn the red light on.
Activate the red light.
Set pin 9 high.
...all express the same thought.
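As a rough illustration of several phrasings resolving to one action, here is a minimal Python sketch. It is not how the Plain English compiler works internally; the pattern list, the `DEVICES` table, and the `canonical` function are all hypothetical names invented for this example.

```python
import re

# Hypothetical device table, standing in for "The red light is on pin 9."
DEVICES = {"the red light": 9}

# Hypothetical phrasings that should all mean "drive the pin high".
PATTERNS = [
    re.compile(r"^turn on (?P<dev>.+)$"),
    re.compile(r"^turn (?P<dev>.+) on$"),
    re.compile(r"^activate (?P<dev>.+)$"),
    re.compile(r"^set pin (?P<pin>\d+) high$"),
]

def canonical(sentence):
    """Return ("high", pin) if the sentence matches a known phrasing, else None."""
    text = sentence.strip().rstrip(".").strip().lower()
    for pattern in PATTERNS:
        match = pattern.match(text)
        if not match:
            continue
        groups = match.groupdict()
        if "pin" in groups:
            # The sentence named the pin directly.
            return ("high", int(groups["pin"]))
        if groups["dev"] in DEVICES:
            # The sentence named a device; look up its pin.
            return ("high", DEVICES[groups["dev"]])
    return None

for s in ["Turn on the red light.", "Turn the red light on.",
          "Activate the red light.", "Set pin 9 high."]:
    print(s, "->", canonical(s))
```

All four sentences map to the same `("high", 9)` tuple, while a sentence such as "Turn the red light off." matches none of these patterns and returns `None`.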
But enough for now. Looking forward to your replies...
Comments
If it already outputs a common-denominator language like C, then it should be easier to port.
The machine-specific code in the compiler does trivial things like allocate local and global variables, push and pop variables from the stack, and call functions. We only use 25 of the hundreds of machine instructions available on Intel CPUs.
The routines in the standard library are of three types. First, we have a lot of pure Plain English statements and routines, like these:
These will require minimal or no conversion.
Secondly, we have a wide variety of calls to functions built into Windows:
These will have to be converted to calls to corresponding built-in Propeller functions, if such exist -- or simply be eliminated.
And thirdly, we have some pre-assembled machine code (in hex) for primitive operations:
These will have to be converted to their hexadecimal Propeller equivalents.
If you have a Windows machine (any version from XP to 10) and would like to play around with our compiler, you can get the whole shebang, including the Plain English source code, here:
www.osmosian.com/cal-3040.zip
Just download and unzip; no installation necessary. Less than a megabyte. Follow the instructions in the aforementioned PDF and before you go ten pages you'll be recompiling the thing in itself -- in less than 3 seconds on a bottom-of-the-line machine from Walmart!
Yeah, that ought to make programming easier!!
Excellent.
There is a Perl script that translates C to English with output that reads a bit like that. I can't get it to work just now.
The magic part is that you can translate that English back to the original C.
http://www.mit.edu/~ocschwar/C_English.html
The trick is to use the right tool for each job. "x=y+1" is obviously more compact than "put y plus 1 into x." But two things need to be considered.
1. "x=y+1" is an arbitrary and artificial syntax. Not everyone agrees that it is the most effective syntax. Some, for example, would prefer a nested prefix syntax like "( = x (+ y 1) )", while others might argue for a postfix syntax to eliminate nesting and parentheses: "y 1 + x =". Which one should we choose?
2. More importantly, the various mathematical notations are not good at expressing thoughts that are not mathematical in nature. Surely you can see that "Clear the screen" is more natural and self-explanatory than, say, something like "graphlib.clrscr(0,true);".
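To make the notation point concrete, here is a small Python sketch that runs the postfix form "y 1 + x =" and the English form "put y plus 1 into x." and arrives at the same result. Both evaluators are toy illustrations written for this example (the function names are made up), and they handle only this one kind of statement.

```python
import re

def value_of(operand, variables):
    """Resolve an operand: an int, a numeric literal, or a variable name."""
    if isinstance(operand, int):
        return operand
    if operand.isdigit():
        return int(operand)
    return variables[operand]

def run_postfix(program, variables):
    """Evaluate a whitespace-separated postfix program such as 'y 1 + x ='."""
    stack = []
    for token in program.split():
        if token == "+":
            right = stack.pop()
            left = stack.pop()
            stack.append(value_of(left, variables) + value_of(right, variables))
        elif token == "=":
            name = stack.pop()    # the target variable was pushed last
            value = stack.pop()   # the computed value sits beneath it
            variables[name] = value_of(value, variables)
        else:
            stack.append(token)
    return variables

def run_english(sentence, variables):
    """Evaluate the single sentence form 'put A plus B into C.'"""
    match = re.fullmatch(r"put (\w+) plus (\w+) into (\w+)\.?",
                         sentence.strip().lower())
    if match is None:
        raise ValueError(f"unrecognized sentence: {sentence!r}")
    left, right, target = match.groups()
    variables[target] = value_of(left, variables) + value_of(right, variables)
    return variables

print(run_postfix("y 1 + x =", {"y": 5}))
print(run_english("put y plus 1 into x.", {"y": 5}))
```

With `y` set to 5, both calls leave `x` equal to 6; the difference between the notations is purely one of surface syntax.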
We believe that programs should be written in a natural language with snippets of specialized artificial syntax where appropriate. Like a typical math book. Or this very post. Why? Because it turns out that most of the statements in most programs are not mathematical in nature, and that a natural language syntax is more suitable for those statements than an artificial mathematical one.
Take our compiler, for example. It's a sophisticated Plain-English-to-Executable-Machine-Code translator that has 3,050 imperative sentences in it.
1,306 of those (about 42%) are conditional statements, and at least half of those are trivial things like these:
If the item is not found, break.
If the compiler's abort flag is set, exit.
The remainder of those conditional statements are slightly more complex, but all of them fit on a single line. Here are a couple of the longer ones:
If the length is 4, attach $FF32 to the fragment's code; exit.
If the rider's token is any numeric literal, compile the literal given the rider; exit.
Of the remaining sentences:
272 (about 9%) are simple assignment statements:
Put the type name into the field's type name.
202 (about 7%) are just the infrastructure for various loops:
Loop.
Get a field from the type's fields.
[ other stuff here]
Repeat.
183 (6%) simply add something to the end of this or that list, like so:
Add the field to the type's fields.
164 (about 5%) are trivial statements used to return boolean results, start and stop various timers, show the program's current status, and write interesting things to the compiler's output listing.
Say no.
Say yes.
Set the variable's compiled flag.
Start the compiler's timer.
Stop the compiler's timer.
Show status "Compiling...".
List the globals in the compiler's listing.
119 (about 4%) advance the focus in the source code, sentences like:
Bump the rider.
Move the rider (code rules).
92 (about 3%) are used to create, destroy and keep internal indexes up to date, sentences like:
Create the type index using 7919 for the bucket count.
Index the type given the type's name.
Destroy the type index.
58 (about 2%) are used to find things in various lists:
Find a variable given the name.
37 (about 1%) are calls to various conversion routines:
Convert the rider's token to a ratio.
31 (about 1%) are used to generate actual machine code (plus those that appear in conditional statements, as above):
Attach $E8 and the address to the fragment.
And that accounts for 80% of the code in our compiler.
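The tally above can be checked with a few lines of Python. The counts are copied from this post; the category labels are shortened paraphrases chosen for this sketch.

```python
TOTAL = 3050  # imperative sentences in the compiler, as stated above

# Counts quoted in the post; the labels are paraphrased.
counts = {
    "conditional statements": 1306,
    "simple assignments": 272,
    "loop infrastructure": 202,
    "list appends": 183,
    "flags, timers, status, listing": 164,
    "advancing the rider": 119,
    "index maintenance": 92,
    "lookups": 58,
    "conversions": 37,
    "machine-code generation": 31,
}

def percent(n):
    """Share of the whole, as a percentage rounded to one decimal place."""
    return round(100 * n / TOTAL, 1)

listed = sum(counts.values())
for label, n in counts.items():
    print(f"{label:32s} {n:5d}  {percent(n):5.1f}%")
print(f"{'listed so far':32s} {listed:5d}  {percent(listed):5.1f}%")
```

The listed categories sum to 2,464 sentences, or 80.8% of the 3,050, which matches the "80%" figure.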
Only 57 of the remaining sentences (less than 2% of the whole) are mathematical in nature, a line here and there like these:
Add 4 to the routine's parameter size.
Subtract the length from the local's offset.
Multiply the type's scale by the base type's scale.
Calculate the length of the field's type.
Round the address up to the nearest multiple of 4096.
And the rest are not formulaic at all. Stuff like:
Copy the field into another field.
Append the fragment to the current routine's fragments.
Abort with "I was hoping for a definition but all I found was '" then the token.
Initialize the compiler.
Remove any trailing backslashes from the path name.
Reduce the monikette's type to a type for utility use.
Eliminate duplicate nicknames from the type's fields.
Prepend "original " to the term's name.
Extend the name with the rider's token.
Unquote the other string.
Read the source file's path into the source file's buffer.
Generate the literal's name.
Extract a file name from the compiler's abort path.
Write the compiler's exe to the compiler's exe path.
Swap the monikettes with the other monikettes.
Skip any leading noise in the substring.
Scrub the utility index.
Fill the compiler's exe with the null byte given the compiler's exe size.
Position the rider's token on the rider's source.
Pluralize the type's plural name.
Link.
Finalize the compiler.
Check for invalid optional info on the type.
And that's why we say that most of what most programs do is easy stuff, stuff that can be conveniently expressed in a natural language. And that, in turn, is why we like programming in Plain English: the thoughts in our heads are typed in as Plain English "pseudo code" and, with a tweak here and there, that pseudo code actually compiles and runs. And is self-documenting, to boot.
What size is that x86 EXE file?
I notice no END IF equivalent mentioned yet, and the language seems to steer away from nested code and block structures.
Is there an ELSEIF or a CASE equivalent?
And it is the greenest language of all. It even has an environment division...
Enjoy!
Mike
Ray
IF statements in Plain English end with a period as they do in normal written English; no END IF is required: "If the abort flag is set, exit." Semicolons are used to include subordinate "thoughts" in an IF statement, again, as they are normally used in written English: "If the abort flag is set, report the error; beep; exit."
Our goal is to make Plain English the most natural programming language we can. We want to be able to "say what we're thinking" as we code -- or, more precisely, we want to be able to say things the way an uninitiated, non-mathematically-inclined child would say them so that the code will be both easy to write and easy to read. We therefore actively discourage nested code -- and I hasten to add that we did not find this inconvenient, even when writing a program as large as a complete IDE or as technical as a native-code-generating compiler.
We have also eliminated most of the "keywords" typically associated with block structures. LOOP is one of the very few non-grammatical keywords in Plain English. (Our keywords are typically grammatical in nature: articles like A and AN and THE; prepositions like IN and OF and AROUND; and state-of-being verbs like IS and ARE.)
There is no ELSEIF. The equivalent of a CASE statement is simply a series of IF statements, typically followed by an EXIT or a REPEAT or, in the case of a routine that returns a boolean result, a SAY YES or SAY NO:
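The same flat, one-test-per-line shape can be sketched in Python for readers more used to conventional languages. This is an analogy only, not Plain English code; the function names and the size table are hypothetical.

```python
def size_name(length):
    """CASE-like selection written as a flat series of IFs with early exits."""
    if length == 1:
        return "byte"
    if length == 2:
        return "word"
    if length == 4:
        return "double word"
    return "unknown"

def is_supported_size(length):
    """Boolean routine: each test either says yes at once or falls through."""
    if length == 1:
        return True   # analogue of "say yes"
    if length == 2:
        return True
    if length == 4:
        return True
    return False      # analogue of "say no"

print(size_name(4), is_supported_size(3))
```

Because every matching test leaves the routine immediately, no ELSEIF chain or nesting is needed to keep the cases mutually exclusive.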
Please keep three things in mind:
1. Our Plain English system is a minimalist system; we put into it only those things that were absolutely necessary to see if the idea was sound.
2. We have found this minimal system to be both convenient to use and efficient in operation. In short, we haven't found ourselves "wishing" we had nested IFs or more keywords. Quite the contrary; we find the simplicity of the language not only refreshing, but a real advantage when it comes to teaching children how to program.
3. Plain English is not like other languages in other ways, as well. For example:
a. We do not give artificial, proper-noun names to local variables, just as we do not give artificial names to most of the real-life objects around us (the chair, the mouse, the screen). We simply use the normal descriptive identifier: the chair, the mouse, the screen.
b. We actively encourage the programmer to extend the language -- vocabulary and grammar -- to suit his personal needs. For example, instead of a routine with a header like this...
...which can be called in only one way, we recommend "a header with variations" more like this...
...which can be called in a variety of semantically equivalent ways. The language thus grows in both ease of use and in "understanding of the local dialect" as a programmer (or a community of programmers) includes more of what naturally "pops into his head."
It's kind of like FORTH on steroids for people who have never programmed before.
The whole system, including the interface, the language, and the standard library is described in this 120-page manual (written as if the compiler itself were speaking, imitating his idol, the HAL 9000):
www.osmosian.com/instructions.pdf
Ray
The size gives an indication of how this might port to Propeller.
P1 has 8 small cores, and there is a 'bump' in the design approach when code cannot fit into one core, and another 'bump' when you exceed the chip's 32k total RAM.
If you want to just generate binaries (not self host) things are simpler, but you still have the choice of native or byte code approach.
If you have carefully used a 25-opcode subset of Intel, maybe that can port to a byte-code engine, but you still have the P1 chip's 32k RAM limit.
P2 is a whole different story. It may even be able to self-host this, and this sounds broadly similar to Project Oberon (also a small IDE that builds itself and is its own compact 'operating system'). Example HW has 1MB of RAM.
P2 targets 512k of system memory, and can execute code from that memory. For that, you would look to map those 25 Intel opcodes directly to P2 ones.
The size of the IDE / compiler is one thing. It can run on a PC or whatever so who cares about that?
The size of the produced executable code is another thing. Unless it requires the IDE/compiler/run time to actually run.
It's that latter size that is interesting. Is there any chance of a compiled executable running on a P1 or P2?
Do you have a Linux version of this compiler so that I could try it out here?
Still looking forward to seeing how that fibo function looks in Plain English Programming. And how big is the executable?
I need to learn more about the COGS to answer that. In general, we would use them as they are normally used; only the "interface" would be Plain English.
We greatly admire the work of Niklaus Wirth, especially his Oberon system. I'm going to have to look more closely at P2 as far as self-hosting goes. It would be great to have a system that was Plain English, top to bottom, without the need for ugly stuff like Windows or Linux underneath.
Running it under wine here without trouble.
Our executables are very tight and do not require run time libraries (except, of course, for the functions provided by Windows itself).
That appears to be the question. Our IDE can be significantly scaled back in size -- for example, it includes an entire wysiwyg page editor that we use to produce documentation and that is almost as big as the compiler itself. It also includes a PDF generator and other such stuff that could be left out.
Sorry, Windows only.
796 KB.
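For a sense of scale, the sizes quoted in this thread can be compared directly. This is only back-of-the-envelope arithmetic: it assumes "K" means 1024 bytes throughout, and an x86 executable's size says nothing certain about what a Propeller build would need.

```python
KB = 1024  # assumption: "K" means 1024 bytes for every figure below

EXE_SIZE = 796 * KB        # Windows executable size quoted in this thread
P1_TOTAL_RAM = 32 * KB     # "32k total RAM" quoted for the P1
P2_TARGET_RAM = 512 * KB   # "512k of system memory" quoted for the P2

def times_over(size, capacity):
    """How many times larger the program is than the available memory."""
    return size / capacity

print(f"vs P1: {times_over(EXE_SIZE, P1_TOTAL_RAM):.1f}x")
print(f"vs P2: {times_over(EXE_SIZE, P2_TARGET_RAM):.2f}x")
```

As it stands, the executable is roughly 25 times the P1's total RAM and about 1.5 times the P2's target memory, so self-hosting would depend on how far the IDE can be trimmed.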
As I mentioned before, one would not normally use a natural language to describe the fibo function; the most natural way to describe it, therefore, is in some artificial mathematical syntax (as you have done). A similar case is found where we want to describe the actual bits that add a byte to another byte on an Intel CPU. In Plain English, the code looks like this:
Natural language where it is the obvious choice, hexadecimal notation where it is more convenient. So your fibo function would probably be implemented something like this:
Our compiler doesn't currently support C as a sub-language, primarily because we didn't experience the need for such a thing -- as I mentioned above, less than 2% of our code is mathematical in nature (and that's typical for most applications). But surely you can see that it's easier to add a C-language sub-compiler to a Plain English compiler rather than the other way around.
I think you would be well served by removing the vulgarity. I get that this is light hearted and fun, but there are certain sentences that just don't belong... especially if this is meant for kids. Top of page five and the second paragraph on page 12 of instructions.pdf have caught my eyes already as things that could be toned down a bit.
Also, the font is cute for the first 60 seconds. It's getting really old, really fast though. Maybe use it in the title and headings if you don't want to get rid of it completely.
The issue you will hit here is the small size of just 500-odd opcodes per COG.
That usually means two code generators: one generates native opcodes and has a size limit, but can be used to place small independent real-time programs into up to 7 COGS.
The 2nd code generator generates byte-code that runs on an engine in the 8th COG. Here, you would likely emulate the 25 Intel opcodes (or some size-tuned variant?), which should fit within the 500 limit?
Speed is lower, but the code size it can handle is quite a lot larger.
If this engine can optionally read from 2 x QuadSPI memories, then you can avoid the 32k limit.
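The byte-code approach described above can be sketched as a tiny interpreter loop. The four opcodes here are made up for illustration; they are not the 25 Intel instructions the compiler uses, and a real engine would be written in Propeller assembly rather than Python.

```python
# Hypothetical opcodes, invented for this sketch.
PUSH, ADD, SUB, HALT = range(4)

def run(code, max_steps=10_000):
    """Fetch, decode, and execute byte-code until HALT; return the stack."""
    stack = []
    pc = 0  # index of the next opcode in the byte-code list
    for _ in range(max_steps):
        op = code[pc]
        if op == PUSH:
            stack.append(code[pc + 1])  # operand follows the opcode
            pc += 2
        elif op == ADD:
            b = stack.pop()
            a = stack.pop()
            stack.append(a + b)
            pc += 1
        elif op == SUB:
            b = stack.pop()
            a = stack.pop()
            stack.append(a - b)
            pc += 1
        elif op == HALT:
            return stack
        else:
            raise ValueError(f"unknown opcode {op} at {pc}")
    raise RuntimeError("step limit reached without HALT")

print(run([PUSH, 2, PUSH, 3, ADD, HALT]))
```

The engine itself stays small and fixed in size, while the byte-code it reads can live in larger, slower memory. That is the trade-off being described: lower speed in exchange for more room.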
It's best to follow the instructions as they appear in the manual; the sample application found there is much broader and deeper than a mere "Hello, World!" teaser and all will become clear as you progress. But if you insist:
1. Make a new directory for your project. Put it wherever you want and call it whatever you like. The New Directory command, as you might expect, is under the "N" menu.
2. Copy "The Noodle" from the CAL-3040 directory to your project directory. This is how you make all of the standard library code part of your project.
3. Create a new text file for your code in the same directory. Call it whatever you want, but do NOT put an extension (like .txt or .src) on it. The compiler only compiles files with NO extension.
4. Open your text file and enter the following code:
To run:
Start up.
Write "Hello, World!".
Wait for 3 seconds.
Shut down.
5. Execute the Run command. It's under "R".
Agreed. The manual is kind of a mixed bag regarding the intended audience. When I wrote it ten years ago I didn't really know who might be reading it.
We developed that font for three reasons: (1) we wanted a font that we owned so we could include it with the package to make sure the system would look and feel the same on everyone's computer; (2) we wanted everyone to know, right off the bat, that Plain English is different; and (3) it was a tribute to my elder son and co-developer (the font is a digitized version of his very own handwriting).
You can change the font in the editor if you want to use a different font when you're programming. I would suggest something like "Times New Roman," however, since you'll find -- after you've gotten used to it -- that a proportionally-spaced font works best with Plain English (rather than the mono-spaced fonts that are typically used with mathematically-inspired languages).
1. Obviously, hosting Plain English on a Propeller is a different matter than simply having our compiler generate code suitable for downloading to a Propeller. To make the system usable we'd need not just a basic Propeller board, but something with at least a video output and inputs for a keyboard and a mouse. And by the time we got all that together, both the cost and the complexity of the system would make me start to wonder if a Raspberry Pi -- or even a cheap Windows laptop, since the compiler already runs on Windows -- wouldn't be a more economical and effective programming setup.
2. So what about keeping Plain English on the PC, and downloading to the Propeller? Well, it's true that the Propeller is a remarkable machine, especially in the way that it eliminates interrupts and makes programs completely deterministic. I like that. But it does seem to be rather specialized. Most applications can cope with interrupts, and most applications do not require strict determinism. And because these features are part of the hardware, even when such features are needed, it's not often that the machine exactly fits the bill -- how many applications, for example, require exactly eight COGS (and not seven or nine)? Hmm...
3. Some years ago I developed a restaurant system with "transmitters" on the tables for calling a server, and "dispatchers" on the walls so the servers could ask the system where they were needed. The transmitters were radio devices and used a small PIC chip, and the dispatchers were hard-wired and designed around a Zilog Z8. Here's a photo of one of the dispatchers:
A lot of work (and money) went into their development and packaging. So I'm asking myself, if I had to do that system today, would I do it the same way? Probably not. I'd probably get a bunch of cheap tablets that I could use as both transmitters and dispatchers, connect them together with WiFi or Bluetooth or something, and program them from a PC running Plain English. Overkill, no doubt; but in today's market place, I'm afraid, a viable solution -- especially for the small entrepreneur attempting to develop a prototype or proof-of-concept system.
4. I'm not saying that there's no longer a place for microcontrollers. But it does seem like the tablet-style overkill I just described may soon be the way most hobbyists and prototypers will be going. For example, this guy...
...wanted to make a LEGO Rubik's Cube Solver and he chose to use a phone for his microcontroller. Probably because it was a simple and inexpensive way to get a super-fast CPU, gigs of memory, a display, touch-screen controls, and a camera, all in a package just a little bigger than a sticky-note pad. Goodness! That's a solution that's hard to compete with.
5. I think you can see where this is going, at least in my mind. I'm beginning to think the future of Plain English lies elsewhere. But I welcome any further thoughts you folks might have, and I thank you for all your insights so far.
Clearly Plain English Programming is intended for the not so mathematically sophisticated. Those new to programming who would like to get their computer to do something.
That is exactly in line with the purpose of the Raspberry Pi.
Given that the Pi has now sold getting on for 10 million units and many of those will be in the hands of youngsters new to computing, and hobbyists and in schools etc, it's a no brainer to suggest that the Pi is where you should be putting your efforts.
First order of the day is to get the thing to run under Linux. Then generate ARM executables.
That's not to say we would not like to see it generate code for the Propeller but my feeling is that it's not going to work out on the current Propeller. It may work on the P2 but that does not exist yet. Realistically the audience for such a thing is vastly bigger in the Raspberry Pi world, and indeed Linux in general.
If the idea is ever going to fly at all the Pi world is certainly a good place to try it out.
You have written a lot above and it's very interesting. Of course I have to take issue with much of it, but I'm going to have to read it over again carefully before commenting.
No, don't go!! This is really interesting.
I think cogs could be even easier. I've done a bit of coding in assembly - 8080, z80, cogs, other code, and usually what it involves is learning some mnemonics and then writing them down and then writing a long version in Plain English next to that line of code to explain what is actually happening;
mov a,b ; move the contents of the b register to the a register
add a,c ; add the contents of the c register to the a register and store in the a register
Well, in the Plain English language, do it exactly the same, just leave out the mnemonics!
It might even make porting code between processors even easier - everyone understands what registers are, but every processor seems to use different mnemonics.
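As a toy illustration of that idea, the mapping from a mnemonic line to its English comment can be generated from a table. This sketch assumes the destination-first operand order used in the two examples above; the `TEMPLATES` table and `explain` function are hypothetical and cover only those two instructions.

```python
# Hypothetical sentence templates for two generic two-operand instructions.
TEMPLATES = {
    "mov": "Move the contents of the {src} register to the {dst} register.",
    "add": "Add the contents of the {src} register to the {dst} register.",
}

def explain(line):
    """Turn a line such as 'mov a,b ; comment' into an English sentence."""
    instruction = line.split(";")[0].strip()      # drop any trailing comment
    mnemonic, operands = instruction.split(None, 1)
    dst, src = [part.strip() for part in operands.split(",")]
    return TEMPLATES[mnemonic.lower()].format(src=src, dst=dst)

print(explain("mov a,b"))
print(explain("add a,c ; old comment is ignored"))
```

Going the other way, from the English sentence to a particular processor's instruction, is the harder direction, since each target would need its own table.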
Surprisingly, that's easier said than done. Unlike Windows, Linux is a "moving target." For example, I can go here...
https://msdn.microsoft.com/en-us/library/windows/desktop/ff818516(v=vs.85).aspx
...pick out a small, well-documented subset of the system, and write a program that, with no installation at all, will run and look and feel and behave exactly the same on all versions of Windows. I can't do that with Linux. There's no useful subset of the thing that is guaranteed to be the same on all Linux computers. There's no authoritative source for documentation. There's no universal API.
So when you say, "get the thing to run under Linux," do you mean Debian-based, Knoppix-based, Ubuntu-based, Gentoo-based, Pacman-based, Arch-based, RPM-based, Fedora-based, RHEL-based, Mandriva-based, openSUSE-based, Slackware-based, or some other version of "Linux"? And which API were you thinking we'd be employing for our graphics, printer, and other user-interface stuff: Unity? Gnome? KDE? Xfce? Cinnamon? MATE? LXDE? XMonad?
Don't get me wrong; I'm no big fan of the "cathedral." But I'm also no fan of the "bazaar." I personally prefer the "monastery" -- where an inspired abbot and a small group of enthusiastic monks work together to produce something of lasting value. Kind of like Apple during the Pascal-based Macintosh years.
Yes, that's exactly the idea. Make the "comments" and the "pseudocode" the actual code. We've essentially done that on Windows, and it turns out it's not that hard to do -- if the target, like Windows, (a) isn't a moving target, and (b) is useful and popular enough to justify the development effort.
I think we have to get a little "higher level" than registers to achieve true portability since the number and size (and even existence!) of registers is not the same on every machine. But yes, Plain English code, I predict, will one day be the most used and most transportable code of all. We can see it happening already: I can ask Plain English questions and issue Plain English commands like "What day is it?" and "What's the temperature?" and "Play a song by Paul Simon" to systems as different as Microsoft's Cortana, Apple's Siri, Amazon's Echo, and even Google's web browser (with an extra click) and get the correct response from all four. Now if such systems not only had Plain English interfaces, but were actually programmed in Plain English... we'd be done!
As a practical matter, what am I suggesting?...
1) A program like a compiler is basically a simple thing, source text in, binary out. This can be usable from the command line with whatever parameters it needs.
So, for example, such a thing can be written in C/C++. If it uses nothing but the C/C++ standard libraries or perhaps POSIX APIs, it will be buildable and usable on pretty much every platform in existence. Contrary to your "moving target," these APIs are standardized and have been very stable over many years.
Of course it need not be C/C++. What about Pascal, see Free Pascal, or many other languages that are available on many platforms?
2) No doubt you want a GUI. For an IDE or whatever. This is a solved problem. My favourite is the Qt GUI toolkit. Or for Pascal there is Lazarus.
As a testament to all this Parallax offers Spin and C/C++ compilers and IDEs that work on Windows, Mac and Linux. See PropellerIDE and SimpleIDE.
In short, I take a different view. To build "lasting" value you need to be cross platform. So that your creation can live on for many years into the future despite the comings and goings of platforms. I would suggest this is even more important for programming language systems where users will invest considerable time and effort using it for their own creations which they in turn want to be long lived.
The Windows world is not a "monastery". "Shopping Mall" or "Prison" perhaps.
First step is to get the source of Plain English programming up onto GitHub or Bitbucket or wherever.
PropBASIC gets pretty darn close to "plain English" but without the tedious typing. At least, this lazy programmer would find the typing to be tedious.
I am all about self-documenting code, mind.
Again, I don't recognize the program you're talking about. One of the goals of our system is to demonstrate that low-level programs (like compilers) can be conveniently and efficiently written in high-level languages (like English). How can we show that we've accomplished that if the thing has been re-created in C/C++? Another goal was to create a language suitable for both the extreme novice writing "Hello, World!" and the professional writing IDEs and compilers and page-layout facilities. That, again, gets lost in the translation to C/C++ (languages that are hardly suitable for the uninitiated).
See above. The only acceptable source language for a Plain English compiler is Plain English. One must practice what one preaches, yes?
We don't just want a GUI, or your favorite GUI, or someone else's preferred GUI, we want the same GUI everywhere so we only have to write our code once.
Sure, it can be done -- at significant cost to the producer. And it makes everything less clear and convenient for the consumer: just look at the download page on this very site. Such cross-platform support may be a business necessity for some; it's not for us (because we're trying to make a point rather than trying to make money). Personally, I wish they had spent more time on making this forum wysiwyg rather than the edit-here-preview-there-come-back-to-make-corrections thing that it currently is. That's the kind of user-interface design I would have expected in the 1970's on DOS -- not 16 years into the 21st century!
Agreed. And you're entitled to that view.
I think Steve Jobs would have disagreed. The war is over and Apple won. The most closed, least-cross-platform company is the one that is now worth much, much more than Microsoft and all the Linux vendors put together.
The original version of our compiler, written over ten years ago, still runs and is still relevant today. No changes. That's remarkable longevity in the computer industry. And while such longevity may be achieved using the methods you're advocating, we achieved it employing an entirely different approach. Though I'm pretty sure a great idea would survive either way if the product retained its identity in both approaches.
As long as our program still runs, the programs coded with our system will also run. I'm not sure one can ask for more than that.
Windows is (and has always been) the cathedral. Apple, in the salad days of the Pascal-based Mac, was the monastery. The Osmosian Order of Plain English Programmers is a smaller monastery whose time has not yet arrived.
Had it, done it, been there. https://github.com/Folds/osmosian/issues/4