Adventures in interpreters

Spork Frog · 2008-07-29 01:19

Making an interpreter has always been one of those things I've wanted to do. I've always wanted to try and do a full QBasic 1.0 on the Prop, as far out and hard as that might be. I realized that I need a starting place, however.

Then I came across brainf*ck. It's simple enough, only 8 tokens, all whitespace ignored, no variables, no nothing. It's used a lot in obfuscated coding challenges.

After a couple hours of work and a lot of stupid mistakes, I have a mostly working interpreter (I believe it's correctly interpreting all 7 of the symbols I've implemented), with the "," operator not yet implemented. It's supposed to have a 30,000-byte stack, but there's not going to be enough room for that in the Prop's 32 KB of hub RAM. I'll take as much as I can.
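For anyone who hasn't met the language, here is a rough sketch of how the eight symbols act on a byte array and a pointer. It is written in Python for readability and is not the Prop code from this post. The names `run_bf`, `input_bytes` and `tape_size` are made up for illustration, and reading 0 at end of input is an assumption (implementations differ on that). The pointer is not range-checked in this sketch.

```python
def run_bf(program, input_bytes=b"", tape_size=30000):
    """Interpret a brainf*ck program and return its output as bytes."""
    # Pre-compute matching bracket positions so loops jump in one step.
    jump = {}
    open_stack = []
    for pos, ch in enumerate(program):
        if ch == "[":
            open_stack.append(pos)
        elif ch == "]":
            if not open_stack:
                raise ValueError("unmatched ] at %d" % pos)
            start = open_stack.pop()
            jump[start] = pos
            jump[pos] = start
    if open_stack:
        raise ValueError("unmatched [ at %d" % open_stack[-1])

    tape = bytearray(tape_size)   # the cell array, all zeros
    ptr = 0                       # data pointer
    pc = 0                        # position in the program text
    in_pos = 0                    # next unread input byte
    out = bytearray()
    while pc < len(program):
        ch = program[pc]
        if ch == ">":
            ptr += 1
        elif ch == "<":
            ptr -= 1
        elif ch == "+":
            tape[ptr] = (tape[ptr] + 1) & 0xFF   # cells wrap at 8 bits
        elif ch == "-":
            tape[ptr] = (tape[ptr] - 1) & 0xFF
        elif ch == ".":
            out.append(tape[ptr])
        elif ch == ",":
            if in_pos < len(input_bytes):
                tape[ptr] = input_bytes[in_pos]
                in_pos += 1
            else:
                tape[ptr] = 0   # assumption: end of input reads as 0
        elif ch == "[" and tape[ptr] == 0:
            pc = jump[pc]       # skip the loop body
        elif ch == "]" and tape[ptr] != 0:
            pc = jump[pc]       # go back to the start of the loop
        # Any other character (whitespace, comments) is ignored.
        pc += 1
    return bytes(out)
```

With this sketch, `run_bf("++++++++[>++++++++<-]>+.")` returns `b"A"` (8 times 8 plus 1 is 65).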

Still have a few bugs to chase down yet, but I figure some people might enjoy playing with it as is.

VERSION 0.5

Current open bugs:
-Uses 13 as newline instead of 10 like most implementations.

Unimplemented features:
-Stack overrun detector

I'm as always very open to feature requests and bug reports.

Post Edited (Spork Frog) : 7/29/2008 2:45:17 PM GMT

Spork Frog · 2008-07-29 14:41

Version 0.5. I guess this could be a 1.0 release, but I haven't done enough testing yet to tell that.

Found out that most brainf*ck implementations use 10 as newline instead of 13 like most everything on the Prop does, have to fix that yet. Also need to add a stack overflow detector.
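One possible shape for the overflow detector, sketched in Python rather than Spin: route every `<` and `>` through a single range check. `TapeOverrun` and `move_pointer` are hypothetical names, and raising an error (instead of wrapping the pointer around) is just one choice an interpreter could make.

```python
class TapeOverrun(Exception):
    """Raised when the data pointer would leave the cell array."""


def move_pointer(ptr, delta, tape_size):
    """Return the new pointer position, or raise TapeOverrun if out of range."""
    new_ptr = ptr + delta
    if not 0 <= new_ptr < tape_size:
        raise TapeOverrun("pointer %d outside 0..%d" % (new_ptr, tape_size - 1))
    return new_ptr
```

Failing loudly makes buggy programs easier to diagnose than silently corrupting whatever sits next to the array in memory.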

Added the "," symbol and fixed the other open bug, changed the stack to an array.

hippy · 2008-07-29 14:52

BF is an interesting challenge and a lot of fun. I'm not particularly fond of it as a language but it does give a good introduction which covers most bases when you come to do more complex interpretation. You have to start somewhere and BF is as good a place as any.

Cluso99 · 2008-07-29 15:14

10 ($0A) is <CR> carriage return which should just return to the beginning of line.
13 ($0D) is <LF> which means go to the next line, but may not necessarily return to the beginning of the line (depending on implementation).
Usually therefore <cr> + <lf> is issued. Usually <lf> will suffice, like on HyperTerminal and PST.

ps Please change the name.

hippy · 2008-07-29 17:41

Actually ...

10 ($0A) is LF
13 ($0D) is CR

What's "newline" ? Whatever you want it to be

heater · 2008-07-29 19:22

Cluso99: That's like asking to change the name of C or Pascal (or English); it's defined already.

Spork Frog: Don't forget that Windows and Unix (Linux etc.) have different ideas about what a newline is.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
For me, the past is not over yet.

mikestefoy · 2008-07-29 19:42

but foul words are still foul words.

" (expletive)" is bad Anglo Saxon swearing.

I agree with Cluso99.

swearing and foul words have no place here or in software.

I always hated the FCUK brand in the UK.

if I named a language Bin Ladin would you also be so neutral ?

Mike

mikestefoy · 2008-07-29 19:46

Isn't it funny.

This Parallax board has just removed "ffuucckk" as an " (expletive)", but seems to allow "brain(expletive)".

heater · 2008-07-29 20:03

mikestefoy: Actually I agree with you. I have the same instinctive reaction to these things. On the other hand we have to recognize that these "standards" are changing all the time. FCUK may be a shocking marketing tactic to get your attention today but for your children it's going to be really dull, if they get the joke at all.
It's always interesting to come across the decisions of censors of films, books etc from fifty or so years ago. They banned things that no one would bat an eye at nowadays.

As for the language interpreter we cannot mention, it may be aptly named. Perhaps trying to write any useful program in it does something bad to your brain :)


hippy · 2008-07-29 20:18

Foul words are whatever we choose them to be. They have no special attributes other than some people don't like them, others use them as part of their everyday conversation. I'm personally of the opinion use them frequently and they become just words like every other.

I was once taken by surprise with an episode of the Married with Children TV sit-com and the classic line, "My mother always told me never to marry a Wanker". I didn't know at the time that Wanker was a reasonably common surname in the US. It means something quite different in the UK.

I'd not take offence at a programming language called "bin Ladin", but would recommend a TLA ( three letter acronym ) as being more traditional in computing circles which conveniently gives OBL or UBL depending on one's transliteration of Arabic. Which rather proves how arbitrary offence is; some might be offended at one but not the other.

Five down, five across. Some people will wet themselves, others won't understand ...

www.electraisd.net/alumni/display_class.aspx?y=1993

mikestefoy · 2008-07-29 20:28

@hippy. foul words are foul words, not what we choose them to be.

I could say the same for the horrible word "Cxxt".

I really really think that you are wrong. If your daughter came home and said to you "you are a xxxx", would you smile and say "Foul words are whatever we choose them to be. They have no special attributes other than some people don't like them, others use them as part of their everyday conversation. I'm personally of the opinion use them frequently and they become just words like every other."?

I seem to get involved in these off-topic discussions. Maybe it's because I was raised by Victorian grandparents.

Mike

Coley · 2008-07-29 21:04

hippy said...

Five down, five across. Some people will wet themselves, others won't understand ...

www.electraisd.net/alumni/display_class.aspx?y=1993

ROFL, that's got to be someone taking the mickey, surely

Nice one hippy!

As for the whole expletive thing they are just words, some people choose more appropriate ones than others but I don't worry about it

I'm sure it wasn't meant in an abusive context.

Regards,

Coley

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
PropGFX Forums - The home of the Hybrid Development System and PropGFX Lite

Beanie2k · 2008-07-29 21:12

But what is considered "foul" can change with time and place. Growing up in the 1960's I can remember when the words "damn" and "hell" were not allowed in print or on the air except in a religious context ("If we do not repent and turn to The Lord we will surely be danged to the eternal fires of heck!!!" just didn't cut it). On the other hand a certain racial term was once common yet now is strictly forbidden. Hippy gave a good example of where differences occur with regard to place. There are many other examples of both phenomena. Currently the 'f' word in question seems to be in a state of transition with younger and more progressive people accepting it while older and more conservative people still find it offensive.

While we are on this off-topic topic: For those of you in the UK what is the status of the word "drat"? Here in the USA it is no worse than "phooey" but I know in the UK it used to be a very offensive swear word. However I've heard that this has changed lately.

Post Edited (Beanie2k) : 7/29/2008 10:00:26 PM GMT

Coley · 2008-07-29 21:19

'drat' I must admit I have _never_ heard this used in the UK as a swear word, it's something Dick Dastardly would say isn't it...

and the only 'phooey' I remember was a mild mannered janitor called Henry......

LOL

Regards,

Coley


Cluso99 · 2008-07-30 04:28

@Hippy. Of course you are right $0A = <LF> and $0D = <CR>

My comment was based on the fact that Spork Frog said there was an open bug using 13 instead of 10. It is more common to use 13 ($0D). Sometimes $0A is required first. Nothing about changing the spec - that's the way it's been for 30+ years. In the 60's $0A was just a line feed (no carriage return) and $0D was just a carriage return. The terminals required time to implement both these operations (with cams etc).

Phil Pilgrim (PhiPi) · 2008-07-30 06:14

This is one of those things that's maddeningly OS-dependent. For Linux/Unix, LF is the newline; for MS-DOS/Windows, it's CRLF; and for classic Mac OS (before OS X), just a CR. OS X itself uses LF, like other Unix systems. Normally, it doesn't matter. But sometimes files formatted for MS OSes show up double-spaced in other OSes. (I typically just use LF in my PC programming and text file editing.)
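When a text file of unknown origin has to be fed to a program that expects a single convention, one common approach is to fold all three line endings down to LF first. A minimal Python sketch (the function name `normalize_newlines` is just for illustration):

```python
def normalize_newlines(data):
    """Convert CRLF and lone CR line endings in a bytes object to LF."""
    # Replace CRLF first so its CR is not converted a second time.
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
```

The order of the two replacements matters: doing the lone-CR replacement first would turn every CRLF into two line feeds, which is exactly the double-spacing effect described above.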

Having two different characters for this stems from early teletype machines, wherein a CR (or its Baudot equivalent) returned the carriage to the home position, and a LF (or its Baudot equivalent) rolled the paper up. In TTY links that were unreliable, some machines could be configured to advance the paper automatically when a CR was received and/or return the carriage when a LF was received. This helped to eliminate accidental overprinting or running off the right edge of the page. (Microsoft, in clinging to the old TTY standard, apparently thought their systems unreliable enough to require two line-end characters, just in case. 'Kinda like wearing a belt and suspenders!)

-Phil

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
'Still some PropSTICK Kit bare PCBs left!

Spork Frog · 2008-07-30 15:13

I have written down to change it to 10 as newline (meaning what would be a CR/LF on Windows machines, LF on Linux) because that's what almost all implementations of the language use for both newline and as a "return" or "enter" on the input, regardless of what platform they run on. Most programs also depend on this, so for the sake of keeping compatibility with existing programs, I'm going to change it to that.
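One way that change could look, sketched in Python rather than Spin: translate single bytes at the edges of the interpreter so the program itself only ever sees 10. The names `translate_input` and `translate_output` are hypothetical, and the output direction assumes the display or terminal object wants 13 for a new line, which may not hold for every setup.

```python
CR = 13  # what the keyboard/terminal sends for Enter (per this thread)
LF = 10  # the newline most brainf*ck programs expect


def translate_input(byte):
    """Byte from the keyboard on its way to the ',' command."""
    return LF if byte == CR else byte


def translate_output(byte):
    """Byte from the '.' command on its way to the display.

    Assumption: the display expects 13 as its newline.
    """
    return CR if byte == LF else byte
```

Keeping the translation at the I/O boundary means existing programs run unmodified, and only these two functions need to change if the terminal's convention differs.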
