Compiler frustrations

Phil Pilgrim (PhiPi) · 2009-05-25 06:24

I have a simple task: construct a table in the DAT section, each entry of which contains a fixed or variable-length ID string, followed by several predefined longs. 'Sounds easy, right?

First with the fixed-length strings:

testsuite     long      "A15 ", 1 << 15, 0, 5000, 0, 0, 1000, 2000

Well, that's a bad idea: each character in the string "A15 " takes up its own long.

How about:

testsuite     long      byte "A15 ", 1 << 15, 0, 5000, 0, 0, 1000, 2000

Nope. The compiler flags byte and reports an error: "Size override must be larger." Why, for cripes sake? This seems like a perfectly logical use of an override to me.

Okay, then, how about:

testsuite     byte      "A15 ", long 1 << 15, 0, 5000, 0, 0, 1000, 2000

No again. The long only applies to the "1 << 15"; the other arguments are truncated to bytes.

Well, then (grasping at straws):

testsuite     byte      "A15 ", long (1 << 15, 0, 5000, 0, 0, 1000, 2000)

Nice idea, but no: overrides can't be grouped.

This works, of course:

testsuite     byte      "A15 ", long 1 << 15, long 0, long 5000, long 0, long 0, long 1000, long 2000

But it's gonna be a big table. Who would want to type all those "long"s? Besides, I might leave one out and screw everything up.

This works, too:

testsuite     byte      "A15 "
              long      1 << 15, 0, 5000, 0, 0, 1000, 2000

But the table not only won't be as readable, it'll be twice as long on the screen.

You know what I'd like? A string constant type that packs four characters into a long, padding with zeroes, if necessary: e.g. pack("ABC") == $00434241.
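For illustration only, here is what the wished-for pack() would compute, sketched in Python rather than Spin. No such constructor exists in the compiler; the function name and the optional size parameter are hypothetical, modeled on the example above. Longs on the Propeller are stored little-endian, so the first character lands in the low byte:

```python
def pack(text: str, size: int = 4) -> int:
    """Pack up to `size` ASCII characters into one little-endian value,
    zero-padding the unused high bytes (hypothetical helper, not a Spin feature)."""
    data = text.encode("ascii")
    if len(data) > size:
        raise ValueError(f"{text!r} does not fit in {size} bytes")
    return int.from_bytes(data.ljust(size, b"\0"), "little")

print(f"${pack('ABC'):08X}")   # the example from the post: $00434241
```

The same helper with size=2 would cover the word case.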

Okay, enough with fixed-length strings. Variable-length strings would be even nicer:

testsuite     long      string("A15 Test"),1 << 15, 0, 5000, 0, 0, 1000, 2000

Sorry. The compiler flags "string" and says "Expected a constant, unary operator or '('." I thought strings were constants. Why can't it stuff a zero-terminated string somewhere like it does in Spin code and poke its address into the table? Of course, the rules for handling @variable items in DAT sections would have to apply, since the absolute address isn't known at compile time, but so what?

Sure, I could do:

testsuite     long      @id0,1 << 15, 0, 5000, 0, 0, 1000, 2000
              '...
id0           byte      "A15 Test", 0

But why should I have to? Besides, it ruins the readability of the table.

This all makes me very grumpy. I love programming, and I don't like feeling grumpy when I'm doing something I love. Grrr!

-Phil

Post Edited (Phil Pilgrim (PhiPi)) : 5/25/2009 7:22:07 AM GMT

MagIO2 · 2009-05-25 07:45

hmmm ... didn't you read Beau's posts recently? ;o)

testsuite long
          byte "A15 "
          long 1<<15, 0, 5000, 0, 0, 1000, 2000

Phil Pilgrim (PhiPi) · 2009-05-25 08:09

That's pretty much one of my examples above. My objective is to get an entire record on one line, since there will be so many of them. I don't have many peeves with Spin and assembly, but one of them is a profound lack of expressiveness when it comes to compile-time constructors for strings and other data structures. It needs to be better.

-Phil

MagIO2 · 2009-05-25 08:19

Then maybe
testsuite long byte "A15 " long 1<<15, 0, 5000, 0, 0, 1000, 2000
? This would allow you to access the string as a long, as the label itself is declared as long. (Not able to test it currently)
PS: I see .. this is the last version you mentioned .. couldn't see the trees because of the wood ;o)

PhiPi said...
How about:
testsuite     long      byte "A15 ", 1 << 15, 0, 5000, 0, 0, 1000, 2000
Nope. The compiler flags byte and reports an error: "Size override must be larger." Why, for cripes sake? This seems like a perfectly logical use of an override to me.

Guess your problem here is that you switched to byte and did not switch back to long. 1<<15 of course is too big for a byte.

But you are right ... at least having macros would have helped here to write more readable code.

Post Edited (MagIO2) : 5/25/2009 8:38:05 AM GMT

Ale · 2009-05-25 08:22

Phil:

I know what you mean. I do not think any compiler (asm compiler) will accept that, mixed long/byte constants like what you want. With a macro assembler I would solve the problem using a macro.
But you could write a small <insert preferred scripting language here> script to convert your input to the desired output. I know it is two steps, but if the table is just one table with little changes... It is what I end up doing normally :-(, at least for pasm. binutils' as, on the other hand, supports a nice preprocessor like cpp. Now that could be an idea: did you try cpp as a preprocessor? It may work...
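A minimal sketch of such a script, in Python. The one-line input syntax with the keyword `rec` is made up for this example; it is not something the Propeller Tool understands. The script only rewrites each such line into the two-line byte/long form that does compile:

```python
import re

# Hypothetical input syntax: label, the made-up keyword `rec`, a quoted ID, then the longs.
RECORD = re.compile(r'^(\w+)\s+rec\s+("[^"]*")\s*,\s*(.+)$')

def expand(line: str) -> str:
    """Rewrite one `rec` line as a byte line plus a long line; pass other lines through."""
    m = RECORD.match(line)
    if not m:
        return line
    label, ident, longs = m.groups()
    pad = " " * 14
    return f"{label:<13} byte      {ident}\n{pad}long      {longs}"

source = 'testsuite     rec "A15 ", 1 << 15, 0, 5000, 0, 0, 1000, 2000'
print(expand(source))
```

The table stays one record per line in the file you edit, and the expanded output is what gets handed to the compiler.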

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Visit the home of pPropQL, pPropQL020 and OMU for the pPropQL/020 at omnibus.uni-freiburg.de/~rp92

Phil Pilgrim (PhiPi) · 2009-05-25 08:35

MagIO2 said...
Guess your problem here is that you switched to byte and did not switch back to long. 1<<15 of course is too big for a byte.

No, actually the problem that I illustrated is that you cannot override a long with a byte: it gives a syntax error. Also, in point of fact, overrides do switch back automatically, which is another factor I pointed out in my post. Finally, the compiler doesn't care whether a constant doesn't fit the stated size: it just truncates it.
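The two behaviors described here (an override covers a single item, and oversized constants are silently truncated) can be modeled in a few lines of Python. This is a toy model of the behavior as reported in this thread, not the compiler's actual code; the emit function and its item format are invented for the illustration:

```python
SIZES = {"byte": 1, "word": 2, "long": 4}

def emit(default: str, items) -> bytes:
    """Each item is (value, override or None). An override covers one item only;
    every value is masked to its size rather than rejected."""
    out = bytearray()
    for value, override in items:
        n = SIZES[override or default]
        out += (value & ((1 << (8 * n)) - 1)).to_bytes(n, "little")
    return bytes(out)

# Models: byte "A", long 1 << 15, 5000
# The 5000 falls back to byte size and keeps only its low 8 bits ($88).
image = emit("byte", [(ord("A"), None), (1 << 15, "long"), (5000, None)])
print(image.hex(" "))   # 41 00 80 00 00 88
```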

Here is what I would like to see:

1. A constructor to pack character strings in situ into words and longs.

2. Automatic handling of strings in the DAT section.

3. Ability to override large data types with small ones and back again. There's no reason the compiler can't use a zero filler for any alignment gaps that result.
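As a sketch of what item 3 asks for, the following Python models a layout pass that switches freely between sizes and zero-fills any alignment gap. It shows the wished-for behavior only; the layout function and its input format are assumptions made for this example:

```python
SIZES = {"byte": 1, "word": 2, "long": 4}

def layout(fields) -> bytes:
    """fields: list of (size_name, [values]). Before each group, pad with zeros
    up to that group's natural alignment."""
    out = bytearray()
    for size_name, values in fields:
        n = SIZES[size_name]
        out += bytes(-len(out) % n)          # zero filler for any alignment gap
        for v in values:
            out += (v & ((1 << (8 * n)) - 1)).to_bytes(n, "little")
    return bytes(out)

# Three ID bytes, then longs: one zero byte is inserted so the longs start at offset 4.
record = layout([("byte", list(b"A15")), ("long", [1 << 15, 0])])
print(len(record), record[:4])   # 12 b'A15\x00'
```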

-Phil

Leon · 2009-05-25 08:41

I use the m4 macro-processor for things like that.

Leon

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Amateur radio callsign: G1HSM
Suzuki SV1000S motorcycle

Phil Pilgrim (PhiPi) · 2009-05-25 08:41

Ale, Leon,

Yup, a preprocessor might assist, although more fundamental changes to the language itself are called for in this case, I think. I've been lobbying for preprocessor hooks in the IDE for some time now...

-Phil

hippy · 2009-05-25 11:44

Phil Pilgrim (PhiPi) said...
I've been lobbying for preprocessor hooks in the IDE for some time now...

Seems like forever. LMM coding within PropTool would be so much easier with macros. Whether macros and other preprocessing is done within the PropTool itself or through a hook to allow pre-processors to be called doesn't really bother me ( the latter is most flexible ).

Using external tools with the PropTool is currently a horrible kludge at best, frustrating and prone to error. I know there are third party alternatives but that's not really the point.

Adding some sort of macros would be my priority to improving PropTool, followed by conditional compilation.

If one has regional library code for products ( say PAL and NTSC TV ) you either need separate PAL and NTSC drivers so twice the effort to maintain, or a single run-time configurable version which adds unnecessary overhead if only one or the other is actually required. As separate objects need different file names, all objects which use those also need to be unique objects and so it goes, an ever expanding list of variants and a maintenance nightmare in the making. To make life easier one is forced to go with single source and run-time configuration.

Mike Huselton · 2009-05-25 15:09

Aside from ownership issues from Parallax, what do you think of reverse-engineering the Parallax Propeller IDE to replace some hot key with a hook to an external preprocessor that lives in the Propeller directory? A bidirectional Windows Messaging scheme would perhaps work. What is the language that the IDE is written in? I know that the Sendkeys method to invoke a preprocessor kinda works. Just an idle thought...

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
JMH

Post Edited (James Michael Huselton) : 5/25/2009 3:16:06 PM GMT

BradC · 2009-05-25 15:12

James Michael Huselton said...
Aside from ownership issues from Parallax, what do you think of reverse-engineering the Parallax Propeller IDE to replace some hot key with a hook to an external preprocessor that lives in the Propeller directory? What is the language that the IDE is written in?

It's written in Borland Delphi and can be uncompressed and examined with a disassembler (not that I'd know). Go your hardest..

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
"VOOM"?!? Mate, this bird wouldn't "voom" if you put four million volts through it! 'E's bleedin' demised!

ericball · 2009-05-25 16:27

Phil Pilgrim (PhiPi) said...

Sure, I could do:

testsuite     long      @id0,1 << 15, 0, 5000, 0, 0, 1000, 2000
              '...
id0           byte      "A15 Test", 0

Just a warning, this code won't work. It might compile correctly, but the value of @id0 compiled into the DAT section probably won't match the runtime value.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Composite NTSC sprite driver: http://forums.parallax.com/showthread.php?p=800114
NTSC color bars (template): http://forums.parallax.com/showthread.php?p=803904

Phil Pilgrim (PhiPi) · 2009-05-25 16:32

ericball said...
Just a warning, this code won't work. It might compile correctly, but the value of @id0 compiled into the DAT section, probably won't match the runtime value.

Yes, I pointed that out in my post. What gets compiled is an object-relative address that needs further manipulation at runtime. But that's okay. I wouldn't use that form, though, since it makes the table unreadable, but I would use its equivalent if strings could be embedded in DAT tables and allocated automatically like they are in Spin.
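The "further manipulation" amounts to adding the object's base address to the stored offset before using it as a pointer (in Spin, as far as I know, this is what the @@ operator is for). A toy Python model of that fix-up follows; the memory layout and the load address $0010 are invented for the example:

```python
# Toy model: the object's DAT image, with the table's first long holding the
# string's offset from the start of the object (all the compiler can know).
image = bytearray(4) + b"A15 Test\0"
image[0:4] = (4).to_bytes(4, "little")        # offset of id0 within the object

def read_id(hub: bytes, object_base: int) -> str:
    """Add the object's load address to the stored offset, then read up to the 0."""
    offset = int.from_bytes(hub[object_base:object_base + 4], "little")
    start = object_base + offset
    end = hub.index(0, start)
    return hub[start:end].decode("ascii")

hub = bytes(0x10) + bytes(image)              # pretend the object was loaded at $0010
print(read_id(hub, 0x10))                     # A15 Test
```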

-Phil

Post Edited (Phil Pilgrim (PhiPi)) : 5/25/2009 5:01:26 PM GMT

Beau Schwabe · 2009-05-25 17:23

Phil Pilgrim,

You could do something like this...

testsuite     long
              byte "A15 "
              long 1 << 15, 0, 5000, 0, 0, 1000, 2000

... This way you could still access all the data from 'testsuite' in long variable format

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Beau Schwabe

IC Layout Engineer
Parallax, Inc.

hippy · 2009-05-25 17:33

James Michael Huselton said...
Aside from ownership issues from Parallax, what do you think of reverse-engineering the Parallax Propeller IDE to replace some hot key with a hook to an external preprocessor that lives in the Propeller directory?

I recall someone did something like that. Still a kludge though unless you can achieve totally transparent operation, and that has some tricky issues: edited objects not the same on disk as within the IDE, making sure all objects refer to the correct post-processed filenames.

It's easy enough to generate a new IDE window with post-processed contents of another, but when you find you've spent ages editing the wrong file, and lost all the edits on a further compile cycle, it'll probably get uninstalled quicker than it got added.

Mike Huselton · 2009-05-25 17:41

Hippy, I know that this is a hack. I don't want to implement it this way, either.

Maybe the solution is to fly out to Parallax and spend a few days. Don't leave the office until someone codes the hooks. Carry plenty of cash for bail.

Mike Huselton · 2009-05-25 17:55

Maybe Parallax could enter in a contract agreement with just one trusted individual to make the coding changes and return the code to Parallax. Mark the changes in the code to make it easy for Parallax to read. I've done this in many situations with big software vendors and it works. I had a coding convention for annotation: (ADD), (CHANGE) and (DELETE) with datestamp and JMH as my initials.

Parallax could only release the code for the Menu section. Makes it tough to compile and test, though.

Post Edited (James Michael Huselton) : 5/25/2009 6:00:40 PM GMT

Phil Pilgrim (PhiPi) · 2009-05-25 18:00

Beau,

Yeah, I know. That's what MagIO2 suggested, too, and I included something similar to that format in my original post. My objective, though, is to get each entire record on one line, since the table is going to be rather large.

I guess my underlying agenda here is to prod Jeff and Chip out of Prop2 work long enough to do some necessary (as I see it) backfilling to the current software. This would encompass (most important first):

1. Subdirectory structure for OBJects.
2. Built-in preprocessor hooks: http://forums.parallax.com/showthread.php?p=706620
3. More and better inline data constructors, particularly for strings and lists (possible with a preprocessor, too).
4. Object pointers.
5. More expressive address dereferencing (possible with a preprocessor, too).
6. Constant folding (possible with a preprocessor, too, but better if done natively).

I don't mean to malign an otherwise excellent piece of software. It's great as far as it goes, and I know its authors are busy, busy, busy. But in my mind, and after three years of public availability, it's still not finished.

-Phil

Beau Schwabe · 2009-05-25 20:35

Phil Pilgrim,

Seems as though there could be a modifier similar to the 'string' function that would allow the ascii.

For a one line approach to what you want, you could do this...

testsuite     long "A"<<24|"1"<<16|"5"<<8, 1 << 15, 0, 5000, 0, 0, 1000, 2000

... where "A"<<24|"1"<<16|"5"<<8 could be parsed from a future version of the IDE.

Phil Pilgrim (PhiPi) · 2009-05-25 21:14

Yeah, that would work in a pinch, I suppose. I think you meant "A"|"1"<<8|"5"<<16 (little endian), though, right?
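A quick way to see the difference between the two expressions is to lay each value out as a little-endian long and look at the bytes in memory order. The check below is in Python and only illustrates the byte order; the variable names are labels for the two posts' expressions:

```python
import struct

beau = ord("A") << 24 | ord("1") << 16 | ord("5") << 8
phil = ord("A") | ord("1") << 8 | ord("5") << 16

# "<I" = little-endian unsigned 32-bit, the byte order the Propeller uses for longs
print(struct.pack("<I", beau))   # b'\x0051A'  (characters reversed in memory)
print(struct.pack("<I", phil))   # b'A15\x00'  (reads as "A15" followed by a zero)
```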

Actually, I'm more resigned to:

testsuite     byte      "A15 "
              long             1 << 15, 0, 5000, 0, 0, 1000, 2000

It's going to scroll off the screen, but I think it's my most readable option for now.

-Phil

kwinn · 2009-05-25 21:15

Wouldn't a syntax more in line with calling and assignment be better?

Lname long (Wname word $FF, (Bname byte "A", 7))

Parsing this should be straightforward, and it is consistent with what we already do in other areas.

Phil Pilgrim (PhiPi) · 2009-05-25 21:57

The following constant constructors would be an invaluable addition to Spin:

1. @str0("ABC"): Same as current string, which allocates a zero-terminated string and places its address in the code, but do it for DAT section, too.

2. @nstr("ABC"): Same idea, but allocates a string whose first character is the byte count, and do it for DAT section, too.

3. @str("ABC", 13): Same idea, but more general-purpose, with no count and no terminator. User is responsible for terminating (e.g. with a CR).

4. str0("ABC"): Places "ABC", 0 in situ in a DAT section, using multiple elements and zero-padding as necessary to match type and force alignment.

5. nstr("ABC"): Places 3, "ABC" in situ in a DAT section, using multiple elements and zero-padding as necessary to match type and force alignment.

6. str("ABC"): Places "ABC" in situ in a DAT section, using multiple elements and zero-padding as necessary to match type and force alignment.

7. bit(0, 3, 16, 31) or |<(0, 3, 16, 31): Yields the value 1 << 0 | 1 << 3 | 1 << 16 | 1 << 31 or |<0 | |<3 | |<16 | |<31 more compactly and more readably.

Array constructors, too:

8. @byte and @nbyte: Same as @str and @nstr, but allowing variable and expression elements.

9. @word, @nword, @long, @nlong: You get the idea. Array constructors are very useful for implementing variable-length parameter lists within method calls.

The whole idea here is to improve program readability by expressing data structures inline where they're used, rather than having to address something defined elsewhere.
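To make a few of the proposed constructors concrete, here is a rough Python model of what items 4, 5 and 7 would produce. None of these exist in Spin; the function names follow the list above, and the `unit` parameter (the element size in bytes used for padding) is an assumption added for the sketch:

```python
from functools import reduce

def bit(*positions: int) -> int:
    """Item 7: OR together 1 << p for each listed bit position."""
    return reduce(lambda acc, p: acc | (1 << p), positions, 0)

def str0(text: str, unit: int = 1) -> bytes:
    """Item 4: the characters plus a 0, zero-padded to a multiple of `unit` bytes."""
    data = text.encode("ascii") + b"\0"
    return data + bytes(-len(data) % unit)

def nstr(text: str, unit: int = 1) -> bytes:
    """Item 5: a length byte followed by the characters, padded the same way."""
    data = bytes([len(text)]) + text.encode("ascii")
    return data + bytes(-len(data) % unit)

print(hex(bit(0, 3, 16, 31)))            # 0x80010009
print(str0("ABC", 4), nstr("ABC", 4))    # b'ABC\x00' b'\x03ABC'
```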

-Phil

Dr_Acula · 2009-05-26 00:03

I've been down this road too back in the early 1990s. Get annoyed. Get angry. Think of things that need changing. Write a compiler from scratch LOL.

Thinking in several languages here, normally I'd keep the declares separate
Declare byte variables
Declare word variables
Declare long variables
Declare string variables
Code...

But mixing types as lines of data is indeed a different proposition. You might be able to do everything in hex, and just start thinking in hex. Do it all as bytes and
data FF,01,02,C0 etc
where you know that the first one was a byte, the second and third ones are actually parts of a word, the fourth one is a byte etc. Then big numbers like longs at least are readable as a hex number.

Re strings, I've just been writing some string routines in assembly for the Z80. CP/M function call convention terminates with a $ but I agree that zero termination is better. In addition to the ones mentioned, I've also coded mid(), left(), copy(), hex() and a few others. If you define strings at the beginning with space for 32 characters, then use what you need and put a 0 at the end, there is space for most strings. But it uses a bit of memory that way. On a Z80 there is 64k of ram so plenty of space for strings that use up to 32 bytes per string. MBASIC used a more memory efficient way of packing strings, but it did need to go off and tidy things up from time to time, resulting in long pauses at random intervals in programs. So it is a tradeoff of space vs speed.

I've been using vb.net a bit, so named all my strings with vb.net conventions, ie every string routine starts with 'strings_', eg strings_mid(), strings_left(), strings_copy().

It took about half a day to write all these routines and they are proving extremely useful and about 1000x faster than string functions in MBASIC. There would be no reason string functions could not be written in PASM. Maybe someone already has done this?

Phil Pilgrim (PhiPi) · 2009-05-26 00:16

Dr_Acula said...
Maybe someone already has done this?

I have (in Spin, though): http://forums.parallax.com/showthread.php?p=587930. But this deals more with strings at execution time once they've been defined and allocated. My biggest concern here is the expressiveness of the language itself which, through certain improvements, would allow things to be defined inline where they're used, rather than remotely via explicit pointers. Granted, this is something that could be accomplished with an appropriately-written preprocessor; but Spin, as yet, doesn't even have hooks for that.

-Phil

Ale · 2009-05-26 05:07

OT:

Dr_Acula said...
Re strings, I've just been writing some string routines in assembly for the Z80. CP/M function call convention terminates with a $

I always wondered where the $ to terminate DOS strings came from. The ones who say that it is not a total rip-off of CP/M may have to give the FCB a second look.

More flexible compilers are the way to go; macros would be useful.

Dr_Acula · 2009-05-26 10:46

DOS function calls: http://faculty.lacitycollege.edu/colantrs/ct46/labrefs/c46dosfuncs/c46dosfuncs.html

There are some missing ones in DOS eg 3H and 4H hmm, they correspond to obsolete equipment eg paper tape reader. But looking at the rest:
CPM 0H Terminate program
DOS 0H Terminate program
CPM 01H Read console - returns an ascii character
DOS 01H Read keyboard
CPM 02H Write to console
DOS 02H Write to console
CPM 06H Direct console input/output
DOS 06H Direct console read write
CPM 09H Print string (ends with $)
DOS 09H Print string (ends with $)
etc etc

DOS adds some more, eg subdirectories, that were added later after CP/M had lost popularity. But the core of DOS is CP/M.
With the basics of strings in the actual operating system, it is easy to add more assembly code to add higher level string functions. The same concept could be applied to PASM - have some simple input output routines, and add a library of strings etc. Maybe this is a bit esoteric, but sometimes high level languages are not quite fast enough.

Re defining things inline, assembly can be a bit of a pain like that. But it does help to have similar conventions for all function calls. Eg all the above CP/M calls pass a two byte variable in register DE and data comes back in register A. So after a while you don't have to think so much about defining things inline. The code becomes 'define registers, call function, store data, define registers etc'

The other point about writing string functions in PASM is you don't have to wait for SPIN to be rewritten. Or write a new SPIN compiler...
