Compiler frustrations
Phil Pilgrim (PhiPi)
I have a simple task: construct a table in the DAT section, each entry of which contains a fixed or variable-length ID string, followed by several predefined longs. 'Sounds easy, right?
First with the fixed-length strings:
testsuite long "A15 ", 1 << 15, 0, 5000, 0, 0, 1000, 2000
Well, that's a bad idea: each character in the string "A15 " takes up its own long.
How about:
testsuite long byte "A15 ", 1 << 15, 0, 5000, 0, 0, 1000, 2000
Nope. The compiler flags byte and reports an error: "Size override must be larger." Why, for cripes sake? This seems like a perfectly logical use of an override to me.
Okay, then, how about:
testsuite byte "A15 ", long 1 << 15, 0, 5000, 0, 0, 1000, 2000
No again. The long only applies to the "1 << 15"; the other arguments are truncated to bytes.
Well, then (grasping at straws):
testsuite byte "A15 ", long (1 << 15, 0, 5000, 0, 0, 1000, 2000)
Nice idea, but no: overrides can't be grouped.
This works, of course:
testsuite byte "A15 ", long 1 << 15, long 0, long 5000, long 0, long 0, long 1000, long 2000
But it's gonna be a big table. Who would want to type all those "long"s? Besides, I might leave one out and screw everything up.
This works, too:
testsuite byte "A15 "
          long 1 << 15, 0, 5000, 0, 0, 1000, 2000
But the table not only won't be as readable, it'll be twice as long on the screen.
You know what I'd like? A string constant type that packs four characters into a long, padding with zeroes, if necessary: e.g. pack("ABC") == $00434241.
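Until the compiler grows something like pack(), the packed value can be worked out off-line and pasted into the table as a hex constant. A minimal Python sketch follows, assuming little-endian byte order (first character in the low byte, as in the $00434241 example) and ASCII-only IDs. The names pack and spin_hex are hypothetical helpers, not anything the Propeller Tool provides.

```python
def pack(text: str, width: int = 4) -> int:
    """Pack up to `width` ASCII characters into one little-endian integer.

    Unused high bytes are zero-filled, so pack("ABC") == 0x00434241.
    """
    data = text.encode("ascii")
    if len(data) > width:
        raise ValueError(f"{text!r} does not fit in {width} bytes")
    # Pad on the right with zero bytes; in little-endian order those
    # padding bytes end up as the high-order bytes of the result.
    return int.from_bytes(data.ljust(width, b"\0"), "little")


def spin_hex(value: int, width: int = 4) -> str:
    """Format a value as a Spin-style hex literal such as $00434241."""
    return f"${value:0{width * 2}X}"


if __name__ == "__main__":
    print(spin_hex(pack("ABC")))    # the example from the post
    print(spin_hex(pack("A15 ")))   # the ID used in the table
```

The printed constant could then stand in for the ID at the head of a single "long" line, at the cost of the ID no longer being readable in the source.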
Okay, enough with fixed-length strings. Variable-length strings would be even nicer:
testsuite long string("A15 Test"),1 << 15, 0, 5000, 0, 0, 1000, 2000
Sorry. The compiler flags "string" and says "Expected a constant, unary operator or '('." I thought strings were constants. Why can't it stuff a zero-terminated string somewhere like it does in Spin code and poke its address into the table? Of course, the rules for handling @variable items in DAT sections would have to apply, since the absolute address isn't known at compile time, but so what?
Sure, I could do:
testsuite long @id0,1 << 15, 0, 5000, 0, 0, 1000, 2000
'...
id0 byte "A15 Test", 0
But why should I have to? Besides, it ruins the readability of the table.
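For what it's worth, the hoisting the compiler refuses to do can be done by a small script run before compiling. The Python sketch below is only an illustration under stated assumptions: it takes table lines that use string("...") and rewrites them into the @label form, collecting the zero-terminated strings after the table. The function name hoist_strings and the str_ label prefix are made up, string literals are assumed to contain no embedded double quotes, and the usual caveat about @ addresses in DAT sections still applies to the output.

```python
import re

# Matches string("...") where the literal has no embedded double quote.
STRING_CALL = re.compile(r'string\("([^"]*)"\)')


def hoist_strings(lines, prefix="str_"):
    """Replace each string("...") with @label and append the label's data.

    Returns the rewritten table lines followed by one
    'label byte "...", 0' line per hoisted string.
    """
    table, pool = [], []
    for line in lines:
        def replace(match):
            label = f"{prefix}{len(pool)}"
            pool.append(f'{label} byte "{match.group(1)}", 0')
            return f"@{label}"

        table.append(STRING_CALL.sub(replace, line))
    return table + pool


if __name__ == "__main__":
    source = ['testsuite long string("A15 Test"),1 << 15, 0, 5000, 0, 0, 1000, 2000']
    for out in hoist_strings(source):
        print(out)
```

The table stays readable in the file you edit; only the generated file carries the label clutter.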
This all makes me very grumpy. I love programming, and I don't like feeling grumpy when I'm doing something I love. Grrr!
-Phil
Post Edited (Phil Pilgrim (PhiPi)) : 5/25/2009 7:22:07 AM GMT
Comments
-Phil
testsuite long
          byte "A15 "
          long 1<<15, 0, 5000, 0, 0, 1000, 2000
? This would let you access the string as a long, since the label itself is declared as long. (I can't test it at the moment.)
PS: I see .. this is the last version you mentioned .. couldn't see the trees because of the wood ;o)
Guess your problem here is that you switched to byte and did not switch back to long. 1<<15 of course is too big for a byte.
But you are right ... at least having macros would have helped here to write more readable code.
Post Edited (MagIO2) : 5/25/2009 8:38:05 AM GMT
I know what you mean. I do not think any compiler (asm compiler) will accept that, mixed long/byte constants like what you want. With a macro assembler I would solve the problem using a macro.
But, if you make a small <insert preferred scripting language here> to convert your input to the desired output, I know it is two steps but if the table is just one table with little changes... It is what I end up doing normally :-(, at least for pasm. binutils' as on the other hand supports a nice preprocessor like cpp. Now that could be an idea, did you try with cpp as preprocessor ? It may work...
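As a rough idea of what such a conversion script could look like, here is a Python sketch. It assumes a made-up compact input format of one record per line (a label, a quoted ID, then comma-separated values) and expands it into the form with a "long" override on every value. The name expand_record and the input format are assumptions for illustration only, and values are assumed to contain no commas of their own.

```python
import re

# label, quoted ID, then everything after the first comma
RECORD = re.compile(r'^(\w+)\s+("[^"]*")\s*,\s*(.+)$')


def expand_record(line: str) -> str:
    """Turn 'label "ID", v1, v2, ...' into a DAT line with explicit longs."""
    match = RECORD.match(line.strip())
    if match is None:
        raise ValueError(f"not a table record: {line!r}")
    label, ident, rest = match.groups()
    values = [value.strip() for value in rest.split(",")]
    longs = ", ".join(f"long {value}" for value in values)
    return f"{label} byte {ident}, {longs}"


if __name__ == "__main__":
    print(expand_record('testsuite "A15 ", 1 << 15, 0, 5000, 0, 0, 1000, 2000'))
```

Because the script writes every "long", there is no chance of leaving one out by hand.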
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Visit the home of pPropQL, pPropQL020 and OMU for the pPropQL/020 at omnibus.uni-freiburg.de/~rp92
Here is what I would like to see:
1. A constructor to pack character strings in situ into words and longs.
2. Automatic handling of strings in the DAT section.
3. Ability to override large data types with small ones and back again. There's no reason the compiler can't use a zero filler for any alignment gaps that result.
-Phil
Leon
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Amateur radio callsign: G1HSM
Suzuki SV1000S motorcycle
Yup, a preprocessor might assist, although more fundamental changes to the language itself are called for in this case, I think. I've been lobbying for preprocessor hooks in the IDE for some time now...
-Phil
Seems like forever. LMM coding within PropTool would be so much easier with macros. Whether macros and other preprocessing is done within the PropTool itself or through a hook to allow pre-processors to be called doesn't really bother me ( the latter is most flexible ).
Using external tools with the PropTool is currently a horrible kludge at best, frustrating and prone to error. I know there are third party alternatives but that's not really the point.
Adding some sort of macros would be my priority to improving PropTool, followed by conditional compilation.
If one has regional library code for products ( say PAL and NTSC TV ) you either need separate PAL and NTSC drivers so twice the effort to maintain, or a single run-time configurable version which adds unnecessary overhead if only one or the other is actually required. As separate objects need different file names, all objects which use those also need to be unique objects and so it goes, an ever expanding list of variants and a maintenance nightmare in the making. To make life easier one is forced to go with single source and run-time configuration.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
JMH
Post Edited (James Michael Huselton) : 5/25/2009 3:16:06 PM GMT
It's written in Borland Delphi and can be uncompressed and examined with a disassembler (not that I'd know). Go your hardest..
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
"VOOM"?!? Mate, this bird wouldn't "voom" if you put four million volts through it! 'E's bleedin' demised!
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Composite NTSC sprite driver: http://forums.parallax.com/showthread.php?p=800114
NTSC color bars (template): http://forums.parallax.com/showthread.php?p=803904
-Phil
Post Edited (Phil Pilgrim (PhiPi)) : 5/25/2009 5:01:26 PM GMT
You could do something like this...
... This way you could still access all the data from 'testsuite' in long variable format
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Beau Schwabe
IC Layout Engineer
Parallax, Inc.
I recall someone did something like that. Still a kludge though unless you can achieve totally transparent operation, and that has some tricky issues; edited objects not same on disk as within the IDE, making sure all objects refer to correct post-processed filenames.
It's easy enough to generate a new IDE window with post-processed contents of another, but when you find you've spent ages editing the wrong file, and lost all the edits on a further compile cycle, it'll probably get uninstalled quicker than it got added.
Maybe the solution is to fly out to Parallax and spend a few days. Don't leave the office until someone codes the hooks. Carry plenty of cash for bail.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
JMH
Parallax could only release the code for the Menu section. Makes it tough to compile and test, though.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
JMH
Post Edited (James Michael Huselton) : 5/25/2009 6:00:40 PM GMT
Yeah, I know. That's what MagIO2 suggested, too, and I included something similar to that format in my original post. My objective, though, is to get each entire record on one line, since the table is going to be rather large.
I guess my underlying agenda here is to prod Jeff and Chip out of Prop2 work long enough to do some necessary (as I see it) backfilling to the current software. This would encompass (most important first):
1. Subdirectory structure for OBJects.
2. Built-in preprocessor hooks: http://forums.parallax.com/showthread.php?p=706620
3. More and better inline data constructors, particularly for strings and lists (possible with a preprocessor, too).
4. Object pointers.
5. More expressive address dereferencing (possible with a preprocessor, too).
6. Constant folding (possible with a preprocessor, too, but better if done natively).
I don't mean to malign an otherwise excellent piece of software. It's great as far as it goes, and I know its authors are busy, busy, busy. But in my mind, and after three years of public availability, it's still not finished.
-Phil
Seems as though there could be a modifier similar to the 'string' function that would allow the ascii.
For a one line approach to what you want, you could do this...
... where "A"<<24|"1"<<16|"5"<<8 could be parsed from a future version of the IDE.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Beau Schwabe
IC Layout Engineer
Parallax, Inc.
Actually, I'm more resigned to:
It's going to scroll off the screen, but I think it's my most readable option for now.
-Phil
Lname long (Wname word $FF, (Bname byte "A", 7))
Parsing this should be straight forward, and it is consistent with what we already do in other areas.
1. @str0("ABC"): Same as current string, which allocates a zero-terminated string and places its address in the code, but do it for DAT section, too.
2. @nstr("ABC"): Same idea, but allocates a string whose first character is the byte count, and do it for DAT section, too.
3. @str("ABC", 13): Same idea, but more general-purpose, with no count and no terminator. User is responsible for terminating (e.g. with a CR).
4. str0("ABC"): Places "ABC", 0 in situ in a DAT section, using multiple elements and zero-padding as necessary to match type and force alignment.
5. nstr("ABC"): Places 3, "ABC" in situ in a DAT section, using multiple elements and zero-padding as necessary to match type and force alignment.
6. str("ABC"): Places "ABC" in situ in a DAT section, using multiple elements and zero-padding as necessary to match type and force alignment.
7. bit(0, 3, 16, 31) or |<(0, 3, 16, 31): Yields the value 1 << 0 | 1 << 3 | 1 << 16 | 1 << 31 or |<0 | |<3 | |<16 | |<31 more compactly and more readably.
Array constructors, too:
8. @byte and @nbyte: Same as @str and @nstr, but allowing variable and expression elements.
9. @word, @nword, @long, @nlong: You get the idea. Array constructors are very useful for implementing variable-length parameter lists within method calls.
The whole idea here is to improve program readability by expressing data structures inline where they're used, rather than having to address something defined elsewhere.
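To make the in-situ constructors (items 4 to 7) concrete, here is a Python sketch of the bytes each would emit. It assumes long (4-byte) alignment by default and ASCII text. The function str_raw stands in for the proposed str(), renamed only because str is a Python builtin. None of this exists in the compiler; it just models the proposal.

```python
def _pad(data: bytes, align: int) -> bytes:
    """Zero-fill `data` up to the next multiple of `align` bytes."""
    return data + b"\0" * (-len(data) % align)


def str0(text: str, align: int = 4) -> bytes:
    """Proposed str0(): the characters, a zero terminator, then padding."""
    return _pad(text.encode("ascii") + b"\0", align)


def nstr(text: str, align: int = 4) -> bytes:
    """Proposed nstr(): a length byte, the characters, then padding."""
    body = text.encode("ascii")
    if len(body) > 255:
        raise ValueError("length does not fit in one byte")
    return _pad(bytes([len(body)]) + body, align)


def str_raw(text: str, align: int = 4) -> bytes:
    """Proposed str(): the characters only, no count or terminator, padded."""
    return _pad(text.encode("ascii"), align)


def bit(*positions: int) -> int:
    """Proposed bit(): OR together 1 << p for each bit position given."""
    mask = 0
    for position in positions:
        if not 0 <= position <= 31:
            raise ValueError(f"bit position {position} is outside a long")
        mask |= 1 << position
    return mask


if __name__ == "__main__":
    print(str0("ABC"), nstr("ABC"), str_raw("ABC"))
    print(f"${bit(0, 3, 16, 31):08X}")
```

Note that with 4-byte alignment str0("ABC") and str_raw("ABC") come out the same, since the pad byte doubles as a terminator; they differ once the text is exactly four characters long.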
-Phil
Thinking in several languages here, normally I'd keep the declares separate
Declare byte variables
Declare word variables
Declare long variables
Declare string variables
Code...
But mixing types as lines of data is indeed a different proposition. You might be able to do everything in hex, and just start thinking in hex. Do it all as bytes and
data FF,01,02,C0 etc
where you know that the first one was a byte, the second and third ones are actually parts of a word, the fourth one is a byte etc. Then big numbers like longs at least are readable as a hex number.
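As a quick check of how such an all-bytes record reads back, here is a Python sketch using the standard struct module. It assumes little-endian byte order (low byte first, as on both the Z80 and the Propeller) and the byte, word, byte layout described above. The name decode_record is just for illustration.

```python
import struct


def decode_record(raw: bytes):
    """Split a 4-byte record into (byte, word, byte), little-endian.

    '<' selects little-endian with no alignment padding,
    'B' is an unsigned byte and 'H' an unsigned 16-bit word.
    """
    return struct.unpack("<BHB", raw)


if __name__ == "__main__":
    first, word, last = decode_record(bytes.fromhex("FF0102C0"))
    print(f"{first:02X} {word:04X} {last:02X}")
```

The word comes back as $0201 rather than $0102, which is the part you have to keep in your head when writing the table byte by byte.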
Re strings, I've just been writing some string routines in assembly for the Z80. CP/M function call convention terminates with a $ but I agree that zero termination is better. In addition to the ones mentioned, I've also coded mid(), left(), copy() hex() and a few others. If you define strings at the beginning with space for 32 characters, then use what you need and put a 0 at the end, there is space for most strings. But it uses a bit of memory that way. On a Z80 there is 64k of ram so plenty of space for strings that use up to 32 bytes per string. MBASIC used a more memory efficient way of packing strings, but it did need to go off and tidy things up from time to time, resulting in long pauses at random intervals in programs. So it is a tradeoff of space vs speed.
I've been using vb.net a bit, so named all my strings with vb.net conventions, ie every string routine starts with 'strings_', eg strings_mid(), strings_left(), strings_copy().
It took about half a day to write all these routines and they are proving extremely useful and about 1000x faster than string functions in MBASIC. There would be no reason string functions could not be written in PASM. Maybe someone already has done this?
-Phil
I always wondered where the $ to terminate DOS string came from. The ones who say that it is not a total rip-off of CP/M may have to give fcb a second look
More flexible compilers is the way to go, macros would be useful.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Visit the home of pPropQL, pPropQL020 and OMU for the pPropQL/020 at omnibus.uni-freiburg.de/~rp92
There are some missing ones in DOS eg 3H and 4H hmm, they correspond to obsolete equipment eg paper tape reader. But looking at the rest:
CPM 0H. Terminate program
DOS 0H- terminate program
CPM 01H Read console - returns an ascii character
DOS 01H Read keyboard
CPM 02H Write to console
DOS 02H Write to console
CPM 06H Direct console input/output.
DOS 06H Direct console read write
CPM 09H Print string (ends with $)
DOS 09H Print string (ends with $)
etc etc
DOS adds some more, eg subdirectories, that were added later after CP/M had lost popularity. But the core of DOS is CP/M.
With the basics of strings in the actual operating system, it is easy to add more assembly code to add higher level string functions. The same concept could be applied to PASM - have some simple input output routines, and add a library of strings etc. Maybe this is a bit esoteric, but sometimes high level languages are not quite fast enough.
Re defining things inline, assembly can be a bit of a pain like that. But it does help to have similar conventions for all function calls. Eg all the above CP/M calls pass a two byte variable in register DE and data comes back in register A. So after a while you don't have to think so much about defining things inline. The code becomes 'define registers, call function, store data, define registers etc'
The other point about writing string functions in PASM is you don't have to wait for SPIN to be rewritten. Or write a new SPIN compiler...