Aligned characters in the DAT section: some notable oddities
Phil Pilgrim (PhiPi)
Monday musings...
In a program's DAT section, if I enter
word long "abc", "def"
the assembler produces: "a", 0, 0, 0, "b", 0, "c", 0, "d", 0, "e", 0, "f", 0. I find it odd that the default alignment gets applied to each character rather than each group of characters, and odder still that the size modifier gets applied only to the first character of a group. It appears that the notational cohesion apparent in "abc" extends no further than the source code.
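To make the reported layout concrete, here is a minimal Python sketch that models the behavior described above. It is not the Propeller assembler; the `layout` helper and its argument format are hypothetical, and it assumes little-endian storage and a starting address that is already aligned.

```python
# Model of the layout described in the post (NOT the real assembler).
# Assumptions: little-endian storage, output starts at an aligned address.
SIZES = {"byte": 1, "word": 2, "long": 4}

def layout(default, groups):
    """groups is a list of (size_override_or_None, string) pairs."""
    out = bytearray()
    for override, text in groups:
        for i, ch in enumerate(text):
            # As reported, the override reaches only the first character
            # of a group; every other character uses the line's default size.
            if override is not None and i == 0:
                size = SIZES[override]
            else:
                size = SIZES[default]
            out += ord(ch).to_bytes(size, "little")
    return bytes(out)

# Models: word long "abc", "def"
result = layout("word", [("long", "abc"), (None, "def")])
print(list(result))
```

The 14 bytes it yields match the sequence quoted above: a long for "a", then a word for each remaining character.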
I wonder if this was intended, as it leads to a possible problem. Suppose I want a sequence of characters to begin in memory on a long boundary. I try this:
long byte "abcd"
The assembler flags the byte modifier with an error: "Size override must be larger." Hmm. Catch 22. I could certainly throw in a long 0 ahead of my string to align it, but that's awkward and wasteful. I could also make sure to declare my string after any other long declaration. But what if I had a whole bunch of strings whose first characters I want aligned on long boundaries? Must I then count the characters and pad them out by hand?
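For a sense of what padding by hand involves, the arithmetic can be sketched in Python. The `pad_to_long` helper is hypothetical and only illustrates the counting; it assumes plain ASCII strings and that the first string already starts on a long boundary.

```python
def pad_to_long(text: str) -> bytes:
    """Return the string's bytes plus enough zero bytes to reach a multiple of 4."""
    raw = text.encode("ascii")
    padding = -len(raw) % 4      # 0..3 zero bytes needed
    return raw + bytes(padding)

strings = ["abc", "abcd", "hello"]
table = b"".join(pad_to_long(s) for s in strings)

# Offsets at which each string starts inside the table.
offsets = []
position = 0
for s in strings:
    offsets.append(position)
    position += len(pad_to_long(s))
print(offsets)  # every offset is a multiple of 4
```

Every string needs its own count, and editing a string changes the padding it requires, which is why doing this manually gets tedious.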
The solution, as I've discovered, is to place an empty long ahead of each string, thus:
long
byte "abc"
long
byte "efg"
A bit awkward, but it works. Nonetheless, wouldn't the following notation have been more concise (but for the first behavior noted above)?
long byte "abc", "efg"
Granted, treating strings cohesively would have required a way to include non-ASCII bytes within a group (which is what escape sequences are for). The string() notation might also have worked, but it generates an extra character and doesn't seem to be allowed in the DAT section anyway...
-Phil
Comments
You just have to explicitly pack the characters as in:
long "a" | "b"<<8 | "c"<<16 | "d"<<24, "e" | "f"<<8 | 0 << 16
This is inconvenient, but gets the job done.
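As a quick check of the packing arithmetic, the same expression can be evaluated in Python, where shifts also bind tighter than bitwise OR. This sketch assumes little-endian long storage and uses `struct` only to show the resulting byte order; it says nothing about how the assembler evaluates the expression internally.

```python
import struct

# Same expressions as the two longs above, with ord() standing in for
# the character constants.
first = ord("a") | ord("b") << 8 | ord("c") << 16 | ord("d") << 24
second = ord("e") | ord("f") << 8 | 0 << 16

# Two unsigned 32-bit values, stored little-endian.
packed = struct.pack("<2I", first, second)
print(hex(first), packed)
```

Stored low byte first, the two longs read back as "abcdef" followed by two zero bytes, so the characters sit contiguously in memory and the first one falls on a long boundary.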
I believe the current behavior is due to a string being treated as a list of character values, so "abc" is the same as "a","b","c" and the rest of the behavior follows from that.
The error message is confusing and most likely wrong.
As to the philosophy of "string denotations", especially in the context of LOOKUP, LOOKDOWN, and CASE, there is an extensive discussion in another thread, based mainly on Hippy's postings. It arrives at the same point Mike makes above.
Edit: I noticed that the statements I deleted above were just nonsense.
Post Edited (deSilva) : 1/29/2008 1:41:02 AM GMT