IMHO we don't have to cater for mechanical teletypes any more.
This should have been sorted years ago, but there were commercial interests at play. I see MS has finally updated Notepad to read and write *nix and Mac files too.
Personally, I think we should have ditched <lf> years ago, along with autocorrect on phones - they always substitute something wrong when you're not looking.
CR and LF are perfectly valid control characters with very specific actions, so it is not their fault; don't blame them. It's the fault of all those who have misused them over the decades and left the mess behind. If I'm constantly monitoring a value in TAQOZ, I need it to reuse the same line and overprint the old value rather than scrolling off the screen like crazy and making it hard to see the current value. If all terminals automatically went to a new line on a CR, how then could I keep the display on the same line?
Take this TAQOZ test code here
0 BEGIN CR DUP . 8 SPACES 1+ KEY UNTIL
This will continue to increment a value and display the result on the same line of the terminal. Just what I want and need.
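For anyone who doesn't read TAQOZ, here is a rough Python sketch of the character stream that loop sends to the terminal. The function name `overprint_stream` and its parameters are made up for illustration, and it stops after a fixed count instead of waiting for a key.

```python
def overprint_stream(start, count, pad=8):
    """Build the characters a CR-based overprint loop would emit.

    Hypothetical helper: it mirrors the idea of the TAQOZ one-liner
    (CR, print value, 8 spaces, increment), not its exact output format.
    """
    pieces = []
    for value in range(start, start + count):
        pieces.append("\r")        # CR: back to column 0, stay on the same line
        pieces.append(str(value))  # print the current value over the old one
        pieces.append(" " * pad)   # spaces wipe any leftovers from the old text
    return "".join(pieces)
```

Writing the result to a terminal that treats CR as a plain carriage return leaves only the last value visible on one line. No LF is ever sent, so nothing scrolls.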
Peter, the forth of 'C' is so strong that people do not want to understand the difference between CR and LF, even though most of them use IDEs that help with code indentation by positioning the cursor in the same column on the next line when one hits Enter.
Not at the beginning of the line. Very, very simple text-editors do just that.
But if \n means CRLF, one cannot position the cursor on the next line without changing the horizontal position.
And if \r means CRLF, one cannot position the cursor at the beginning of a line without changing the vertical position.
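To make the distinction concrete, here is a minimal cursor model in Python that keeps CR and LF as two separate actions. The name `move_cursor` is invented for this sketch, and it simplifies by counting every other character, tabs included, as one column.

```python
def move_cursor(text, row=0, col=0):
    """Track the cursor through text with CR and LF as independent moves."""
    for ch in text:
        if ch == "\r":
            col = 0    # carriage return: start of line, row unchanged
        elif ch == "\n":
            row += 1   # line feed: one line down, column unchanged
        else:
            col += 1   # anything else advances one column (simplification)
    return row, col
```

With this model `move_cursor("abc\n")` ends at row 1, column 3, and `move_cursor("abc\r")` ends at row 0, column 0. Only the pair "\r\n" lands at the start of the next line.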
The C language is very clear about the difference between carriage return and line feed. It has escape sequences for both: "\r" and "\n".
It has escape sequences for tab "\t", backspace "\b", etc., and for any other 8-bit character you like, e.g. "\x4F".
These are among the first things you learn when learning C.
Now, what your operating system, or whatever you send those strings to, does with those characters is a whole other story and no end of chaos. What do you need to emit to get a new line? How much space is a TAB? And so on.
And the cause of much confusion and error, as is the topic of this thread.
If only ASCII had a NEW LINE character, that meant "get me to the start of the next line", that would have saved a mass of confusion between Unix/Linux, DOS/Windows and Mac.
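In practice the three conventions (LF on Unix/Linux, CRLF on DOS/Windows, bare CR on classic Mac) have to be folded into one before text can be processed. A small Python sketch of that, where `normalize_newlines` is just an illustrative name:

```python
def normalize_newlines(text, newline="\n"):
    """Convert CRLF, bare CR and bare LF line ends to a single convention."""
    # CRLF must be handled first, otherwise it would count as two line ends.
    unified = text.replace("\r\n", "\n").replace("\r", "\n")
    return unified.replace("\n", newline)
```

Passing `newline="\r\n"` converts the other way. Note that this deliberately destroys any bare CR that was meant as an overprint, which is exactly the ambiguity being complained about here.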
I consider the TAB to be a mixed blessing. I like the way Propeller Tool handles it: spacing over by the amount I specify and not putting TAB characters in the text. Sometimes I download or copy/paste programs that mix TABs and spaces. If my TAB settings are not those of the original author, it's a total mess.
OTOH, if text that I download contains clean tabbing, I like to set the TABs to two spaces, regardless of what the original author thought was appropriate. If the author's text has replaced TABs with four spaces, I'm not happy. Fortunately, UltraEdit lets me convert leading blocks of four spaces to TABs, then I can reset tabbing to two spaces and convert back to spaces -- or just leave it at TABs.
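For those without UltraEdit, the same leading-whitespace conversion can be sketched in a few lines of Python. This is not how UltraEdit does it; `reindent` and its parameters are hypothetical, and it goes straight from one space width to another without the intermediate TAB step.

```python
def reindent(line, old_width=4, new_width=2):
    """Rescale leading-space indentation from old_width to new_width per level."""
    stripped = line.lstrip(" ")            # only spaces; leading tabs are left alone
    leading = len(line) - len(stripped)
    levels, remainder = divmod(leading, old_width)
    # Leftover spaces that do not fill a whole level are kept as they are.
    return " " * (levels * new_width + remainder) + stripped
```

Only the leading run of spaces is touched, so alignment spaces inside a line survive. Apply it line by line to a whole file.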
One of the most annoying things is when a developer makes a little change to some code but, whilst they are there, converts tabs to spaces everywhere or vice versa, or generally messes with the formatting to make it look just how they like. Then they check it in to the source repository.
Great, now when you do a diff on it to see what they did you can't see the actual change from all the whitespace change noise.
Yes, you can ignore whitespace changes in a diff but it's annoying. Besides in languages like Spin and Python whitespace changes might be important.
Maybe I am wrong - happened before - and it is NOT the 'C' crowd that is responsible. Then it is the '*nix' crowd simply IGNORING the two different escape sequences AVAILABLE in the 'C' language and simply using \n as CRLF even if it is just a LF.
--- ducking for cover ---
We could solve both problems by declaring \t as LineTerminator and NewLine, thus giving \r and \n their original meaning and removing TABs altogether.
Nothing stops us from RL encoding strings in our own print string routines, which makes sense in embedded code. Once you have files, though, you have file storage, which these days is millions of times larger than any text file anyway.
btw, one advantage of RL encoding text files in embedded systems is that it can cut down on read times, that is, as long as we are talking about RLEs in terms of kBs.
I'm watching, Chip! I just gave away a teletype that was the operator's workplace for one of the first graphic displays (vector). The early WordStar text editor made use of CR by allowing you to process a line twice and combine graphic characters from different letters. It was fun to use.
And I remember the time when the "ctrl" key on the IBM keyboard was renamed to "strg" in Germany. It took years to educate people that "strg" is not an abbreviation of "string" (which would make perfect sense in an alternative world) but of the German word "Steuerung", which means "control". But to control is "steuern", which in turn leads to "mit Steuern kann man steuern", meaning "you can control more simply by taxes than by laws". But we know that pee too means the latest in microprocessors!
Maybe I am wrong - happened before - and it is NOT the 'C' crowd that is responsible. Then it is the '*nix' crowd simply IGNORING the two different escape sequences AVAILABLE in the 'C' language and simply using \n as CRLF even if it is just a LF.
Again, wrong. Read my earlier post. \n is used as end of line marker in Unix text files, it has nothing to do with the actual output to terminal or whatever. To detect the end of line marker and transform that to whatever (which could be something completely different from CRLF!) is the job of the output driver, not the text editor.
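A toy Python version of that split between file format and output driver might look like the following. `LineDevice` is a made-up class, not any real driver API; Unix terminals do the equivalent translation in the tty layer (the ONLCR output flag).

```python
import io


class LineDevice:
    """Hypothetical output driver: maps the stored end-of-line marker to device codes."""

    def __init__(self, sink, eol):
        self.sink = sink  # any object with a write() method
        self.eol = eol    # what this particular device needs for "new line"

    def write(self, text):
        # The file only says "end of line"; the driver decides what that means here.
        self.sink.write(text.replace("\n", self.eol))


# The same stored text sent to two different "devices".
terminal = io.StringIO()
LineDevice(terminal, "\r\n").write("one\ntwo\n")
```

The text file stays the same whichever device it goes to. Only the `eol` argument changes, and it does not have to be CRLF at all.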
The only reason this is confusing is because MS hardcoded the actual final output into their text file format (presumably taken from CP/M, because one thing is certain -- MS never did much thinking themselves).
Comments
Mike
The TAB of course should never have existed.
I can't imagine what IDEs have to do with this.
-Phil
simple,
Mike
Interesting observation. I'm sure a lot of storage resources have been saved, too.
I really like the idea of run-length compression for spaces and characters:
$1E + CHR + $20..$7F = repeat CHR 1..96 times
$1F + $20..$7F = repeat SPACE 1..96 times
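Here is a quick Python sketch of an encoder and decoder for that two-code scheme. Two things in it are my assumptions rather than part of the proposal: the count byte is taken as $1F plus the run length (so $20 means 1 and $7F means 96), and the input is assumed never to contain the bytes $1E or $1F itself.

```python
RPT_CHAR = 0x1E   # $1E + CHR + count: repeat CHR
RPT_SPACE = 0x1F  # $1F + count: repeat SPACE
MAX_RUN = 96      # count bytes $20..$7F cover runs of 1..96


def rle_encode(data: bytes) -> bytes:
    out = bytearray()
    i = 0
    while i < len(data):
        ch = data[i]
        run = 1
        while i + run < len(data) and data[i + run] == ch and run < MAX_RUN:
            run += 1
        if ch == 0x20 and run >= 3:      # 2-byte code only pays off from 3 spaces
            out += bytes([RPT_SPACE, 0x1F + run])
        elif ch != 0x20 and run >= 4:    # 3-byte code only pays off from 4 repeats
            out += bytes([RPT_CHAR, ch, 0x1F + run])
        else:
            out += data[i:i + run]       # short runs are copied through unchanged
        i += run
    return bytes(out)


def rle_decode(data: bytes) -> bytes:
    out = bytearray()
    i = 0
    while i < len(data):
        b = data[i]
        if b == RPT_SPACE:
            out += b" " * (data[i + 1] - 0x1F)
            i += 2
        elif b == RPT_CHAR:
            out += bytes([data[i + 1]]) * (data[i + 2] - 0x1F)
            i += 3
        else:
            out.append(b)
            i += 1
    return bytes(out)
```

Runs longer than 96 simply come out as several codes in a row. Since $1E and $1F sit in the ASCII control range, ordinary source text should never collide with them, but a real format would still need an escape for that case.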
And I love mono-space fonts for programming. Variable-space fonts are horrible for code.
We're going to MAKE 1980 GREAT AGAIN!!! (ErNa?)
But ALWAYS replacing tabs with spaces when entering code and when loading/saving code would clean up at least one of the messes.
Sometimes I need to agree with Heater.: the TAB has to die in source files. The TAB key can still do the indentation thing, just with spaces.
I think arc or lha or lharc did that simple run-length compression...
Mike
I totally agree, unless you're still programming a TRS80.
-Phil
Chars from 128 to 191 are block graphics; 192 to 255 are multispaces, from length 0 upwards.
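Going only by that description, expanding the multispace codes is a few lines of Python. This is a sketch with a made-up function name; it takes code 192 as zero spaces and 255 as 63 spaces, and passes the block graphics codes through untouched.

```python
def expand_multispaces(data: bytes) -> bytes:
    """Expand codes 192..255 into runs of 0..63 spaces; copy everything else."""
    out = bytearray()
    for b in data:
        if b >= 192:
            out += b" " * (b - 192)  # 192 -> 0 spaces, 193 -> 1 space, ...
        else:
            out.append(b)            # plain text and block graphics (128..191)
    return bytes(out)
```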