CR vs CRLF Dilemma

Comments

  • Cluso99 wrote: »
    IMHO we don't have to cater for mechanical teletypes any more.

    This should have been sorted years ago, but there were commercial interests at play. I see MS has finally updated NotePad to read and write *nix and Mac files too.

    Personally, we should have ditched <lf> years ago, along with autocorrect on phones - they always substitute something wrong when you're not looking ;)

    CR and LF are perfectly valid control characters with very specific actions, so it is not their fault; don't blame them. It's the fault of all those who have misused them over these decades and left the mess behind. If I'm constantly monitoring a value in TAQOZ, I need it to reuse the same line and overprint the old value rather than scrolling off the screen like crazy and making it hard to see the current value. If all terminals automatically went to a new line on a CR, how then could I keep the display on the same line?
    Take this TAQOZ test code here:
    0 BEGIN CR DUP . 8 SPACES 1+ KEY UNTIL
    
    This will continue to increment a value and display the result on the same line of the terminal. Just what I want and need.
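
For readers without a TAQOZ prompt handy, here is a rough Python analogue of the loop above. It is only a sketch: `show_counter` and its parameters are hypothetical names, it stops after a fixed count rather than on a keypress, and it assumes a terminal that treats CR as "return to column 0" without advancing a line.

```python
import io
import sys


def show_counter(limit, stream=sys.stdout):
    """Print 0..limit-1 on one terminal line, overprinting the previous value."""
    for value in range(limit):
        # CR moves to column 0; the trailing spaces wipe leftovers of a longer old value.
        stream.write("\r" + str(value) + " " * 8)
        stream.flush()
```

Calling `show_counter(100000)` in a terminal should show a single line counting upward. Swapping the `"\r"` for `"\n"` gives the scrolling behaviour the post complains about.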


    Tachyon Forth - compact, fast, forthwright and interactive
    useforthlogo-s.png
    --->CLICK THE LOGO for more links<---
    Latest binary V5.4 includes EASYFILE +++++ Tachyon Forth News Blog
    P2 SHORTFORM DATASHEET +++++ TAQOZ documentation
    Brisbane, Australia
  • Peter, the forth of 'C' is so strong that people do not want to understand the difference between CR and LF, even though most of them use IDEs that help with code indentation by placing the cursor in the same column on the next line when one hits Enter.

    Not at the beginning of the line. Very, very simple text-editors do just that.

    But if \n means CRLF, one cannot position the cursor on the next line without changing the horizontal position.
    And if \r means CRLF, one cannot position the cursor at the beginning of a line without changing the vertical position.
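
The point about independent horizontal and vertical movement can be illustrated with a toy terminal model. This is a sketch under stated assumptions, not any real terminal's behaviour: `render` is a made-up helper, the screen is unbounded, and only CR and LF are treated as control characters.

```python
def render(text):
    """Return the rows of a model screen where CR and LF act independently."""
    rows = [[]]
    row = col = 0
    for ch in text:
        if ch == "\r":
            col = 0                      # carriage return: column only
        elif ch == "\n":
            row += 1                     # line feed: row only, column kept
            if row == len(rows):
                rows.append([])
        else:
            line = rows[row]
            while len(line) <= col:      # pad with blanks up to the cursor
                line.append(" ")
            line[col] = ch               # overprint whatever was there
            col += 1
    return ["".join(r) for r in rows]
```

In this model `render("abc\ndef")` gives `["abc", "   def"]` (the staircase), `render("abc\r\ndef")` gives `["abc", "def"]`, and `render("12345\r67")` gives `["67345"]` (overprinting on the same line).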

    Mike
    I am just another Code Monkey.
    A determined coder can write COBOL programs in any language. -- Author unknown.
    Press any key to continue, any other key to quit

    The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this post are to be interpreted as described in RFC 2119.
  • How so?

    The C language is very clear about the difference between carriage return and line feed. It has escape sequences for both: "\r" and "\n".

    It has escape sequences for tab "\t", backspace "\b", etc., and for any other 8-bit character you like, e.g. "\x4F".

    These are among the first things you learn when learning C.

    Now, what your operating system, or whatever you send those strings to, does with those characters is a whole other story and no end of chaos. What do you need to emit to get a new line? How much space is a TAB? And so on.

    And the cause of much confusion and error, as is the topic of this thread.

    If only ASCII had a NEW LINE character, that meant "get me to the start of the next line", that would have saved a mass of confusion between Unix/Linux, DOS/Windows and Mac.
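
As a minimal sketch of what tools end up doing about this in practice, the function below folds the three common conventions (CRLF, lone CR, lone LF) into a single chosen line ending. The name `normalize_newlines` is made up for illustration.

```python
import re

# CRLF must be tried before the lone CR or LF, or it would be counted as two line ends.
_EOL = re.compile(r"\r\n|\r|\n")


def normalize_newlines(text, eol="\n"):
    """Replace every CRLF, CR, or LF line ending in text with eol."""
    return _EOL.sub(lambda match: eol, text)
```

For example, `normalize_newlines("dos\r\nmac\runix\n")` returns `"dos\nmac\nunix\n"`.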

    The TAB of course should never have existed.

    I can't imagine what IDEs have to do with this.



  • heater wrote:
    The TAB of course should never have existed.
    I consider the TAB to be a mixed blessing. I like the way Propeller Tool handles it: spacing over by the amount I specify and not putting TAB characters in the text. Sometimes I download or copy/paste programs that mix TABs and spaces. If my TAB settings are not those of the original author, it's a total mess.

    OTOH, if text that I download contains clean tabbing, I like to set the TABs to two spaces, regardless of what the original author thought was appropriate. If the author's text has replaced TABs with four spaces, I'm not happy. Fortunately, UltraEdit lets me convert leading blocks of four spaces to TABs, then I can reset tabbing to two spaces and convert back to spaces -- or just leave it at TABs.
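
The round trip described above (leading blocks of four spaces to TABs, then TABs back to two spaces) can be sketched roughly as follows. This is not how UltraEdit does it; the function names and default widths are hypothetical, and only leading whitespace is touched so that spacing inside a line survives.

```python
def leading_spaces_to_tabs(line, width=4):
    """Turn each full block of `width` leading spaces into one TAB."""
    stripped = line.lstrip(" ")
    indent = len(line) - len(stripped)
    tabs, rest = divmod(indent, width)
    return "\t" * tabs + " " * rest + stripped


def leading_tabs_to_spaces(line, width=2):
    """Turn each leading TAB into `width` spaces."""
    stripped = line.lstrip("\t")
    tabs = len(line) - len(stripped)
    return " " * (tabs * width) + stripped
```

Applying both in sequence to a line indented with eight spaces leaves it indented with four, which is the "reset tabbing to two spaces" step.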

    -Phil
    “Perfection is achieved not when there is nothing more to add, but when there is nothing left to take away.” -Antoine de Saint-Exupery
  • One of the most annoying things is when a developer makes a little change to some code but, whilst they are there, converts tabs to spaces everywhere or vice versa, or generally messes with the formatting to make it look just how they like. Then they check it in to the source repository.

    Great, now when you do a diff on it to see what they did you can't see the actual change from all the whitespace change noise.

    Yes, you can ignore whitespace changes in a diff, but it's annoying. Besides, in languages like Spin and Python, whitespace changes might be important.
  • msrobots Posts: 1,952
    edited May 18
    How so?

    Maybe I am wrong - happened before - and it is NOT the 'C' crowd that is responsible. Then it is the '*nix' crowd simply IGNORING the two different escape sequences AVAILABLE in the 'C' language and simply using \n as CRLF even though it is just a LF.

    --- ducking for cover ---

    We could solve both problems by declaring \t as LineTerminator and NewLine, thus giving \r and \n their original meaning and removing TABs altogether. :smile:

    simple,

    Mike
  • Don't knock it, think how much energy the tab character has saved.
    Formerly known as TonyB
  • cgracey Posts: 9,139
    edited May 18
    TonyB_ wrote:
    Don't knock it, think how much energy the tab character has saved.

    Interesting observation. I'm sure a lot of storage resources have been saved, too.

    I really like the idea of run-length compression for spaces and characters:

    $1E + CHR + $20..$7F = repeat CHR 1..96 times
    $1F + $20..$7F = repeat SPACE 1..96 times
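
A minimal Python sketch of that scheme, under a few assumptions that are not in the post: the count byte maps $20 to 1 repeat and $7F to 96, longer runs are split, short runs are left as plain text when encoding would not save anything, and the input never contains the marker bytes $1E or $1F. The function names are made up.

```python
RUN_CHAR = 0x1E    # $1E + CHR + count byte
RUN_SPACE = 0x1F   # $1F + count byte
COUNT_BASE = 0x1F  # assumed mapping: count byte $20..$7F means 1..96 repeats
MAX_RUN = 96


def rle_encode(data):
    """Encode runs of spaces (3 or more) and of other bytes (4 or more)."""
    out = bytearray()
    i = 0
    while i < len(data):
        ch = data[i]
        if ch in (RUN_CHAR, RUN_SPACE):
            raise ValueError("input must not contain the marker bytes")
        run = 1
        while i + run < len(data) and data[i + run] == ch and run < MAX_RUN:
            run += 1
        if ch == 0x20 and run >= 3:
            out += bytes([RUN_SPACE, COUNT_BASE + run])
        elif ch != 0x20 and run >= 4:
            out += bytes([RUN_CHAR, ch, COUNT_BASE + run])
        else:
            out += bytes([ch]) * run   # too short to be worth encoding
        i += run
    return bytes(out)


def rle_decode(data):
    """Expand the two marker sequences back into plain bytes."""
    out = bytearray()
    i = 0
    while i < len(data):
        b = data[i]
        if b == RUN_SPACE:
            out += b" " * (data[i + 1] - COUNT_BASE)
            i += 2
        elif b == RUN_CHAR:
            out += bytes([data[i + 1]]) * (data[i + 2] - COUNT_BASE)
            i += 3
        else:
            out.append(b)
            i += 1
    return bytes(out)
```

With this mapping, eight leading spaces shrink to the two bytes $1F $27, and a rule of ten dashes shrinks to $1E $2D $29.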

    And I love mono-space fonts for programming. Variable-space fonts are horrible for code.

    We're going to MAKE 1980 GREAT AGAIN!!! (ErNa?)
  • I think RLC is not really needed anymore; with current storage sizes, saving space by encoding the content is unnecessary.

    But ALWAYS replacing TABs with spaces when entering code and when loading/saving code would clean up at least one of the messes.

    Sometimes I need to agree with Heater.: the TAB has to die in source files. The TAB key can still do the indentation thing, just with spaces.

    I think arc or lha or lharc did that simple run-length compression...

    Mike
  • msrobots wrote:
    I think RLC is not really needed anymore; with current storage sizes, saving space by encoding the content is unnecessary.
    +1

    I totally agree, unless you're still programming a TRS80. :)

    -Phil
  • Peter Jakacki Posts: 7,459
    edited May 18
    Nothing stops us from RL encoding strings in our own print string routines, which makes sense in embedded code. Once you have files, though, you have file storage, which these days is millions of times larger than any text file anyway.

    btw, one advantage of RL encoding text files in embedded systems is that it can cut down on read times, at least as long as we are talking about text measured in kBs.

  • Nostalgia attack...
    Chars from 128 to 191 are block graphics, 192 to 255 are multiple spaces, from length 0 upwards.
    ◁ Stay OmmmmmmPtimistic! ▷ ◁ Facebook. ▷ ◁ Google. ▷ ◁ Microsoft. ▷ ◁ No Source – No Go! ▷ ◁ Please help: http://rosettacode.org/wiki/Category:Spin ▷ ◁ Why Asimov's Laws of Robotics Don't Work - Computerphile ▷ ◁ DNA is a four letter word. ▷
  • Nostalgia attack:

    [Image: _FLEXO.GIF]
  • ErNa Posts: 1,090
    I'm watching, Chip! I just gave away a teletype that was the operator's console for one of the first graphic displays (vector). The early WordStar text editor made use of CR by letting you print over a line twice and combine different letters into graphic characters. It was fun to use.
    And I remember the time when the "ctrl" key on the IBM keyboard was renamed to "strg" in Germany. It took years to educate people that "strg" is not an abbreviation of "string" (which would make perfect sense in an alternative world) but of the German word "Steuerung", which means "control". But "to control" is "steuern", which leads to "mit Steuern kann man steuern", meaning "you can control more simply by taxes than by laws". But we know that pee too means the latest in microprocessors!
  • Tor Posts: 1,934
    msrobots wrote:
    How so?

    Maybe I am wrong - happened before - and it is NOT the 'C' crowd that is responsible. Then it is the '*nix' crowd simply IGNORING the two different escape sequences AVAILABLE in the 'C' language and simply using \n as CRLF even though it is just a LF.
    Again, wrong. Read my earlier post. \n is used as end of line marker in Unix text files, it has nothing to do with the actual output to terminal or whatever. To detect the end of line marker and transform that to whatever (which could be something completely different from CRLF!) is the job of the output driver, not the text editor.
    The only reason this is confusing is because MS hardcoded the actual final output into their text file format (presumably taken from CP/M, because one thing is certain -- MS never did much thinking themselves).
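
The division of labour described here (the file stores only \n, and the output side decides what a line end means for the device) can be sketched in a few lines of Python. `TerminalWriter` is a hypothetical class, not any real driver; it assumes a device that needs CR followed by LF to start a new line.

```python
import io


class TerminalWriter:
    """Wrap a stream and translate the stored end-of-line marker on output."""

    def __init__(self, stream, eol="\r\n"):
        self.stream = stream
        self.eol = eol  # what this particular device needs for a new line

    def write(self, text):
        # Only the \n marker is translated; a bare \r passes through untouched,
        # so same-line overprinting still works.
        self.stream.write(text.replace("\n", self.eol))
```

A different device would simply be given a different `eol`, while the text file itself stays the same.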
