
CR vs CRLF Dilemma

Comments

  • If memory serves, we sent LF CR CR from an SDS920 to a model 35 Teletype. Conventional wisdom held that both CR's were needful.
    cgracey wrote: »
    jmg wrote: »
    cgracey wrote: »
    I've thought for a while that a sub-$20 trigger character followed by a $20..$7F space count character would be great for consolidating white space and getting around the tab issue. For example, say $1F (trigger) + $7F (count) would mean 96 spaces ($7F - $20 + 1 = 96). This would shorten up source code files nicely and be easy to handle within editors, etc. The ship has sailed, I know, but some convention like that established a while back would have been great. $7F would have made a good trigger character, too, keeping things out of the murky sub-$20 range.
    Interesting idea - run length coded text files.
    Way, way back when disk storage mattered, and fixed space fonts were the most common, that certainly could have helped.
    These days, file size in source code is a `who cares` and variable width fonts complicate things...

    But I want to go back to 1980.

    All this talk (in another thread?) about what IDE to use is so BLAH!! We've got a whole computer in the P2 by 1980 standards. A really good one. I like the idea of getting off the reservation and making an IDE that is about 16KB and runs on the P2, itself. Just plug in a monitor, keyboard, and mouse. Instant rock-solid development experience with real-time turn-around like no one's ever seen. It would enable super fast development. People don't know what they are missing with the current paradigm.

    Will you be able to design the P3 on this system?
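
The trigger-plus-count idea cgracey describes above can be sketched in a few lines. This is only an illustration, assuming the input never contains the trigger byte itself; the names `pack_spaces`, `unpack_spaces` and the `min_run` threshold are made up for the example and are not from the thread.

```python
TRIGGER = "\x1f"              # trigger character suggested in the post ($1F)
BASE = 0x20                   # count character $20 stands for a run of 1 space
MAX_RUN = 0x7F - BASE + 1     # $7F stands for 96 spaces, as computed in the post


def pack_spaces(text: str, min_run: int = 3) -> str:
    """Replace runs of spaces with TRIGGER + count character (assumes no TRIGGER in text)."""
    out = []
    i = 0
    while i < len(text):
        if text[i] != " ":
            out.append(text[i])
            i += 1
            continue
        j = i
        while j < len(text) and text[j] == " ":
            j += 1
        run = j - i
        i = j
        # Emit one trigger pair per chunk of up to 96 spaces.
        while run >= min_run:
            chunk = min(run, MAX_RUN)
            out.append(TRIGGER + chr(BASE + chunk - 1))
            run -= chunk
        out.append(" " * run)  # short leftovers stay as literal spaces
    return "".join(out)


def unpack_spaces(packed: str) -> str:
    """Expand TRIGGER + count pairs back into spaces."""
    out = []
    i = 0
    while i < len(packed):
        if packed[i] == TRIGGER and i + 1 < len(packed):
            out.append(" " * (ord(packed[i + 1]) - BASE + 1))
            i += 2
        else:
            out.append(packed[i])
            i += 1
    return "".join(out)


line = "x" + " " * 96 + "y"
print(len(line), len(pack_spaces(line)))  # 98 4
```

A real format would also need an escape for a literal trigger character, which this sketch leaves out.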
  • potatohead Posts: 9,196
    edited May 16
    Nope. Or, maybe more accurately, not in a time that makes any sense.

    But it's no different from that time, in that there were bigger machines. For many things, they are not needed. Same story here.

    But... a friend and I were talking about this system. We did computer music back in the day. Was rough. The Apple had one bit beeps. Could get a chord, but it's all very painful.

    Atari machines had 4 voices, but mostly out of tune. I learned some arranging and composition on that one. Make it fit. :dizzy:

    C64 had the awesome SID chip! Hints at just what was to come. It was accurate. Sometimes I would input sheet music. Had learned sight reading for vocal performance by that time, but it was good for exploring different parts. I could learn them, then direct a choir.

    A P2 doing something like this: https://en.wikipedia.org/wiki/Spem_in_alium#Renditions

    !!

    Yeah, take me back in style. :D

    Do not taunt Happy Fun Ball! @opengeekorg ---> Be Excellent To One Another SKYPE = acuity_doug
    Parallax colors simplified: http://forums.parallax.com/showthread.php?123709-Commented-Graphics_Demo.spin
  • kwinn Posts: 7,873
    AJL wrote: »
    Heater. wrote: »
    msrobots,
    it is not the time taken for the commands, but different meaning.
    Let's just say that I don't believe so.

    Good old electro-mechanical teletypes were marvels of electric motors, relays, cogs and wheels. All synchronized to the incoming "start bit" on the serial input characters.

    As such, I imagine that a carriage return could be done in one character time. A line feed could be done in another.

    Doing both, NEW LINE, in one character time may not have been possible to engineer.

    As you quite rightly point out "...some extra pressure was needed to add a LF..." with the old typewriters.

    In this situation, your abstract meanings are of no consequence. What matters is the signal you have to feed into that mass of motors, relays, cogs, and wheels to get it to do what you want.

    Given that the carriage return acts on the carriage mechanism, and the line feed acts on the platen, I can't see any mechanical reason for the two actions to require decoupling. As ITA2 did not contain a "new line" code it is impossible to test with teletype hardware without modifying the decoder; Good luck finding someone with hardware who is willing to experiment on your behalf.

    I think conventional inertia led us to this situation, not mechanical concerns.

    The early teletypes did require two separate actions, for mechanical reasons. They had a single synchronous motor that provided the mechanical power to perform all of the movements to print a character, perform a carriage return, or a line feed. The RPM of that motor also determined the bit timing, so there were constraints on how much of a load could be placed on it.
    In science there is no authority. There is only experiment.
    Life is unpredictable. Eat dessert first.
  • For RTTY, there were Teletype and Western Union teleprinter adaptations that automatically did a CR when a LF was received (or vice-versa, can't remember which). The idea was that if one of the endline characters were missed, you didn't want the next line writing over the top of the previous line.

    -Phil
    “Perfection is achieved not when there is nothing more to add, but when there is nothing left to take away.” -Antoine de Saint-Exupery
  • potatohead wrote: »
    Nope. Or, maybe more accurately, not in a time that makes any sense.

    But it's no different from that time, in that there were bigger machines. For many things, they are not needed. Same story here.

    But... a friend and I were talking about this system. We did computer music back in the day. Was rough. The Apple had one bit beeps. Could get a chord, but it's all very painful.

    Atari machines had 4 voices, but mostly out of tune. I learned some arranging and composition on that one. Make it fit. :dizzy:

    C64 had the awesome SID chip! Hints at just what was to come. It was accurate. Sometimes I would input sheet music. Had learned sight reading for vocal performance by that time, but it was good for exploring different parts. I could learn them, then direct a choir.

    A P2 doing something like this: https://en.wikipedia.org/wiki/Spem_in_alium#Renditions

    !!

    Yeah, take me back in style. :D
    A music application for the P2 sounds really interesting!

  • AJL Posts: 51
    edited May 16
    kwinn wrote: »
    The early teletypes did require two separate actions, for mechanical reasons. They had a single synchronous motor that provided the mechanical power to perform all of the movements to print a character, perform a carriage return, or a line feed. The RPM of that motor also determined the bit timing, so there were constraints on how much of a load could be placed on it.

    Yet, arguably, a larger motor could have supported the double action if it were deemed necessary; I submit that it wasn't.

    Addit: I suggest this is a classic example of a problem not being solved, because it wasn't perceived.

    Why put in a large enough motor to perform both actions simultaneously, when the code set doesn't have room for a "new line" code?
    Why eliminate another character from the code set to allow addition of a "new line" character when the motor isn't strong enough to perform them both simultaneously, and the two code combination will suffice?

  • Heater. Posts: 20,832
    edited May 16
    Quite so. They were not worried about the semantics of carriage return, line feed, etc. They just wanted to get the teletype to do what was necessary.

    We finally got something that actually means "new line" in one symbol with the Unicode NEL, the HTML hex entity & #x85;

    Without the space between "&" and "#". Stupid forum corrupts my post and displays it as ... if I don't put the space.

    Sadly the UTF8 for that is still two bytes.
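
The two-byte remark is easy to check. A minimal sketch, using only Python's built-in UTF-8 codec:

```python
NEL = "\u0085"                      # Unicode NEXT LINE (NEL)
encoded = NEL.encode("utf-8")

print(encoded.hex())                # c285
print(len(encoded))                 # 2
print(len("\n".encode("utf-8")))    # 1, plain LF stays a single byte
```

Any code point from U+0080 through U+07FF takes two bytes in UTF-8, so NEL cannot be shorter than that.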
  • Phil Pilgrim (PhiPi) Posts: 21,967
    edited May 16
    heater wrote:
    Without the space between "&" and "#". Stupid forum corrupts my post and displays it as ... if I don't put the space.
    That's what it's supposed to do. Hex 85 is the Windows-1252 ellipsis character. It isn't really the forum software that does it, though; it's just vanilla HTML.

    -Phil
  • & amp ; # x 8 5 ;
    strip spaces
    &#x85;
    ◁ Stay OmmmmmmPtimistic! ▷ ◁ Facebook. ▷ ◁ Google. ▷ ◁ Microsoft. ▷ ◁ No Source – No Go! ▷ ◁ Please help: http://rosettacode.org/wiki/Category:Spin ▷ ◁ Why Asimov's Laws of Robotics Don't Work - Computerphile ▷ ◁ DNA is a four letter word. ▷
  • Heater. Posts: 20,832
    edited May 16
    Phil,
    That's what it's supposed to do. Hex 85 is the Windows-1252 ellipsis character. It isn't really the forum software that does it, though; it's just vanilla HTML.
    I have to disagree, because:

    1) Never mind what Hex 85 is. What I entered into my post was a regular string of printable characters. Namely "& # x 8 5 ;"

    Without all the spaces of course. I see no reason for the forum to corrupt that into "..." or anything else.

    2) Who said anything about HTML?

    I'm just writing regular text here. Not composing HTML.

    3) If that NEL entity is displayed as "..." rather than creating a new line, then why?

    Which seems to be the case even here: https://www.fileformat.info/info/unicode/char/0085/index.htm

    Is this yet another reason Unicode is brain damaged?
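
One likely answer to point 3, assuming the browser follows the HTML standard: numeric character references in the range 0x80 to 0x9F are not read as Unicode C1 controls but are remapped through Windows-1252, where 0x85 is the ellipsis. Python's `html.unescape` applies the same rule, so it can serve as a stand-in for the browser in a small sketch:

```python
import html
import unicodedata

shown = html.unescape("&#x85;")     # what an HTML5-style parser produces

print(hex(ord(shown)))              # 0x2026
print(unicodedata.name(shown))      # HORIZONTAL ELLIPSIS

# The same byte value decoded as Windows-1252 gives the same character.
print(b"\x85".decode("cp1252") == shown)   # True
```

So the ellipsis comes from a legacy compatibility rule in HTML parsing, not from Unicode's definition of U+0085.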

  • Phil Pilgrim (PhiPi) Posts: 21,967
    edited May 16
    heater wrote:
    Who said anything about HTML? I'm just writing regular text here. Not composing HTML.
    True. But when that text gets sent to your browser, the browser interprets it as an HTML literal. Check the HTML source, and you will see that what you wrote comes through verbatim. It's your browser that converts it, not the forum software or the server that sends it.

    -Phil
  • Let me try that again: …
  • Well, it's impossible. None of the text in my last post exists in the "view source" of this page. Not even words like "try" or "again". I wonder where it is.

    Anyway, what you are claiming, is that it is OK for normal everyday text to go into the system, and something totally different to come out the other end.

    I find this unacceptable. Computers should not corrupt data.


  • Roy Eltham Posts: 2,295
    edited May 16
    Heater. wrote: »
    Let me try that again: …

    When I view the source for your message Heater, it has the & # x 8 5 ; (minus spaces). Also when I use the quote feature, it shows those chars when I am editing the message. (and when I edit my message the editor shows the chars instead of the ellipsis)
  • Yep, now that I have reloaded the thing a few times the "view source" view actually contains my post above.

    Seems the "view source" in Chrome is not necessarily the source of the page you are actually looking at!

  • heater wrote:
    I find this unacceptable. Computers should not corrupt data.
    Surly to bed, surly to rise.
    :)
    Oh, come on! It's not a corruption. Would you also want bbcodes to be displayed verbatim, rather than performing their formatting functions? Anyway, yeti posted the perfect solution.

    -Phil
  • Well, sure enough there it is in the page source: "Let me try that again: & # x 8 5 ;"

    Without the spaces between & and ; of course.

    What does this mean?

    Clearly my text has not been escaped properly in the forum software. It comes back to my browser as an HTML entity rather than the text it should be.

    How about if I enter &#x85; ?
  • Looks like it works right?

    But what I actually wrote was:

    & a m p ; # x 8 5 ;

    Without all the space of course.

    The same corruption of text on the round trip.
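
The round trip being argued about can be modelled roughly as follows. This is a sketch, not the forum's actual code: `render_as_text` and `browser_shows` are hypothetical names, and `html.unescape` stands in for the browser.

```python
import html


def render_as_text(user_text: str) -> str:
    """Hypothetical server-side step: escape user text before putting it in a page."""
    return html.escape(user_text)


def browser_shows(page_fragment: str) -> str:
    """Stand-in for what a browser displays for a fragment of page text."""
    return html.unescape(page_fragment)


typed = "&#x85;"

# Text passed through unescaped: the browser decodes the entity.
print(browser_shows(typed) == typed)          # False

# Text escaped on output: the reader sees exactly what was typed.
escaped = render_as_text(typed)
print(escaped)                                # &amp;#x85;
print(browser_shows(escaped) == typed)        # True
```

Typing the `&amp;` form by hand, as yeti suggested, does manually what the escaping step would otherwise do.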
  • Heater. Posts: 20,832
    edited May 16
    Gibberish deleted...

  • Heater. Posts: 20,832
    edited May 16
    Phil,
    Oh, come on! It's not a corruption. Would you also want bbcodes to be displayed verbatim, rather than performing their formatting functions?
    Good point.

    Of course markup languages like bbcodes or markdown are useful. But where does it say in the instructions here that bbcode includes HTML?

    And what about this:
        someFunc(x + 8);
    
    That is data corruption.
    Anyway, yeti posted the perfect solution.
    Indeed. As I also found above. But now one is writing the HTML markup language into one's post. Not bbcodes.
  • Cluso99 Posts: 13,770
    edited May 16
    For a newsletter we send out at work, I have to escape out a lot of characters so they format correctly in a browser.

    Things like common ( and ) and $ have to be escaped otherwise they render wrong or disappear. For instance, on an iPhone the ( and ) go MIA.
  • Heater. Posts: 20,832
    edited May 16
    Exactly. The escaping going on in this forum is not totally correct.

    When I type "& # x 8 5 ;", without the spaces, it should come back as "&#x85;"

    Instead of "…"
  • Heater. wrote: »
    Quite so. They were not worried about the semantics of carriage return, line feed, etc. They just wanted to get the teletype to do what was necessary.

    We finally got something that actually means "new line" in one symbol with the Unicode NEL, the HTML hex entity & #x85;

    Without the space between "&" and "#". Stupid forum corrupts my post and displays it as ... if I don't put the space.

    Sadly the UTF8 for that is still two bytes.

    You missed it by about 23 years; EBCDIC, introduced around the same time as 7-bit ASCII (1963), was 8-bit and used 15h for newline, while Unicode was introduced in 1987.
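
The EBCDIC newline and Unicode NEL are directly related, which a codec lookup can show. This sketch assumes Python's built-in `cp037` codec as a representative EBCDIC code page; other EBCDIC variants may map differently.

```python
ebcdic_nl = b"\x15"                       # the 15h newline mentioned above
as_unicode = ebcdic_nl.decode("cp037")

print(hex(ord(as_unicode)))               # 0x85, i.e. U+0085 NEL
print("\n".encode("cp037").hex())         # 25, EBCDIC keeps LF as a separate code
```

In this code page, then, the single-character newline and the line feed are distinct codes, and the former round-trips to NEL.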
  • Phil Pilgrim (PhiPi) Posts: 21,967
    edited May 17
    heater wrote:
    Exactly. The escaping going on in this forum is not totally correct.
    Oh, please! The forum handles this perfectly, by not handling it. If you're not HTML-aware, you shouldn't be posting HTML literals here.

    -Phil
  • Aha, EBCDIC.

    Built ASCII to/from EBCDIC converters for mainframes to communicate ;)
  • Tor Posts: 1,934
    edited May 17
    The text box we write our posts in is *not* supposed to be HTML aware. We are not composing HTML when we write our messages. The forum software is the only entity that produces HTML for our browsers. Not random users writing in a text box. The software adds some specific, limited, and *controlled* formatting tools (bbcodes) for the users. The software is perfectly able to produce HTML that displays the text the way the user wrote it. It certainly isn't supposed to let random HTML slip through, that's simply a bug.
  • Heater. Posts: 20,832
    edited May 17
    Phil,
    Oh, please! The forum handles this perfectly, by not handling it.
    Not so. I seriously hope the forum software is handling it. Websites have to escape all kinds of user input to ensure that malicious Javascript is not injected into the page or SQL queries constructed that are up to no good.

    In fact on one Parallax forum change I was nearly banned when I demonstrated how easy it was to inject malicious code into a post. A security issue caused by the forum not handling it and failing to escape my input strings correctly. Luckily the powers that be decided to fix the issue rather than ban me for pointing it out.

    Example:

    The forum software is correctly (I hope) handling this:

    <script>alert("You are pwned!");</script>

    If those script tags were not escaped correctly that would be a serious security issue.
    If you're not HTML-aware, you shouldn't be posting HTML literals here.
    As pointed out above the text we enter into a post is not supposed to be rendered as HTML, it's supposed to be rendered as the text I type. Unless it includes some specified bbcode.

    What if I'm not HTML aware and I'm not posting HTML literals here? Things that may look like HTML entities turn up in all kinds of other text.

    Example:

    In C one gets the address of a variable with the "&" operator, like so: int *ltp = <

    And poof my post is corrupted!
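
Both of Heater's examples come down to the same output-escaping step. A minimal sketch with Python's `html.escape`; the second string is an assumption about what was originally typed before the forum rendered it as "<", since the thread only shows the corrupted result.

```python
import html

script_post = '<script>alert("You are pwned!");</script>'
c_post = "int *ltp = &lt;"      # assumed original text: address of a variable named lt

for post in (script_post, c_post):
    safe = html.escape(post)
    # Decoding the escaped form gives back exactly what the user typed.
    assert html.unescape(safe) == post
    print(safe)

# &lt;script&gt;alert(&quot;You are pwned!&quot;);&lt;/script&gt;
# int *ltp = &amp;lt;
```

Escaping on output neutralises the script tag and preserves the C snippet in one step, without the author needing to know any HTML.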
  • Heater. wrote: »
    ... I imagine that a carriage return could be done in one character time.

    The initiation of the carriage return, yes.

    But some teletypes (at least the one I used) had a carriage that took more than one character time to return to the home position.

    My understanding is that there was no flow control on some of these units, nor buffering.

    So, in order to not lose information to be printed, a number of nulls would be sent after each line end sequence to allow for the time it took to physically return the carriage.

    I'm sure someone will correct me if I'm wrong :smiley:
    Tulsa, OK

    My OBEX objects:
    AGEL: Another Google Earth Logger
    DHT11 Sensor

    I didn't do it... and I promise not to do it again!
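
The null-padding trick described in the post above can be sketched like this. The timing figures are purely illustrative and are not specifications of any real Teletype; `pad_count` and `end_line` are hypothetical helper names.

```python
import math


def pad_count(return_time_s: float, chars_per_second: float) -> int:
    """Number of fill characters that cover the carriage's travel time."""
    return math.ceil(return_time_s * chars_per_second)


def end_line(line: str, pad_nulls: int) -> bytes:
    """Append CR LF plus NUL padding so no printable character arrives too early."""
    return line.encode("ascii") + b"\r\n" + b"\x00" * pad_nulls


# Illustrative numbers only: 10 characters per second, 0.25 s carriage return.
nulls = pad_count(0.25, 10)
print(nulls)                       # 3
print(end_line("HELLO", nulls))    # b'HELLO\r\n\x00\x00\x00'
```

With no flow control or buffering, the sender has to spend that time on characters the printer ignores, which is all the NULs are doing here.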
  • Phil Pilgrim (PhiPi) Posts: 21,967
    edited May 17
    Heater,

    I see the HTML literals as a convenience — the same as I do the bbcodes — especially when you're trying to display a character that's not on the keyboard, like the ellipsis or, even more especially, unicode characters like the one for the Euro: €.

    Note the use of the mdash in the above paragraph, which I entered as &mdash;.
    What if I'm not HTML aware and I'm not posting HTML literals here? Things that may look like HTML entities turn up in all kinds of other text.
    The same is true if you're not bbcode-aware and are not trying to post bbcode, but do inadvertently. You just need to be aware, that's all. It's not so hard, really.

    -Phil
  • IMHO we don't have to cater for mechanical teletypes any more.

    This should have been sorted years ago, but there were commercial interests at play. I see MS has finally updated Notepad to read and write *nix and Mac files too.

    Personally, we should have ditched <lf> years ago, along with autocorrect on phones - they always substitute something wrong when you're not looking ;)
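
For what it's worth, reading all of the conventions mentioned in this thread is simple in software. A small sketch using Python's `str.splitlines`, which also treats NEL as a line boundary:

```python
mixed = "unix\nwindows\r\nold mac\rnel\x85end"

lines = mixed.splitlines()
print(lines)    # ['unix', 'windows', 'old mac', 'nel', 'end']
```

Note that `str.splitlines` also splits on a few other characters (form feed, vertical tab, the Unicode line and paragraph separators), so it is broader than a strict CR/LF handler.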