Terminal Text Editor on the Propeller in C/C++

Heater. · 2016-04-19 14:55

I have never thought much about writing a text editor; it sounds like one of those horribly messy things I'd rather not deal with. On one project where we needed such a thing we all volunteered another guy to do it.

But I do recall reading something by a guy who claimed to have had an epiphany regarding building an editor when the following idea occurred to him:

1) Keep all the text in some region of RAM reserved for file editing. We'll ignore the problem of editing text that is bigger than will fit in RAM for now.
2) Keep all the text before the cursor position/insertion point at the bottom of this RAM space.
3) Keep all the text after the cursor position as high as possible in the RAM region. You have a hole in the middle.

This means you are always adding to or removing from the end of a string, the "low text", which is easy.

If the user positions the cursor earlier in the line or on previous lines just move the text they moved over up to the "high text" and continue.
Similarly if the user moves the cursor forward, just move text from the high end to the low end and continue.

Possibly this way of managing things is much simpler and takes up less code and data space than managing doubly linked lists or whatever.
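The scheme described above is commonly called a gap buffer. Here is a minimal sketch of it in C++, assuming a fixed-size buffer that never grows. The class name `GapBuffer` and all of its method names are made up for illustration and are not part of PWEdit or PropWare.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Minimal gap buffer sketch: text before the cursor sits at the bottom of
// the array, text after the cursor sits at the top, and the hole is between.
class GapBuffer {
public:
    explicit GapBuffer(std::size_t capacity)
        : m_buffer(capacity), m_gapStart(0), m_gapEnd(capacity) {}

    // Insert at the cursor: append to the end of the "low text".
    bool insert(char c) {
        if (m_gapStart == m_gapEnd)
            return false;  // Buffer is full; growing it is out of scope here
        m_buffer[m_gapStart++] = c;
        return true;
    }

    // Backspace: drop the last character of the "low text".
    bool erase_before() {
        if (m_gapStart == 0)
            return false;
        --m_gapStart;
        return true;
    }

    // Cursor left: move one character from the low text to the high text.
    bool move_left() {
        if (m_gapStart == 0)
            return false;
        --m_gapStart;
        --m_gapEnd;
        m_buffer[m_gapEnd] = m_buffer[m_gapStart];
        return true;
    }

    // Cursor right: move one character from the high text to the low text.
    bool move_right() {
        if (m_gapEnd == m_buffer.size())
            return false;
        m_buffer[m_gapStart] = m_buffer[m_gapEnd];
        ++m_gapStart;
        ++m_gapEnd;
        return true;
    }

    // The whole text with the hole skipped (handy for saving or testing).
    std::string text() const {
        std::string result(m_buffer.begin(), m_buffer.begin() + m_gapStart);
        result.append(m_buffer.begin() + m_gapEnd, m_buffer.end());
        return result;
    }

private:
    std::vector<char> m_buffer;
    std::size_t m_gapStart;  // One past the last character of the low text
    std::size_t m_gapEnd;    // Index of the first character of the high text
};
```

Insertion and backspace touch a single byte. Moving the cursor costs one byte copy per character moved over, which is the trade-off for not keeping any per-line bookkeeping.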

DavidZemon · 2016-04-19 15:10

Heater. wrote: »

I have never thought much about writing a text editor, sounds like one of those horribly messy things I'd rather not deal with. On one project where we needed such a thing we all volunteered another guy to do it

But I do recall reading something by a guy who claimed to have had an epiphany regarding building an editor when the following idea occurred to him:

1) Keep all the text in some region of RAM reserved for file editing. We'll ignore the problem of editing text that is bigger than will fit in RAM for now.
2) Keep all the text before the cursor position/insertion point at the bottom of this RAM space.
3) Keep all the text after the cursor position as high as possible in the RAM region. You have a hole in the middle.

This means you are always adding removing from the end of a string which is the "low text". Which is easy.

If the user positions the cursor earlier in the line or on previous lines just move the text they moved over up to the "high text" and continue.
Similarly if the user moves the cursor forward, just move text from the high end to the low end and continue.

Possibly/maybe the way of managing things is much simpler and takes up less code and data space than managing doubly linked lists or whatever.

Very interesting idea. Doesn't make it very easy to display the file though. That is one thing I like very much about organizing it by line in RAM: displaying the content is now very easy. The above plan would make insertion wonderfully simple... until the cursor is moved. It's also highly efficient in terms of RAM usage. But... I'm not convinced (yet) that it is simpler overall.

Heater. · 2016-04-19 15:56

What's the problem with displaying the file?

For now I'll ignore the problems of lines that are longer than the screen width, line wrapping, etc.

Your cursor is at some position on the screen. On that screen line you display the last line in the low-RAM area, which gets you up to the cursor position. Then you display whatever is at the start of the high-RAM area, up to the line terminator. You can find the start of the last line in low RAM by simply scanning backwards from the top.

Then you continue scanning backwards in low RAM and displaying what you find in the lines above the cursor on the screen. Until the screen is filled to the top.

Also continue scanning forwards in the high-ram and displaying what you find below the cursor on the screen. Until the screen is filled to the bottom.

Of course I may have missed some complications here, and I have not thought about long lines, line wrapping, etc.
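The redraw described above might look roughly like this. This is only a sketch: `visible_lines` is a hypothetical helper, the low and high text are passed in as plain strings rather than as two ends of one RAM region, lines are assumed to be separated by '\n', and long lines are neither wrapped nor clipped. It also assumes `cursorRow` is less than `rows`.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Returns the lines to show, top to bottom, with the cursor's line placed on
// screen row `cursorRow` (0-based). `low` is the text before the cursor and
// `high` is the text after it. Rows with nothing to show stay empty.
std::vector<std::string> visible_lines(const std::string &low, const std::string &high,
                                       std::size_t cursorRow, std::size_t rows) {
    std::vector<std::string> screen(rows);

    // Scan backwards through the low text, one line per screen row,
    // starting at the cursor's row and working up to the top of the screen.
    std::size_t end = low.size();
    std::size_t row = cursorRow;
    while (true) {
        std::size_t start = end;
        while (start > 0 && low[start - 1] != '\n')
            --start;
        screen[row] = low.substr(start, end - start);
        if (start == 0 || row == 0)
            break;
        end = start - 1;  // Step over the '\n' that ends the line above
        --row;
    }

    // Scan forwards through the high text. The first piece completes the
    // cursor's line, the rest fill the rows below it.
    std::size_t begin = 0;
    for (row = cursorRow; row < rows; ++row) {
        std::size_t newline = high.find('\n', begin);
        if (newline == std::string::npos)
            newline = high.size();
        screen[row] += high.substr(begin, newline - begin);
        if (newline == high.size())
            break;
        begin = newline + 1;
    }
    return screen;
}
```

With low text "one\ntwo\nth" and high text "ree\nfour\nfive" on a 4-row screen with the cursor on row 2, the rows come out as "one", "two", "three", "four".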

DavidZemon · 2016-04-19 16:51

Heater. wrote: »

What's the problem with displaying the file?

I'll assume for now to ignore problems of lines that are longer than the screen width, wrapping lines etc...

This is exactly the problem I was imagining. Hard to handle files with lines longer than the screen width. I think that's a rather critical aspect of a text editor, and not worth ignoring, even temporarily.

Heater. · 2016-04-19 17:04

Hmm... presumably as you scan backwards in the low RAM you can find where the preceding line starts and how many characters long it is. Knowing its length, you know how many screen lines it will take up. That tells you where to paint it onto the screen. Keep scanning backwards till you run off the top of the screen.

Displaying the lines after the insertion point just means scanning chars forwards in the high-ram and painting them to the screen until you run off the bottom.

Sounds a bit fiddly to get right but should not be very big in terms of actual code.

This all assumes you are redrawing the screen after every insertion and deletion I guess. Is that a problem?

What have I missed here?
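The "knowing its length you know how many screen lines it will take up" step is a rounded-up division. A small sketch, assuming hard wrapping at the screen width and a non-zero column count (`rows_for_line` is a made-up name):

```cpp
#include <cstddef>

// Number of screen rows a logical line occupies when wrapped at `columns`.
// An empty line still takes up one row. Assumes columns > 0.
std::size_t rows_for_line(std::size_t lineLength, std::size_t columns) {
    if (lineLength == 0)
        return 1;
    return (lineLength + columns - 1) / columns;  // Division rounded up
}
```

When scanning backwards, a line whose last row lands on screen row r would start on row r - rows_for_line(length, columns) + 1, and the scan stops once that goes above the top of the screen.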

DavidZemon · 2016-04-19 17:41

PWEdit's current implementation actually does no line-wrap, but enables scrolling. This feels much more natural to me.

DavidZemon · 2016-04-19 17:46

For reference, here's how PWEdit draws the screen. It takes two parameters: the first line number and the first column number.

void display_file_from (const unsigned int startingLineNumber, const unsigned int startingColumnNumber) {
    auto lineIterator = this->m_lines.cbegin();

    unsigned int i = startingLineNumber;
    while (i--)
        ++lineIterator;

    for (unsigned int row = 1; row <= this->m_rows; ++row) {
        this->print_line_at_row(startingColumnNumber, lineIterator, row);
        ++lineIterator;
    }

    this->m_firstLineDisplayed = startingLineNumber;
}

void print_line_at_row (const unsigned int startingColumnNumber,
                        const std::list<PropWare::StringBuilder *>::const_iterator &lineIterator,
                        unsigned int row) const {
    const uint16_t charactersInLine = (*lineIterator)->get_size();

    this->move_cursor(row, 1);
    unsigned int column;
    for (column = 0; column < this->m_columns && (column + startingColumnNumber) < charactersInLine; ++column)
        *this->m_printer << (*lineIterator)->to_string()[column + startingColumnNumber];

    while (column++ < this->m_columns)
        *this->m_printer << ' ';
}

void move_cursor (const unsigned int row, const unsigned int column) const {
    *this->m_printer << ESCAPE << '[' << row << ';' << column << 'H';
}

I think this makes it pretty easy to display and move around to different parts of the file.

DavidZemon · 2016-04-20 03:09

PWEdit is definitely going down the lines of vim, with separate insert/normal modes. I'd like some way to distinguish between them which does not take up an entire row of the display. The larger the display, the longer redraws take, so it pays to keep the display small. Also, I've been successfully testing PWEdit on my little 4x20 HD44780 screen and it works! How cool!

But it sure wouldn't work very well if 25% of the screen was used up by a status line like in vim.

So: who has an idea for signaling that the user is in "insert" mode without using the bottom row of the terminal?

Heater. · 2016-04-20 07:48

An LED!

Change the cursor somehow.

macca · 2016-04-20 09:42

DavidZemon wrote: »

PWEdit is definitely going down the lines of vim, with separate insert/normal modes. I'd like some way to distinguish between them which does not take up an entire row of the display. The larger the display, the longer redraws take, so it pays to keep the display small. Also, I've been successfully testing PWEdit on my little 4x20 HD44780 screen and it works! How cool But it sure wouldn't work very well if 25% of the screen was used up by a status line like in vim.

So: who has an idea for signaling that the user is in "insert" mode without using the bottom row of the terminal?

Usually in text mode, the cursor shape is used to distinguish insert from overwrite mode: a full-character cursor is overwrite, an underline cursor (a bit thicker than a single line) is insert. I think you can define the shape for that kind of display, right?

Alternatively, a timed indication that overlays the last line for a few seconds could be used to indicate the current state when the mode is toggled: something that displays 'INSERT' or 'OVERWRITE' and then disappears, so you don't waste space.

DavidZemon · 2016-04-21 01:01

Line endings: thoughts on how to handle them? Hardcoding \n would be the easiest/smallest code. A boolean option to the constructor for whether or not to insert \r would also be an efficient option. Auto-detecting could be reasonably easy, depending on how "smart" it was. I think if I were going to auto-detect, I would just look at the first line and nothing else. If the file didn't have any newlines, what would the default be?

Dave Hein · 2016-04-21 01:14

The way I usually handle text files on input is to assume a \n at the end, and remove both \r and \n for internal storage relying only on the NULL terminator. On output, a \r could be added as an option just before the \n if you want to create a DOS style text file.
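As a rough sketch of that approach: strip the terminator on the way in, and put it back on the way out. The function names here are hypothetical, and `std::string` stands in for whatever line type the editor actually uses.

```cpp
#include <string>

// Input side: remove any trailing '\r' and '\n' so the stored line
// carries no terminator at all.
std::string strip_line_ending(std::string line) {
    while (!line.empty() && (line.back() == '\n' || line.back() == '\r'))
        line.pop_back();
    return line;
}

// Output side: always end with '\n', optionally preceded by '\r' to
// produce a DOS-style file.
std::string with_line_ending(const std::string &line, bool insertCarriageReturn) {
    return line + (insertCarriageReturn ? "\r\n" : "\n");
}
```

Because the stored text never contains a terminator, the same internal representation can be written out in either style.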

DavidZemon · 2016-04-21 01:29

Dave Hein wrote: »

The way I usually handle text files on input is to assume a \n at the end, and remove both \r and \n for internal storage relying only on the NULL terminator.

That's exactly what I'm doing. Glad to know I came up with the same solution as you.

Dave Hein wrote: »

On output, a \r could be added as an option just before the \n if you want to create a DOS style text file.

That option is what I'm asking about though. How should I determine whether or not to add the \r before \n?

Electrodude · 2016-04-21 04:04

To allow you to edit bigger files, what if you have a swapfile on an SD card or in EEPROM that you offload edited but unsaved parts of the file into? Keep untouched parts of the file in the source file, edited but unsaved parts that don't fit in RAM in the swapfile, and whatever you're currently working on (plus whatever else fits) in RAM.

You should also read this:

http://www.chiark.greenend.org.uk/~sgtatham/tweak/btree.html

That guy (the author of PuTTY and a lot of other cool things) wrote a really neat hex editor using the data structure described on that page. It can insert, delete, cut, and paste virtually instantaneously on gigabyte-sized binary files. Saving the file still takes O(n) time, obviously, but everything other than saving and searching takes O(log(n)) time. The only reason it isn't the perfect hex editor is because it has an Emacs interface.

As a basic summary, his format is a B-tree, where each branch node knows how many bytes are under it, to allow efficient seeking. Leaf nodes of the tree are either fixed-size buffers that fill from the bottom up and record how much of the buffer is in use, or placeholders indicating unloaded parts of the file (these can be bigger than one block). The buffers are in a doubly linked list, for quickly finding neighbors. If you try inserting into a full buffer, it splits the buffer into two around the insertion point first.
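To illustrate just the "each branch knows how many bytes are under it" part, here is a much simplified sketch of seeking in such a tree. It is not the structure from the linked page: there is no balancing, no placeholder leaves, no linked list of buffers, and `Node`, `make_leaf`, `make_branch` and `byte_at` are all invented names.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

struct Node {
    std::string data;            // Used by leaves only
    std::vector<Node> children;  // Used by branches only
    std::size_t bytes = 0;       // Total bytes stored at or below this node
};

Node make_leaf(const std::string &data) {
    Node node;
    node.data = data;
    node.bytes = data.size();
    return node;
}

Node make_branch(std::vector<Node> children) {
    Node node;
    for (const Node &child : children)
        node.bytes += child.bytes;
    node.children = std::move(children);
    return node;
}

// Find the byte at `offset` by walking down the tree. At each branch, skip
// whole children using their byte counts instead of reading their contents.
char byte_at(const Node &root, std::size_t offset) {
    if (offset >= root.bytes)
        return '\0';  // Out of range
    const Node *current = &root;
    while (!current->children.empty()) {
        for (const Node &child : current->children) {
            if (offset < child.bytes) {
                current = &child;
                break;
            }
            offset -= child.bytes;
        }
    }
    return current->data[offset];
}
```

The number of nodes visited depends on the depth of the tree rather than on the size of the file, which is where the O(log(n)) seek comes from once the tree is kept balanced.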

Dave Hein · 2016-04-21 12:41

DavidZemon wrote: »

That option is what I'm asking about though. How should I determine whether or not to add the \r before \n?

The DOS format is \r\n. I don't think anybody uses the reverse order, though there are probably some oddball cases of it. Personally, I wouldn't bother with supporting \n\r.

@Electrodude, I started working on an editor that used a swap file about a year ago, but I never finished it. It uses an 8K cache in RAM, and keeps the rest of the text file in the swap file. When a file is opened the contents of the file are converted to a doubly-linked list and written to the swap file. Only the portion of the file that is currently being edited is kept in RAM. The file is saved by converting the doubly-linked list swap file back to a normal text file terminated with newlines.

DavidZemon · 2016-04-21 12:57

Dave Hein wrote: »

DavidZemon wrote: »

That option is what I'm asking about though. How should I determine whether or not to add the \r before \n?

The DOS format is \r\n. I don't think anybody uses the reverse order, though there are probably some oddball cases of it. Personally, I wouldn't bother with supporting \n\r.

Sorry, I said that in a poor way. I simply meant whether or not I should add \r at all. The \n will be hardcoded in as the last character. So I'd have something like

if (this->insertCarriageReturn) {
  line << '\r';
}
line << '\n';

But the question is how should I set the value "this->insertCarriageReturn"?

macca · 2016-04-21 13:55

DavidZemon wrote: »
Sorry, I said that in a poor way. I simply meant whether or not I should add \r at all. The \n will be hardcoded in as the last character. So I'd have something like
if (this->insertCarriageReturn) {
  line << '\r';
line << '\n';
But the question is how should I set the value "this->insertCarriageReturn"?

I don't understand the problem. You can just use \n without \r. Windows users may have some complaints reading the file with Notepad, but every other editor should be fine with that. Or add a boolean to the class constructor, or to the function that saves the file. I don't see the problem.

DavidZemon · 2016-04-21 14:17

macca wrote: »
DavidZemon wrote: »
Sorry, I said that in a poor way. I simply meant whether or not I should add \r at all. The \n will be hardcoded in as the last character. So I'd have something like
if (this->insertCarriageReturn) {
  line << '\r';
line << '\n';
But the question is how should I set the value "this->insertCarriageReturn"?
I don't understand the problem. You can just use \n without \r, windows users may have some complaints reading the file with nodepad, but every other editor should be fine with that. Or, add a boolean to the class constructor, or to the function that saves the file, I don't see the problem.

It's not a "problem" per se. Just a question of what folks would like. I know I personally do everything without \r, but I also live on Linux where that is never a problem. I also know a lot of (most?) users on this forum are Windows users. So your vote goes to adding a boolean to the constructor? This would make it non-adjustable at runtime. That's probably okay, but I'd love to hear other opinions on here (because obviously we can't all agree, that'd be against forum policy or something).

DavidZemon · 2016-04-21 14:25

macca wrote: »

DavidZemon wrote: »

PWEdit is definitely going down the lines of vim, with separate insert/normal modes. I'd like some way to distinguish between them which does not take up an entire row of the display. The larger the display, the longer redraws take, so it pays to keep the display small. Also, I've been successfully testing PWEdit on my little 4x20 HD44780 screen and it works! How cool But it sure wouldn't work very well if 25% of the screen was used up by a status line like in vim.

So: who has an idea for signaling that the user is in "insert" mode without using the bottom row of the terminal?

Alternatively, a timed indication that overlays the last line for few seconds may be used to indicate the current state when the mode is toggled, something that display 'INSERT' or 'OVERWRITE' then disappear so you don't waste space.

Been thinking more about this and I like the idea of the last line displaying the current mode whenever the mode is switched and holding that display until the first key is pressed after the mode switch, at which point the last line redraws. Eventually, I can add a shortcut such as "ctrl m" to display the current mode again until the next key press.

I can't think of any good way to use a timer for this that doesn't have one of the following drawbacks: RTC required, extra cog required, or the 53-second counter rollover. I'm not okay with any of those significant drawbacks for such a simple requirement.

macca · 2016-04-21 14:32

DavidZemon wrote: »

It's not a "problem" per se. Just a question what folks would like. I know I personally do everything without \r, but I also live on Linux where that is never a problem. I also know a lot of (most?) users on this forum are Windows users. So your vote goes to adding a boolean to the constructor? This would make it non-adjustable at runtime. That's probably okay - but I'd love to hear other opinions on here (because obviously we can't all agree, that'd be against forum policy or something)

I see, in that case maybe a key sequence to toggle the line termination, with a default/initial setting in the constructor. Back in the BBS era the full screen editors used a sequence like ^k + <other key> to access advanced options, maybe ^k + ^l toggles the line termination.

You can also try to auto-detect the terminator and default setting on file loading, if not empty of course.
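Auto-detection that looks only at the first line, as suggested earlier in the thread, could be as small as this. It is a sketch, with `detect_carriage_return` as a made-up name and the whole file content assumed to be available as one string; the caller supplies the default for files with no newline at all.

```cpp
#include <cstddef>
#include <string>

// Returns true if the first line ending in `content` is "\r\n" and false if
// it is a bare "\n". With no newline at all, `defaultValue` is returned.
bool detect_carriage_return(const std::string &content, bool defaultValue) {
    const std::size_t newline = content.find('\n');
    if (newline == std::string::npos)
        return defaultValue;
    return newline > 0 && content[newline - 1] == '\r';
}
```

The result could seed the initial setting, with the key sequence still available to toggle it afterwards.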

DavidZemon · 2016-06-29 02:45

Life is slowly starting to go back to normal again. I'm back at work on this text editor a little every night. I decided to try mocking some of the PropWare classes and then running PWEdit locally where I'd have the luxury of a faster development cycle and a real debugger. That failed horribly. The terminal used by CLion is not accepting backspace (0x08), but rather just showing a nonsense character. It's also ignoring my attempt to disable echo on std::cin, though at least that part works successfully in Ubuntu's standard terminal emulator. And my "cursor" (a # that is moved around) doesn't actually move when I press the asdw keys, despite PWEdit registering the keys perfectly well and doing the right thing behind the scenes. I'll have to put a bit more work into this later... the idea of desktop development sure is appealing.

DavidZemon · 2016-06-29 02:51

Escape sequences aren't printing correctly in CLion's run terminal either:

Booooo

msrobots · 2016-06-30 03:15

Visual Studio checks line endings while loading a file. If they are consistent it just switches to that mode; if not, the user gets asked whether they want to adjust the line endings to be consistent.

The options are

CRLF or \r\n (Windows)
LF or \n (Linux)
CR or \r (old Mac format)

In the save dialog you can also decide how you like to save the file.

But basically an editor should NEVER force any specific line endings on a user's file. This should definitely be a user decision.

Say I edit .ini files. Without CRLF I will trash my system. The same goes for manually editing email files (.eml) or just creating HTTP requests and responses. You MUST adhere to the RFCs, and there you definitely need CRLF.

Enjoy!

Mike
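A consistency check along those lines can be done in one pass over the file. This is a sketch of the idea only, not how Visual Studio does it; `LineEndingCounts` and `count_line_endings` are invented names.

```cpp
#include <cstddef>
#include <string>

struct LineEndingCounts {
    std::size_t crlf = 0;  // "\r\n" (Windows)
    std::size_t lf = 0;    // "\n" alone (Linux)
    std::size_t cr = 0;    // "\r" alone (old Mac format)

    // Consistent means at most one of the three styles appears.
    bool consistent() const {
        return (crlf != 0) + (lf != 0) + (cr != 0) <= 1;
    }
};

LineEndingCounts count_line_endings(const std::string &content) {
    LineEndingCounts counts;
    for (std::size_t i = 0; i < content.size(); ++i) {
        if (content[i] == '\r') {
            if (i + 1 < content.size() && content[i + 1] == '\n') {
                ++counts.crlf;
                ++i;  // The '\n' belongs to this CRLF pair
            } else {
                ++counts.cr;
            }
        } else if (content[i] == '\n') {
            ++counts.lf;
        }
    }
    return counts;
}
```

If the counts are consistent the editor can keep writing that style; if not, it has what it needs to ask the user which one to use.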

msrobots · 2016-06-30 03:19

Or interpret CR and LF correctly.

CR should position the cursor on the first character of the current line, and LF should move to a new line.

Enjoy!

Mike
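Treating the two characters as separate cursor movements might be sketched like this, using 1-based rows and columns as in the move_cursor function earlier in the thread. `Cursor` and `interpret_control` are hypothetical names, and scrolling at the bottom of the screen is left out.

```cpp
struct Cursor {
    unsigned int row;
    unsigned int column;
};

// CR returns to the first column of the current line; LF moves down one
// line and leaves the column alone. Other characters are ignored here.
void interpret_control(Cursor &cursor, char c) {
    if (c == '\r')
        cursor.column = 1;
    else if (c == '\n')
        ++cursor.row;
}
```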
