Seeed Studio's 3-color E-Ink display (176x264 pixels) (with FlexProp C & Catalina) on P2 and P1

Rayman · 2022-01-29 23:21

Just got Seeed's github Arduino files converted over to FlexProp C.
https://www.seeedstudio.com/2-7-Triple-Color-E-Ink-Shield-for-Arduino-p-4069.html

Update 22May22: Improved P1 version a bit and now at 80 MHz.

Update: Fonts work in new version attached here.
You can open 3Color_EInk.c in FlexProp's IDE and run it.

Also, it uses a less-than-full-screen display buffer because the Arduino doesn't have enough RAM for a full-screen one. P2 does, so that needs fixing too.

Update: Now works with both P2 and P1.

For a regular P2 or P1 board, you'd probably have to run about 11 jumper wires. You don't need SDA, SCL, or the two key wires for the buttons.

You may need to supply 5V and put 3.3V on VREF in order to make the level translator work...

Rayman · 2022-01-29 23:26

Used my Arduino layout P2 board to test it out...

evanh · 2022-01-30 00:18

Oh, man, only +1 colour (red). No shades even. When I saw "colour" I got slightly higher hopes.
Certainly are amazing as low-power displays though.

Rayman · 2022-01-30 00:36

Looks like you can't do shades with this one...

Not just low-power, NO-power! I have a grayscale one in the basement that's had the same image with no power for years...

Rayman · 2022-01-30 16:38

Got the fonts working!

I might leave in the option of a small display buffer.
That way, might work on P1 too...

Don't think I've tried FlexProp C with P1 yet.
Think I'll try with the Arty S7 P1V, since it has an Arduino compatible looking layout...

Rayman · 2022-01-30 16:54

I see you can buy compatible looking modules from here:
https://www.buydisplay.com/e-paper-display-module-e-ink-display-kit-manufacturers/e-paper-display-panel?appearance=902
Sizes up to 800x480

They also have yellow-white-black versions...

Rayman · 2022-01-31 20:49

Updated top post with zip where fonts now work.
It should now reproduce the results of the Arduino version it came from.

The SPI.spin2 driver (which uses a cog) doesn't work, so I left it out and used spi.c instead, which doesn't use a cog and bit-bangs with the smart pins.

What is probably better than both of these is a smartpin-based SPI driver.
I see @avsa242 has one in Spin2 using synchronous serial smartpin mode.
Not sure if there's already one in C...

evanh · 2022-01-31 21:45

Bit-bashing works efficiently when optimised for whole bytes. Funnily, a combination of bashing writes and using a smartpin for reads would produce the least brain twist. This is because the I/O staging flops introduce so much lag between output and input.

With pin outputs, everything is fire and forget. Therefore the pins toggle in the same order the software bashes. But pin inputs, back from an SPI device, are dependent on responses to prior outputs. The clock pin in particular. The various round trip latencies, internal to the Prop2, create confusion and are hard to figure out.

Well, as it turns out, a smartpin's synchronous serial mode takes its clock as an input from the pins. This matters. For incoming read data, and the matching incoming clock signal, those Prop2 internal latencies are all equal and therefore don't interfere.

Okay, that's the spiel. Now the code ... man, I haven't looked at SPI in a long time ...

evanh · 2022-02-01 03:09

Oh, Rayman,
Did you know the original source code only does transmits? That kind of shortcuts any effort.

void  spi_start_bb( int cspin, int ckpin, int ckidle )
{
//  This routine sets SPI clock idle level to match ckidle
//    0 is CPOL=0
//    1 is CPOL=1
//
    _pinh( cspin );
    _pinw( ckpin, ckidle );
}


void  spi_txbyte_bb( int txpin, int ckpin, uint8_t data )
{
// Transmit SPI byte data at sysclock/8.  eg: Prop2 at 160 MHz
//    will produce a 20 MHz SPI clock.
//
    __asm volatile {    //directive to use fcache for deterministic timing
        shl data, #32-7  wc

        rep @.rend, #8
        drvc    txpin
        outnot  ckpin
        shl data, #1  wc
        outnot  ckpin
.rend
    }
}

EDIT: Bug fix. Corrected the initial shift amount.
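
For anyone wondering why the initial shift is #32-7, here is a host-side C model of the transmit loop. It is not Propeller code, just a sketch. It assumes that SHL with WC leaves the last bit shifted out in C; `shl_wc` and `spi_tx_model` are made-up names for illustration only.

```c
#include <assert.h>
#include <stdint.h>

// Host-side model of "shl d, #n wc". Assumption: for n in 1..31 the C flag
// ends up holding the last bit shifted out, i.e. original bit 32-n.
static uint32_t shl_wc(uint32_t d, unsigned n, int *c)
{
    assert(n >= 1 && n <= 31);
    *c = (int)((d >> (32u - n)) & 1u);
    return d << n;
}

// Model of the transmit loop: records the level that would be driven onto
// the data pin for each of the 8 clock pulses, first bit first.
void spi_tx_model(uint8_t byte, int bits_out[8])
{
    int c;
    uint32_t data = shl_wc(byte, 32 - 7, &c);   // bit 7 lands in C, bit 6 at bit 31

    for (int i = 0; i < 8; i++) {
        bits_out[i] = c;                // drvc txpin
        data = shl_wc(data, 1, &c);     // shl data, #1 wc -> next bit into C
    }
}
```

Feeding 0xA5 through the model gives 1,0,1,0,0,1,0,1, so the byte goes out MSB first. A shift of #32-8, for example, would put bit 8 (always zero for a byte) into C first, and everything would come out one bit late.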

Rayman · 2022-02-01 13:52

Yes, this code doesn’t really need speed at all…

Was thinking more about ftdi eve and also uSD…

evanh · 2022-02-01 22:52

I've updated the above to use int datatypes the same as yours. Also added the volatile directive for compiling to Fcache in newer versions of flexC. This cleans up erratic I/O timing.

evanh · 2022-02-02 20:38

Huh, with a smartpin mode, clock and data inputs can actually be up to six pins apart. Doesn't work for outputs. It's a little convoluted beyond three pins, and higher probability of clashing with outputs, but completely doable.

evanh · 2022-02-03 06:38

Fixed a bug with the initial bit-shift of the transmitter routine - actually tested the code now.

EDIT: Here's the matching receiver routine. Notably bigger because of the compensation for clock to data lag out and back again. I had to sit down and test this with something real. The lag timing is always tricky.

Tested with 4 MHz sysclock (500 kHz SPI clock) and 360 MHz sysclock (45 MHz SPI clock).

uint8_t  spi_rxbyte_bb( int rxpin, int ckpin )
{
// Receive SPI byte data at sysclock/8.  eg: Prop2 at 160 MHz
//   will produce a 20 MHz SPI clock.
//
    uint8_t  data;

    __asm volatile {    //directive to use fcache for deterministic timing
        outnot  ckpin           //lag start
        nop             //2
        outnot  ckpin           //4

        rep @.rend, #7      //6
        outnot  ckpin           //8
        testp   rxpin  wc       //9  (TESTP does an early sample of pin)
        outnot  ckpin
        rcl data, #1
.rend
        nop
        testp   rxpin  wc
        rcl data, #1
        zerox   data, #8
    }

    return data;
}
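
The receive side builds the byte the other way round from the transmit: each sampled bit is rotated in at the bottom. A minimal host-side C model of just that accumulate step follows. It is not Propeller code, `spi_rx_model` is a made-up name, and the clock-to-data lag compensation is not modelled at all.

```c
#include <assert.h>
#include <stdint.h>

// Host-side model of the receive accumulate: each "rcl data, #1" shifts
// data left by one and brings the sampled pin level (C) into bit 0.
uint8_t spi_rx_model(const int samples[8])
{
    assert(samples != 0);
    uint32_t data = 0xDEADBEEFu;   // deliberately dirty: the real local starts uninitialised

    for (int i = 0; i < 8; i++) {
        uint32_t c = (uint32_t)(samples[i] & 1);   // testp rxpin wc
        data = (data << 1) | c;                    // rcl data, #1
    }
    return (uint8_t)(data & 0xFFu);   // stands in for the final zero-extend
}
```

The samples arrive first bit first, so {1,0,1,0,0,1,0,1} comes back as 0xA5. Starting from a dirty value shows why the upper bits have to be cleared before returning.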

evanh · 2022-02-03 08:27

I've got a smartpin version operational but it doesn't provide any great speed up and is more complicated. A tightly crafted block based routine would offer a bigger payoff. An alternative is using Chip's streamer based EEPROM programmer/booter. Can't beat it for speed. The hardware directly transfers between the EEPROM and hubRAM. Software just manages the start and stop.

One decent side-effect of the testing with smartpins is I did work out configuring of all four SPI clocking modes. So the smartpins can do that no problem. Dunno how much that'd be of interest.
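
For reference, the usual numbering of the four SPI clocking modes is mode = (CPOL << 1) | CPHA. A small C sketch of that decode is below; `spi_mode_t` and `spi_mode_decode` are hypothetical names, not part of any code posted in this thread.

```c
#include <assert.h>

// Standard SPI mode numbering: mode = (CPOL << 1) | CPHA.
typedef struct {
    int cpol;   // clock idle level: 0 = idles low, 1 = idles high
    int cpha;   // 0 = sample on the first clock edge, 1 = sample on the second
} spi_mode_t;

spi_mode_t spi_mode_decode(int mode)
{
    assert(mode >= 0 && mode <= 3);
    spi_mode_t m = { (mode >> 1) & 1, mode & 1 };
    return m;
}
```

The `cpol` field corresponds to the ckidle argument of spi_start_bb() above (0 is CPOL=0, 1 is CPOL=1).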

avsa242 · 2022-02-03 13:52

@evanh,
I'd definitely be interested if you're willing to share. That's one of the biggest limitations of my smartpin-based engine Ray mentioned above: it only supports mode 0.

Cheers

evanh · 2022-02-03 18:27

avsa242,
Well, the smartpin based functional routines are currently in an experimental state. I'm making use of the hardware buffering to give an apparent overhead reduction.

However, the clocking mode thing wasn't a big deal. It was a simple case of inverting the clock pins appropriately. The bit-bashing above is almost there itself. CPOL is in place, just needs CPHA ... In fact I think I could add it ...

EDIT: ... hmm, slight brain fart there. There's no clock inputs when bashing, it's blind to clocking, just relies on predicted timing. But I do have a solution by making an alternate receiver routine. Albeit untested. Note the extra four sysclock ticks - Exactly +180 degree phase shift from the earlier receiver routine.

uint8_t  spi_rxbyte_pha1( int rxpin, int ckpin )
{
// Receive SPI byte data at sysclock/8.  eg: Prop2 at 160 MHz
//   will produce a 20 MHz SPI clock.
// Uses CPHA=1 clock phase
//
    uint8_t  data;

    __asm volatile {    //directive to use fcache for deterministic timing
        outnot  ckpin           //lag start
        nop             //2
        outnot  ckpin           //4

        rep @.rend, #7      //6
        outnot  ckpin           //8
        rcl data, #1        //10
        outnot  ckpin           //12
        testp   rxpin  wc       //13  (TESTP does an early sample of pin)
.rend
        rcl data, #1
        waitx   #2
        testp   rxpin  wc
        rcl data, #1
        zerox   data, #8
    }

    return data;
}

evanh · 2022-02-03 19:49

And, rather coolly, no changes needed for the transmitter routine. That's because the clock-data phase there is solidly +90 degree as is, which allows the SPI device to be either way inclined and still accurately pick up what you've sent.

evanh · 2022-02-03 21:26

@evanh said:
avsa242,
Well, the smartpin based functional routines are currently in an experimental state. I'm making use of the hardware buffering to give an apparent overhead reduction.

Here's an example test I'm using for reading the contents of the EEPROM on the Eval Board:

        ...
        _pinl( cspin );
        spi_txbyte_sp( txpin, ckpin, 0x0b );  // Fast Read
        spi_txbyte_sp( txpin, ckpin, 0 );  // addr
        spi_txbyte_sp( txpin, ckpin, 0 );  // addr
        spi_txbyte_sp( txpin, ckpin, 0 );  // addr
        spi_rxbyte_sp( rxpin, ckpin );  // dummy read
        spi_rxlong_sp( rxpin, ckpin );  // tx echo
        _wxpin( rxpin, 31 );  // on the fly adjust to 32-bit shifter
        for( idx = 0; idx < 100; idx++ )
            buff[idx] = spi_rxlong_sp( rxpin, ckpin );
        _wxpin( rxpin, 7 );  // adjust back to 8-bit shifter
        _pinh( cspin );
        for( idx = 0; idx < 100; idx++ )
            printf( " %x", buff[idx] );
        printf( "\n" );
        ...

You'll note the apparent excess leading "tx echo" longword receive. That does two things. One, the way the hardware buffering is utilised means that it'll immediately return a garbage value picked up from the previously inactive rx data pin. Two, it initiates clocking of 32 unbroken clock pulses. Hence the matching change in the shifter length.

EDIT: Actually, I might have "dummy read" and "tx echo" comments reversed from reality. That means the two parts I listed are actually split across the two of them. The rxbyte() returns immediately but the rxlong() still initiates the 32 clock pulses while absorbing the dummy byte from SPI Fast Read command.

evanh · 2022-02-04 03:10

Cool! The Winbond W25Q128 used for the Eval Board's EEPROM indicates support for SPI clocking modes 0 and 3. With the smartpins, modes 1 and 3 (CPHA=1) can run faster than modes 0 and 2 - Tx smartpin is affected by the I/O staging lag. So ... setting mode 3 and then winding the sysclock ratio down from 8 to 4 ... works ... and at 360 MHz sysclock ... yes! So now tested at 90 MHz SPI clock.

Also gave sysclock/3 (smallest possible with smartpins) a shot and it worked up to 90 MHz SPI clock as well. Beyond that it seemed to give up. Most likely due to attenuation of signal. The EEPROM has very long tracks around the revB Eval Board I believe.
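
The clock figures quoted in this thread all come from the same division. A trivial C helper makes the arithmetic explicit; `spi_clock_hz` is a made-up name for illustration.

```c
#include <assert.h>
#include <stdint.h>

// SPI clock in Hz for a given sysclock and sysclock-to-SPI divider.
uint32_t spi_clock_hz(uint32_t sysclock_hz, uint32_t divider)
{
    assert(divider != 0);
    return sysclock_hz / divider;
}
```

So 360 MHz at sysclock/8 is 45 MHz and at sysclock/4 is 90 MHz. By the same arithmetic, 90 MHz at sysclock/3 would correspond to a sysclock of about 270 MHz.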

Test code has changed. Made a custom routine for block read of one 256 byte page. It runs an unbroken 2048 clock pulses and writes the incoming data straight to hubRAM.

        _pinl( cspin );
        _wrpin( txpin, txmode );    // enable tx output drive
        spi_txbyte_sp( txpin, ckpin, 0x0b );    // SPI Fast Read
        spi_txbyte_sp( txpin, ckpin, 0 );    // addr
        spi_txbyte_sp( txpin, ckpin, 0 );    // addr
        spi_txbyte_sp( txpin, ckpin, 0 );    // addr
        spi_rxbyte_sp( rxpin, ckpin );    // dummy byte read (SPI Fast Read)
        _wrpin( txpin, txmode_d );    // disable tx output drive
        idx = spi_rxpage_sp( rxpin, ckpin, buff );    // SPI page read (256 bytes)
        _pinh( cspin );

evanh · 2022-02-04 03:52

It did take a lot of experimenting with smartpin status checking and deciding on appropriate sequencing. In the end there ain't many options that can work. Aside from plain screwing up at times, like not reading the docs first, there was a notable hurdle to get my head around: ... Knowing the difference between a smartpin buffer filled and one that has stale data already loaded. Hint: Can only be done with full state knowledge. And, guess what, entry to routine never has full knowledge.

So, trick is to avoid needing to check buffer full statuses until state of smartpin is known within the routine.

EDIT: Removed snippets and posted runnable code in next post

evanh · 2022-02-04 03:58

Also found __asm volatile is not needed when using smartpins. I can see on the scope the sometimes small variations due to hubexec branches but the straight line execution is deterministic and that's all that matters for firing of a few consecutive instructions to initiate an SPI transfer with smartpins.

Runnable test code attached. It prints to terminal the first 256 bytes of the EEPROM. So make sure the "Flash" switch is turned on. Should work with any Prop2 board that has a boot EEPROM.

Update: Merged in the earlier CPHA1 bit-bashed routines as an option.

Rayman · 2022-02-06 16:34

@System (or any moderator) Can you please move this thread to the C/C++ category?

Going to try to get this code working with P1 too (actually, P1V, but same thing).

Rayman · 2022-02-06 17:07

Ran into major problems trying to make this work with P1.
Ok, actually P1V on a Digilent Arty S7 FPGA board.

First of all the grove connector on the e-ink board interferes with a jumper on the FPGA board, but that wasn't a big deal, I just cut off the grove connector.

The real problem I had was that the P1V would freeze when P10 was set low, which is actually the first thing this code does with I/O.

Took me a long time to figure out that for some reason Digilent decided to connect one of the Arduino style ICSP pins to Slave Select, which also happens to be connected to P10.

Unfortunately, this pin is connected to RESET on the e-ink board... This seems to be a common arrangement for Arduino boards.

Anyway, think I can just cut a trace on the E-Ink board and get it going with P1...

Publison · 2022-02-06 18:57

Moved to C++

Rayman · 2022-02-06 19:43

Got it working on P1!
(P1V actually with Arty S7 FPGA, but should also work on real P1 too)

This was actually really easy to convert from P2 to P1 (once I figured out the reset issue described above and cut a trace).
My first time using FlexProp C on P1, I think...

Uses 26964 bytes. There's almost enough room left to increase the screen buffer size to full screen (think that takes ~5kB).
Ditching some of the fonts would probably get it there...
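
As a rough check on the ~5kB figure, here is a small C sketch of the buffer arithmetic. It assumes 1 bit per pixel for a single colour plane with rows padded to whole bytes; that is a guess at the driver's layout, and `plane_bytes` is a made-up name.

```c
#include <assert.h>
#include <stdint.h>

// Bytes needed to hold one full-screen bit plane at 1 bit per pixel,
// with each row padded up to a whole number of bytes.
uint32_t plane_bytes(uint32_t width, uint32_t height)
{
    assert(width > 0 && height > 0);
    return ((width + 7u) / 8u) * height;
}
```

For 176x264 that is 22 bytes per row times 264 rows, or 5808 bytes per plane. If the black and red planes each needed their own full-screen buffer, it would be double that.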

Code posted in top post.

Rayman · 2022-02-06 22:33

As an exercise, I got this working with Catalina 4.9 on P1. Code in top post.

Rayman · 2022-03-03 16:22

Was just thinking that this display with a P1 and coin cell battery would be a neat way to keep track of things like dog feeding.
Thinking of a board with just on/off button, 14 buttons for AM/PM Mon..Fri.

Maybe better if didn't need on/off button, but not sure I know how to do that... Guess would need DPDT buttons...

Actually, guess it needs a clear button for new week too...

Rayman · 2022-03-03 18:15

Thread about coin cell usage here: https://forums.parallax.com/discussion/167939/parallax-propeller-coin-cell

Rayman · 2022-05-22 20:55

Made a slight update to the P1 FlexProp code, posted to top post.

Also, figured out how to use rotation to get it in landscape mode.
A bit of a trick to it because some functions still use absolute pixel addressing even after specifying a rotation.
Also, the init function resets rotation.
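
For functions that still want absolute pixel addresses, the landscape coordinates have to be mapped back by hand. A generic C sketch of that mapping for a 90 degree clockwise rotation follows. Whether the driver rotates clockwise or counter-clockwise is an assumption here, and `landscape_to_native` is a made-up helper, not something in the posted code.

```c
#include <assert.h>

#define PANEL_W 176   // native (portrait) width in pixels
#define PANEL_H 264   // native (portrait) height in pixels

// Map a landscape coordinate (x in 0..263, y in 0..175) to the native
// portrait coordinate, for a 90 degree clockwise rotation of the image.
void landscape_to_native(int x, int y, int *nx, int *ny)
{
    assert(x >= 0 && x < PANEL_H && y >= 0 && y < PANEL_W);
    *nx = PANEL_W - 1 - y;
    *ny = x;
}
```

With this convention the landscape origin (0,0) lands at native (175,0), the top-right corner of the portrait panel. If the driver rotates the other way, the mapping would be nx = y, ny = PANEL_H - 1 - x instead.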

Here's a pic and code with rotation. So, I think the code is done.

More about the project is here: https://forums.parallax.com/discussion/174473/weekly-chore-tracker

Rayman · 2022-05-22 23:42

Looks like the CR2032 will last at least a week. But, not much more, with the current code and config.

The partial refresh feature of the driver is great.
Lets me blacken the red squares without having to remember the state of the display.
Also, it's faster than doing the full screen.
