Propeller 1 and touchscreens, continued

average joe · 2016-05-21 04:17

After playing with a few different ways of connecting a display to the Propeller, I'm wondering what others think. Memory performance vs free pins vs glue logic? It seems to be a giant case of tradeoffs, even though I'm perfectly happy with the prototype board I'm currently using (although I REALLY wish I had 4 QPI flash chips onboard).

If you really want the most free pins it makes sense to me to run the display in 8 bit mode and use a couple QPI chips: 12-14 pins used. If you want the fastest memory transfers (the current design) then use all the pins. I've gone back and forth on the next iteration of hardware I want to build and I'm hoping to find some consensus on an acceptable design. There was quite a bit of interest in the old design except for the no free pins part. I've had quite a bit of luck doing surface mount, although BGA is at the limit of my process as I have yet to try one. If I can find a consensus I could see a small run in the future.

I know in the propeller world we are used to having our cake and eating it too but let's try to be realistic here. So.

MINIMUM number of free pins? 4? 8? 12? 20? (this is going to be the biggest fight)
HOW much additional memory? 256k? 1m? 256m?
How fast do you need to access the memory?
Is the memory dedicated to the display or do you plan on using it for program data?
Price point?
Through hole for self assembly or pre-assembled SMT?

I think the most important part is to keep it noob-friendly. Some way to go from the simple toggle tutorial to the first "hello world".

jmg · 2016-05-21 10:46

You do not mention the screen resolution targeted, or if this needs to do both VGA and LCD interfaces, or if just LCD is ok?

average joe · 2016-05-21 13:13

I have two target resolutions in mind, 320 * 240 and 480 * 800. I see merits to both. Regarding VGA, Ray had a SSD to HDMI board I was trying to get my hands on. IIRC it did HDMI (or maybe it was VGA) and used the same controller the 7" displays are using. It seemed like a drop-in conversion from LCD to VGA.

The trickiest part for me was dealing with the two different resolution displays. I had one spin program that was able to run both but the filesystem needed some tweaks, along with the fact every program needed to be written for each display.

T Chap · 2016-05-21 22:25

I am a bit sleep deprived, but can jmg or someone refresh my memory on why LCD/VGA graphics can't have access to HUB memory? Is it that the HUB access time is not fast enough to read HUB and update pins to meet VSYNC and HSYNC? I am starting to study the LCD timings to get an understanding of how this stuff works.

I want to ask a simple question about LCD drivers. If you tied RGB inputs HIGH, and only ran the VSYNC, HSYNC and CLK pins, does the LCD care that the RGB pins are fixed, and it will produce white for all pixels?

msrobots · 2016-05-21 22:33

@TChap,

It is the size of the HUB. A decent screen size simply eats most of your HUB RAM, especially with more than 256 colors.

If I remember correctly the controller Ray is using has 1 MB ram on its own.

Enjoy!

Mike

jmg · 2016-05-21 22:38

average joe wrote: »

I have two target resolutions in mind, 320 * 240 and 480 * 800. I see merits to both.

For LCD and those resolutions, you could also look at FTDI EVE & EVE2

Or this RA8875 alternative to SSD1963
https://www.adafruit.com/products/1590

average joe wrote: »

If you really want the most free pins it makes sense to me to run the display in 8 bit mode and use a couple QPI chips. 12-14 pins used. If you want the fastest memory transfers *the current design* then use all the pins. I've gone back and forth on the next iteration of hardware I want to build and I'm hoping to find some consensus on an acceptable design. There was quite a bit of interest in the old design except for the no free pins part. I've had quite a bit of luck doing surface mount although BGA is on the limit of my process as I have yet to try one.

To confuse you with more choice, there is now HyperRAM (BGA but 1mm & 25 pins 6x8)

IS66WVH8M8ALL-166B1LI (1.8V) 8M x 8 RAM shows stock at Digikey, but going fast !
3V ones due next week.
Some timing details are ??, but this would take resolutions above 480 * 800.

Clock speed itself looks tolerant, but there is this unexpected detail
CS# must go high before tCSM is violated. tCSM is given as 1us or 4us ?

The forced CS# Hi time is quite short
Chip Select High Between Transactions tCSHI > 10.0 ns
Looks like a P1 timer in Duty can meet that, but the catch here is after every CS# cycle, you need to respecify the read address, = 13 edges before Rd Data appears

Two timers should manage LCD_CLK & HR_CK, and some nimble PASM could manage the address. For budgets, something like:
Address & delays can be @ 20MHz, and possibly data can read at 40MHz, using just the NCO Clock from P1.
Taking a rough 25% dead time for Address and allowing time for auto-refresh windows,
an 800x480 LCD scan would be around 25ms. (ie P1 clocking the HyperRAM Pixel image)
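
As a rough cross-check of the "around 25ms" figure, here is a minimal Python sketch. It assumes one pixel is delivered per 20 MHz clock and that the 25% dead time is a fraction of the total scan time; both are one possible reading of the budget above, not datasheet numbers, and the function name is only illustrative.

```python
# Rough scan-time budget for streaming an 800x480 image out of HyperRAM.
# Assumptions (not taken from a datasheet): one pixel per 20 MHz clock,
# and 25% of the total scan lost to address setup and refresh windows.

def scan_time_ms(width, height, pixel_clock_hz, dead_fraction):
    """Return the time needed to clock out one frame, in milliseconds."""
    active_s = (width * height) / pixel_clock_hz   # time spent moving pixels
    total_s = active_s / (1.0 - dead_fraction)     # stretch to cover dead time
    return total_s * 1000.0

if __name__ == "__main__":
    print(f"{scan_time_ms(800, 480, 20e6, 0.25):.1f} ms per frame")
```

Under these assumptions the result lands in the mid-20 ms range, in line with the estimate above. Reading the 25% as overhead added on top of the active time instead gives a slightly smaller number.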

T Chap · 2016-05-21 22:48

MS, you have 32k HUB - application = space for graphics, however as I understand it you only have 512 longs in the core for real world graphics visible at a given time. That's why I was asking why if you have 32k HUB - program, are we limited to 512 longs on the LCD drivers we have been using. I always thought that it was access time to the HUB else we would be using the HUB memory directly versus have graphics present in core.

T Chap · 2016-05-21 22:49

BTW for ave joe, I have concluded that best choice for LCD is EVE2 and will be moving my stuff towards that design.

msrobots · 2016-05-22 00:37

TC,

usually the driver is in the Cog and sometimes some font data. Screen data is usually in the Hub, and the Video driver can handle that from the Hub. But with lesser resolution or lesser color information per pixel

Alas 480 * 800 * 1 byte does not fit. You are already at 384 KB. So the Hub is too small for a bitmap like that.
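
The arithmetic is easy to check with a small Python sketch. The 32 KB hub size is the P1 figure mentioned earlier in the thread; the helper names and the list of display modes are just illustrative.

```python
# Compare full-screen bitmap sizes against the Propeller 1 hub RAM.
HUB_RAM_BYTES = 32 * 1024  # P1 hub RAM, shared with the application

def framebuffer_bytes(width, height, bits_per_pixel):
    """Bytes needed for one full bitmap, rounded up to a whole byte."""
    return (width * height * bits_per_pixel + 7) // 8

def fits_in_hub(width, height, bits_per_pixel):
    """True if the bitmap alone would fit in hub RAM (ignoring program code)."""
    return framebuffer_bytes(width, height, bits_per_pixel) <= HUB_RAM_BYTES

if __name__ == "__main__":
    for w, h, bpp in [(320, 240, 8), (480, 800, 8), (480, 800, 16)]:
        size = framebuffer_bytes(w, h, bpp)
        print(f"{w}x{h} @ {bpp} bpp: {size} bytes, fits in hub: {fits_in_hub(w, h, bpp)}")
```

Even 320 * 240 at one byte per pixel comes to 76,800 bytes, so both target resolutions need external memory for a full bitmap.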

Enjoy!

Mike

T Chap · 2016-05-22 01:10

What you need then is external RAM and look up the color for each pixel one at a time as it streams? That is, if there is speed to do a RAM read. I would guess that it would be ideal to have a method of RAM that could be loaded with the data for bitmaps, then no SD card would be required for the device. Otherwise in the case of the HyperRAM being discussed, the SD card will be required to be read to load graphics data into RAM.

This is a good read I just ran across today on Prop + RAM

https://www.parallax.com/sites/default/files/downloads/AN012-SRAM-v1.0.pdf

jmg · 2016-05-22 01:50

T Chap wrote: »

I would guess that it would be ideal to have a method of RAM that could be loaded with the data for bitmaps, then no SD card would be required for the device. Otherwise in the case of the HyperRAM being discussed, the SD card will be required to be read to load graphics data into RAM.

Any RAM is going to need to have fonts/bitmaps in NV memory somewhere...

T Chap · 2016-05-22 01:59

I was referring to the possibility of flash based, not even sure that is an option.

jmg · 2016-05-22 02:28

Looking more at HyperRAM, that looks to pair well with both P2 and P1.
P1 can clock it at good speeds, and the LCD displays look to tolerate short gaps in the clocking.

That leaves width - probably simplest is 2 HyperRAM and a 16b LCD BUS.

For VGA, I think things get more complex, but P2 can probably manage the buffering and CLUT, and do that with a Single HyperRAM

Dr_Acula · 2016-05-22 04:26

I wish I had more time to devote to this. Fundamentally, need more pins. To talk to these touchscreens needs a 16 bit data bus, and 16 to 19 bit address bus for external ram. The solution we had with a propeller and external TTL chips as a counter was pretty good, but it wasn't great with all those TTL chips.

The three solutions I've played around with would be dual propeller, or propeller plus cheap fpga, or just do it all on a fpga. The latter are now sub $20 and I've been building designs that have enough pins to run external touchscreen, external ram, vga, keyboard, sd card and have enough smarts to run CP/M.

The dual propeller option ends up with the address of the ram on one prop and the data bus on the other, which isn't really ideal as you need to coordinate things between the two props. Data is probably best sent in burst mode, eg set up address, and send 256 x 16 bits as a group. Essentially the second prop becomes just a counter, and it is a bit marginal to justify a $9 propeller chip plus all the support components, vs five TTL counters.

There certainly could be merit going back to our original design and doing it all in surface mount. Then it would be the same size as the touchscreen display.

Having found too many limitations with the propeller (actually, just one limitation, it doesn't have enough pins), I've drifted over to the FPGA world. A $20 cyclone II can do similar things to a propeller - sd, keyboard, vga, and ports and still have pins left over. Downside is having to use very old versions of programming languages in CP/M (though I kind of like the simplicity of 1980s C).

It would probably be a bit rude though to brainstorm a fpga touchscreen design on a propeller forum

Then again, it will be hard for the propeller to compete with the ever decreasing price of fpga chips, when the default for these chips is 70 to 100 pins.

jmg · 2016-05-22 05:38

Dr_Acula wrote: »

I wish I had more time to devote to this. Fundamentally, need more pins. To talk to these touchscreens needs a 16 bit data bus, and 16 to 19 bit address bus for external ram. The solution we had with a propeller and external TTL chips as a counter was pretty good, but it wasn't great with all those TTL chips.

The new low pin count RAMs turn all this on its head

The best pairing seems to be to use a single 8Mx8 3v3 HyperRAM, (or even 2 QuadSPI SRAM eg Microchip) and a modest FPGA in QFN48.
eg Lattice iCE5LP1K,iCE5LP2K,iCE5LP4K

The iCE5LP1K is priced at 3.156 (50+), the iCE5LP2K at 4.1032 (50+), with 39 I/O.
That maps as Data flow of 8b to HyperBUS and Prop, and 16b to LCD, plus sync lines 39-24 = up to 15 io for control lines.
This level of modest FPGA has a PLL and can support a CLUT, or can pass through Pixel info unchanged.
During Active Scan, the HyperRAM is controlled by the QFN48 part, and during pauses/blanking it can be loaded from Prop.
Prop can be preparing info to load during the scan times, or idling if there is nothing to do.

That pairing of parts would support well above 800x480 LCDs, and with some line buffering, could do VGA too.
EVE2 is another choice, tho that tops out at 800x480, and is LCD only (I think no VGA ?)

Dr_Acula · 2016-05-22 06:15

@jmg, Ah, technology may indeed come to the rescue!

The way these touchscreens work is you do a whole lot of setting things up, but then when you send a burst of data it is as simple as
1) put 16 bits of data on the data bus (one word, which is pixel data RRRRRGGGGGGBBBBB)
2) toggle the write pin on the touchscreen
3) ? toggle a pin on the ram to send out the next word.

You set up the touchscreen so it knows in advance how many words you are going to send. If you can set up the ram chip so it also knows how many to send, then it becomes a very tight pasm loop.
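
To make the pixel word concrete, here is a minimal Python sketch of packing 8-bit red, green and blue values into the 16-bit RRRRRGGGGGGBBBBB layout described above. The function names are hypothetical, and truncating the low bits of each channel is just one simple way to reduce 8 bits to 5 or 6.

```python
# Pack 8-bit R, G, B into one 16-bit word laid out as RRRRRGGGGGGBBBBB.
def pack_rgb565(r, g, b):
    """Keep the top 5 bits of red, 6 of green and 5 of blue."""
    return ((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3)

def burst_words(pixels):
    """Turn a sequence of (r, g, b) tuples into the words sent in one burst."""
    return [pack_rgb565(r, g, b) for (r, g, b) in pixels]

if __name__ == "__main__":
    # One word per pixel: this is what would be placed on the 16 bit bus
    # before each toggle of the write pin.
    for word in burst_words([(255, 0, 0), (0, 255, 0), (0, 0, 255)]):
        print(f"{word:016b}")
```

The real inner loop would be PASM driving pins; the sketch only shows how each word is formed.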

So... maybe you don't even need the fpga? Maybe do this with some of the serial ram chips in parallel to create a 16 bit bus?

And... some of those 16 pins can be shared with other things, eg the SD card. Move a block of data into hub first, then move it out to ram.

Hmm - 16x1mbit srams gives '1 megapixels'. But doing some searches brings up newer 4mbit spi srams. These could be interesting especially if they come with 4 bits. Then could do this with 4 chips. Will do some searching...

jmg · 2016-05-22 06:42

Dr_Acula wrote: »

Maybe four quadSPI ram chips. You have got me thinking here

Yes, that's another valid approach. Less total memory, and lower MHz, and cannot do a CLUT output, but uses simple repeating hardware.
(2 SPI RAMS are about the same price as the FPGA.)

Dr_Acula wrote: »

But doing some searches brings up newer 4mbit spi srams.

Any links ? Are those SRAMs, or the more niche FRAM ?

MJB · 2016-05-22 15:00

@T_Chap
if you don't need fast dynamic content displayed
there are those cheap ($6-7) SPI QVGA 320x240 displays which only need a few lines, have their own memory, so you don't need a big display buffer in RAM.
I used one with TACHYON and like them.

average joe · 2016-05-23 04:57

Dr_Acula wrote: »

I wish I had more time to devote to this. Fundamentally, need more pins. To talk to these touchscreens needs a 16 bit data bus, and 16 to 19 bit address bus for external ram. The solution we had with a propeller and external TTL chips as a counter was pretty good, but it wasn't great with all those TTL chips.

I'm still in love with this board (after doubling the amount of SRAM) although hindsight is a killer. The REV3 (aug2012) is perfect for using the dual 240 x 320 displays. Unfortunately when you plug a 7" display in it becomes a bit uncomfortable to use. The stacked SRAM chips don't help either. I'm thinking an 8bit transfer mode could save a bunch of pins. Addresses would have to be loaded with 3 transfers but that's not a big deal IMO.
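
As an illustration of the "3 transfers" idea, here is a minimal Python sketch that splits an address of up to 24 bits into three 8-bit bus transfers and reassembles it. The low-byte-first order and the function names are assumptions for the example; the actual latch order would be whatever the glue logic expects.

```python
# Load an address over an 8-bit bus as three byte-wide transfers.
# Assumption: low byte first, then middle, then high.

def address_to_bytes(address):
    """Split an address (up to 24 bits) into three bytes, low byte first."""
    if not 0 <= address < (1 << 24):
        raise ValueError("address must fit in 24 bits")
    return [(address >> shift) & 0xFF for shift in (0, 8, 16)]

def bytes_to_address(parts):
    """Reassemble what the address latches would hold after three transfers."""
    low, mid, high = parts
    return low | (mid << 8) | (high << 16)

if __name__ == "__main__":
    transfers = address_to_bytes(0x12345)
    print([f"{byte:02X}" for byte in transfers])
    print(hex(bytes_to_address(transfers)))
```

With an auto-incrementing address counter the three setup transfers are only paid once per burst, which is why the overhead is small.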

Dr_Acula wrote: »

The three solutions I've played around with would be dual propeller, or propeller plus cheap fpga, or just do it all on a fpga. The latter are now sub $20 and I've been building designs that have enough pins to run external touchscreen, external ram, vga, keyboard, sd card and have enough smarts to run CP/M.

P1+CPLD is still my vote. I'm thinking you could get away with using 12-14 propeller pins total. The release of the P1v core makes it VERY tempting to just go the FPGA route...

Dr_Acula wrote: »

The dual propeller option ends up with the address of the ram on one prop and the data bus on the other, which isn't really ideal as you need to coordinate things between the two props. Data is probably best sent in burst mode, eg set up address, and send 256 x 16 bits as a group. Essentially the second prop becomes just a counter, and it is a bit marginal to justify a $9 propeller chip plus all the support components, vs five TTL counters.

I've always felt like the second propeller was overkill.

Dr_Acula wrote: »

There certainly could be merit going back to our original design and doing it all in surface mount. Then it would be the same size as the touchscreen display.

I've already played with it and it comes out TINY! Looks really nice too!

Dr_Acula wrote: »

It would probably be a bit rude though to brainstorm a fpga touchscreen design on a propeller forum

Maybe, but if it was based around the p1v?? The OBVIOUS answer is to stick a memory controller to portB that handles all the display and memory logic.

IIRC, one of your other designs used QPI parts in parallel. I did create a flash QPI module that worked pretty well as a SD card replacement. The SD card was the BIGGEST bottleneck (and uses 1/3 program memory right off the top)

jmg wrote: »

Looking more at HyperRAM, that looks to pair well with both P2 and P1.
P1 can clock it at good speeds, and the LCD displays look to tolerate short gaps in the clocking.

That leaves width - probably simplest is 2 HyperRAM and a 16b LCD BUS.

For VGA, I think things get more complex, but P2 can probably manage the buffering and CLUT, and do that with a Single HyperRAM

I'll have to take a look at these. The simplest solution I was able to come up with is 2 QPI devices on an 8 bit bus. Leaves you a bunch of pins free.

Need to digest all this again but no one's hit on my biggest question. HOW MANY PINS? (yeah I know, all 64 right? lol) In all seriousness, REV3 has 4 pins FREE: 31..28. You can get 2 more if you don't need audio, and 4 more if you make the SD card use the 16b bus.

I think the last compact design I did used 8b data and 4b control. There was a ton of logic but it fit into a $3.80 XC2C64A - http://www.digikey.com/product-detail/en/xilinx-inc/XC2C64A-7VQG100C/122-1409-ND/949460

Or if you want to completely remove the LCD/MEM bus from the propeller bus the XC2C256 could fit the mess of logic needed. $15.45 http://www.digikey.com/product-detail/en/xilinx-inc/XC2C256-7VQG100C/122-1402-ND/949453

For my "pet project" REV3 is close. 16Mbit sram 4x 4m chips. The MAIN bottleneck was SD card. A FLASH "disk" should solve that problem. I'll have something prototyped next week.

http://www.mouser.com/ProductDetail/Alliance-Memory/AS6C4008-55PCN/?qs=E5c5%2bmu3i3%2bMOyro1Tlhzg==
If I can find a decent replacement for these SRAMs (smt, better price point) I think it would be a start. $5 a chip isn't bad but using 4 to get to 16Mbit seems a bit wasteful.

I've done exposed pad QFN, TQFP and the like with great success. BGA is a little scary but I might be willing to try it.

jmg · 2016-05-23 06:07

average joe wrote: »

P1+CPLD is still my vote. I'm thinking you could get away with using 12-14 propeller pins total. The release of the P1v core makes it VERY tempting to just go the FPGA route...

Yes, but even Std P1 can augment with a more modest FPGA than p1V needs.

average joe wrote: »

I think the last compact design I did used 8b data and 4b control. There was a ton of logic but it fit into a $3.80 XC2C64A - http://www.digikey.com/product-detail/en/xilinx-inc/XC2C64A-7VQG100C/122-1409-ND/949460

Or if you want to completely remove the LCD/MEM bus from the propeller bus the XC2C256 could fit the mess of logic needed. $15.45 http://www.digikey.com/product-detail/en/xilinx-inc/XC2C256-7VQG100C/122-1402-ND/949453

The QFN48 parts I spec'd above are similar price to the 2C64A, but are smaller and open some more choices.
They have RAM, so you could do a CLUT to reduce the pixel load.

I think the HyperRAM may also need a pixel buffering scheme, as the spec does not quite cover streaming a full line at fixed clock spacing.
Probably ok for LCD, but not tolerated for VGA.

In that case, small RAM becomes mandatory, and simpler CPLDs no longer quite do it.

My only grumble about the QFN48 parts, is Lattice do not yet seem to have a 48 pin Eval Board. (?!)

jmg · 2016-05-24 05:13

jmg wrote: »

Looking more at HyperRAM, that looks to pair well with both P2 and P1.
P1 can clock it at good speeds, and the LCD displays look to tolerate short gaps in the clocking.

That leaves width - probably simplest is 2 HyperRAM and a 16b LCD BUS.

For VGA, I think things get more complex, but P2 can probably manage the buffering and CLUT, and do that with a Single HyperRAM

HyperRAM and P1 :
I found this in the data :

" The host system may also effectively increase the tCSM value by explicitly taking
responsibility for performing all refresh and doing burst refresh reading of multiple sequential rows in order to catch up on distributed refreshes missed by longer transactions. "

I think that opens up some simple arrangements, with some caveats :
* The HyperRAM can be addressed once only per frame, and then Stream Clocked (Linear mode). This Address once is extremely simple to configure.
* This refreshes the used locations, but others are not refreshed.
(ie you have one display, but it can be many sizes)

Looking at some LCD data (800*480), and the HyperRAM, I get numbers like this
This LCD has two modes, and DE mode is simpler = just one framing pin needed.

Shortest time Specs :
((800+88)*480+4*(800+88))/40M = 10.75 ms - rest of HyperRAM MAX of 64 ms is free for Write times. (40M LCD is 20M DDR clock from P1)
(or if meet right most LCD columns MAX time, for frame that gives 240tH times, is just inside HR limit of 64ms and a 10M scan rate gives 32.77% time free for WrBW, or 419136 P1 Opcodes)

My guess is 10.75ms & 64ms will be ok, which gives 83.20% of free time or 1065000 P1 Opcodes, for Display writes.
(for VGA, just the Frame time is available for Display Writes)

Memory used: P1 generates the LCD_DE framing pulse train, and HyperRAM clocks out ((800+88)*480+4*(800+88)) = 429792 locations, of which 800*480 are active, the rest are black.
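
The shortest-time numbers above can be reproduced with a short Python sketch. It takes the 800+88 clocks per line, 480+4 lines, 40 MHz LCD clock and 64 ms HyperRAM limit from the figures in this post, and assumes a P1 running at 20 MIPS (80 MHz, 4 clocks per instruction). The names are illustrative only.

```python
# Frame-time and free-time budget for a DE-mode 800x480 LCD fed from HyperRAM.
H_ACTIVE, H_BLANK = 800, 88     # clocks per line: active + blanking
V_ACTIVE, V_BLANK = 480, 4      # lines per frame: active + blanking

def frame_clocks():
    """Total locations clocked out per frame, including the black ones."""
    return (H_ACTIVE + H_BLANK) * (V_ACTIVE + V_BLANK)

def budget(pixel_clock_hz, refresh_limit_s=0.064, p1_ips=20e6):
    """Return (frame time in ms, free fraction, free P1 opcodes).

    refresh_limit_s is the 64 ms HyperRAM limit quoted in the post;
    p1_ips assumes an 80 MHz P1 at 4 clocks per instruction.
    """
    frame_s = frame_clocks() / pixel_clock_hz
    free_s = refresh_limit_s - frame_s
    return frame_s * 1e3, free_s / refresh_limit_s, free_s * p1_ips

if __name__ == "__main__":
    frame_ms, free_frac, opcodes = budget(40e6)
    print(f"{frame_clocks()} locations, {frame_ms:.2f} ms per frame")
    print(f"{free_frac:.2%} free, about {opcodes:,.0f} P1 opcodes")
```

At 40 MHz this gives roughly 10.74 ms per frame and a little over 83% free, matching the rounded figures quoted above.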

During HR Write LCD_DE is LOW, ignoring RWDS from writes.
During HR Read, Clock LCD is from HR_RWDS, fed to AUP1G57 XNOR doubler, (max Fdo 40MHz, generates =\_/= clocks from XNOR ) XNOR Not required for VGA.

P1 generates
LCD_DE
HR_CSN
HR_CLK

16b BUS is shared between LCD and P1 and HR

This would need a PCB with 2 x HyperRAM mounted, mapping to P1 pins & LCD connector, for 16b Pixels.

T Chap · 2016-05-24 05:36

Do you have a way to draw this? This would need a PCB with 2 x HyperRAM mounted, mapping to P1 pins & LCD connector, for 16b Pixels.

How would you stream a camera to LCD using the dual HR as the RGB source, since you can't read and write at the same time? I am assuming for LCD you are speaking about stored data only, from SD>HR>LCD

T Chap · 2016-05-24 05:41

How do you show something like a button effect, or progress bar, clock etc streaming from HR? Are you jumping around to various memory locations per screen draw to pick and choose what will make up the image? Or is this a single contiguous read starting at a certain address? Hard to visualize the dynamic use.

jmg · 2016-05-24 05:53

T Chap wrote: »

Do you have a way to draw this? This would need a PCB with 2 x HyperRAM mounted, mapping to P1 pins & LCD connector, for 16b Pixels.

Yes. P1-HR-LCD is parallel, I think that can work, with separate HR_CS and LCD_DE lines.

T Chap wrote: »

How would you stream a camera to LCD using the dual HR as the RGB source, since you can't read and write at the same time?

I've not looked at cameras, but the DE mode for LCDs allows a single simple framing pulse. (A wide gap implies Frame)

T Chap wrote: »

I am assuming for LCD you are speaking about stored data only, from SD>HR>LCD

Yes SD or Flash for Icons and Fonts etc

T Chap wrote: »

How do you show something like a button effect, or progress bar, clock etc streaming from HR?
Are you jumping around to various memory locations per screen draw to pick and choose what will make up the image? Or is this a single contiguous read starting at a certain address? Hard to visualize the dynamic use.

It's a single contiguous read starting at a certain address. During that time, one P1 COG is outputting a clock, and doing the simple DE modulation 800:88.

You need to write to the RAM, during those free time budgets to change areas like progress bars, button effects etc.

Dr_Acula · 2016-05-25 23:26

Looking at some of the things Rogloh has been doing with the P1V and massive amounts of hub ram and external ram, I am wondering if this might be worth pursuing?

http://forums.parallax.com/discussion/161587/p1v-with-2mb-of-hub-visible-ram-and-now-32mb-of-sdram/p1

cheezus · 2018-12-04 08:31

Just picking back up on this, excuse the name change.

I've been working on a design to target both the P1 as well as the P2. I'm just waiting on parts to arrive for footprint verification before sending the board in for fab.

The cpld logic is pretty simple. 20 bit address counter, a few registers, some 1-4 decoders.. Should be REALLY fast
[attached image: cpld logic.png]

FLASH - NAND Memory IC 8Gb (512M x 16) Parallel
SRAM - Asynchronous Memory IC 16Mb (1M x 16) Parallel
CPLD - XC2C64A - 64 IO
1v8 reg for CPLD
footprint for SPI / QPI soic8

Rayman · 2018-12-04 18:16

That's an impressive looking design!
Looks like a lot of work...

If you wanted to take an easier path, check out the EVE2 chips by FTDI and also the newer version from Bridgetek.

jmg · 2018-12-04 18:55

Rayman wrote: »

That's an impressive looking design!
Looks like a lot of work...

If you wanted to take an easier path, check out the EVE2 chips by FTDI and also the newer version from Bridgetek.

True, but Eve2 do not use Prop's, and they climb in price - latest is $7.15/100.

I like the CPLD approach, and there is already a RaspPi board that does 128MHz SPI CPLD to LCD.
A SPI link can make sense, as sometimes the LCD is remote from the MCU, or you just want to save pins.

Worth checking where that board tops-out, and if there is room for a "P1/P2-SPI" CPLD version - ie low pin count.
P1 & P2 could do DDR SPI and Dual & QuadSPI to bump the bandwidth.
DualSPI is the highest bandwidth, with the common 4 pin SPI connection.

Rayman · 2018-12-04 19:52

EVE is a slave device, so you still need a controller, such as P1.
And, I just saw the newest version at Mouser for $4...

jmg · 2018-12-04 20:18

Rayman wrote: »

EVE is a slave device, so you still need a controller, such as P1.
And, I just saw the newest version at Mouser for $4...

Interesting, checking I see BT816
Mouser : 1 $4.7000 25 $4.5700 100 $4.4300 500 $4.3900
Digikey: 1 $9.6800 25 $7.9336 100 $7.1595 520 $5.9985
someone has made a mistake there I think ?

jmg · 2018-12-05 01:17

cheezus wrote: »

Just picking back up on this, excuse the name change. I've been working on a design to target both the P1 as well as the P2. I'm just waiting on parts to arrive for footprint verification before sending the board in for fab.

A new part you could consider supporting is the MUX-RAM from ISSI (shared AD & Q pins, internal address latch - sadly, no counter ?! )
IS61WV25616MEBLL x16 2.4~3.6V 10, 12ns -40 to 125°C TSOP2(44), BGA(48) Prod Muxed SRAM

This allows fewer CPLD pins, as you can time-multiplex Address and Data.

That would need a change of footprint, but it does move to the easier pitch of 0.8mm on the long edges.
