Propeller 1 and touchscreens, continued
average joe
After playing with a few different ways of connecting a display to the Propeller, I'm wondering what others think. Memory performance vs free pins vs glue logic? It seems to be a giant case of tradeoffs, and I'm perfectly happy with the prototype board I'm currently using (although I REALLY wish I had 4 QPI flash chips onboard).
If you really want the most free pins, it makes sense to me to run the display in 8-bit mode and use a couple of QPI chips: 12-14 pins used. If you want the fastest memory transfers (the current design), then use all the pins. I've gone back and forth on the next iteration of hardware I want to build and I'm hoping to find some consensus on an acceptable design. There was quite a bit of interest in the old design except for the no-free-pins part. I've had quite a bit of luck doing surface mount, although BGA is at the limit of my process as I have yet to try one. If I can find a consensus I could see a small run in the future.
I know in the Propeller world we are used to having our cake and eating it too, but let's try to be realistic here. So.
MINIMUM number of free pins? 4? 8? 12? 20? (this is going to be the biggest fight)
HOW much additional memory? 256k? 1m? 256m?
How fast do you need to access the memory?
Is the memory dedicated to the display or do you plan on using it for program data?
Price point?
Through hole for self assembly or pre-assembled SMT?
I think the most important part is to keep it noob-friendly. Some way to go from the simple toggle tutorial to the first "hello world".
Comments
The trickiest part for me was dealing with the two different resolution displays. I had one spin program that was able to run both but the filesystem needed some tweaks, along with the fact every program needed to be written for each display.
I want to ask a simple question about LCD drivers. If you tied RGB inputs HIGH, and only ran the VSYNC, HSYNC and CLK pins, does the LCD care that the RGB pins are fixed, and it will produce white for all pixels?
It is the size of the HUB. A decent screen size simply eats most of your HUB RAM, especially with more than 256 colors.
If I remember correctly the controller Ray is using has 1 MB ram on its own.
Enjoy!
Mike
For LCD and those resolutions, you could also look at FTDI EVE & EVE2
Or this RA8875 alternative to SSD1963
https://www.adafruit.com/products/1590
To confuse you with more choice, there is now HyperRAM (BGA but 1mm & 25 pins 6x8)
IS66WVH8M8ALL-166B1LI (1.8V) 8M x 8 RAM shows stock at Digikey, but going fast !
3V ones due next week.
Some timing details are ??, but this would take resolutions above 480 * 800.
Clock speed itself looks tolerant, but there is this unexpected detail
CS# must go high before tCSM is violated. tCSM is given as 1us or 4us ?
The forced CS# Hi time is quite short
Chip Select High Between Transactions tCSHI > 10.0 ns
Looks like a P1 timer in Duty can meet that, but the catch here is after every CS# cycle, you need to respecify the read address, = 13 edges before Rd Data appears
Two timers should manage LCD_CLK & HR_CK, and some nimble PASM could manage the address, for budgets something like :
Address & delays can be @ 20MHz, and possibly data can read at 40MHz, using just the NCO Clock from P1.
Taking a rough 25% dead time for Address and allowing time for auto-refresh windows
a 800x480 LCD scan would be around 25ms. (ie P1 Clocking the HyperRAM Pixel image)
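One way to arrive at a figure near 25ms is sketched below. It is only a rough check, and it leans on assumptions that are not spelled out in the post: 2 bytes per pixel (16b colour), one byte per data clock from an 8-bit HyperRAM at 40MHz, and the 25% dead time treated as a fraction of the total scan. The function name is made up for illustration.

```python
def scan_time_ms(width, height, bytes_per_pixel, bytes_per_second, dead_fraction):
    """Estimate the time to clock one frame out of external RAM.

    dead_fraction is the share of total time lost to address setup and
    refresh windows (assumed here, not taken from a datasheet).
    """
    payload_seconds = width * height * bytes_per_pixel / bytes_per_second
    return 1000.0 * payload_seconds / (1.0 - dead_fraction)


# 800x480, 16b pixels, 40 MB/s data phase, 25% dead time (all assumptions)
estimate = scan_time_ms(800, 480, 2, 40_000_000, 0.25)
print(f"{estimate:.1f} ms per frame")
```

Treating the dead time as 25% added on top of the payload instead gives about 24ms, so either reading lands close to the rough 25ms quoted.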
usually the driver is in the Cog and sometimes some font data. Screen data is usually in the Hub, and the Video driver can handle that from the Hub. But with lesser resolution or lesser color information per pixel
Alas 480 * 800 * 1 byte does not fit. You are already at 384 KB. So the Hub is too small for a Bitmap like that.
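To put numbers on that, here is a small sketch comparing bitmap sizes with the 32 KB of hub RAM on a Propeller 1. The helper names are invented for the example, and it ignores everything else that has to share the hub (code, stack, buffers), so real limits are tighter.

```python
HUB_RAM_BYTES = 32 * 1024  # Propeller 1 hub RAM


def framebuffer_bytes(width, height, bits_per_pixel):
    """Bytes needed for a packed bitmap, rounded up to a whole byte."""
    return (width * height * bits_per_pixel + 7) // 8


def fits_in_hub(width, height, bits_per_pixel):
    """True if the bitmap alone would fit in hub RAM."""
    return framebuffer_bytes(width, height, bits_per_pixel) <= HUB_RAM_BYTES


for w, h, bpp in [(800, 480, 8), (320, 240, 8), (320, 240, 1)]:
    size = framebuffer_bytes(w, h, bpp)
    print(f"{w}x{h} @ {bpp}bpp: {size} bytes, fits in hub: {fits_in_hub(w, h, bpp)}")
```

Only the 1 bit per pixel QVGA case fits, which is why lower resolution or fewer colours is the usual compromise.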
Enjoy!
Mike
This is a good read I just ran across today on Prop + RAM
https://www.parallax.com/sites/default/files/downloads/AN012-SRAM-v1.0.pdf
P1 can clock it at good speeds, and the LCD displays look to tolerate short gaps in the clocking.
That leaves width - probably simplest is 2 HyperRAM and a 16b LCD BUS.
For VGA, I think things get more complex, but P2 can probably manage the buffering and CLUT, and do that with a Single HyperRAM
The three solutions I've played around with would be dual propeller, or propeller plus cheap FPGA, or just do it all on an FPGA. The latter are now sub $20 and I've been building designs that have enough pins to run external touchscreen, external ram, vga, keyboard, sd card and have enough smarts to run CP/M.
The dual propeller option ends up with the address of the ram on one prop and the data bus on the other, which isn't really ideal as you need to coordinate things between the two props. Data is probably best sent in burst mode, eg set up address, and send 256 x 16 bits as a group. Essentially the second prop becomes just a counter, and it is a bit marginal to justify a $9 propeller chip plus all the support components, vs five TTL counters.
There certainly could be merit going back to our original design and doing it all in surface mount. Then it would be the same size as the touchscreen display.
Having found too many limitations with the propeller (actually, just one limitation, it doesn't have enough pins), I've drifted over to the FPGA world. A $20 cyclone II can do similar things to a propeller - sd, keyboard, vga, and ports and still have pins left over. Downside is having to use very old versions of programming languages in CP/M (though I kind of like the simplicity of 1980s C).
It would probably be a bit rude though to brainstorm a fpga touchscreen design on a propeller forum
Then again, it will be hard for the propeller to compete with the ever decreasing price of fpga chips, when the default for these chips is 70 to 100 pins.
The new low pin count RAMs turn all this on its head
The best pairing seems to be to use a single 8Mx8 3v3 HyperRAM, (or even 2 QuadSPI SRAM eg Microchip) and a modest FPGA in QFN48.
eg Lattice iCE5LP1K,iCE5LP2K,iCE5LP4K
The iCE5LP1K is $3.156 @ 50+, and the iCE5LP2K is $4.1032 @ 50+, with 39 io.
That maps as Data flow of 8b to HyperBUS and Prop, and 16b to LCD, plus sync lines 39-24 = up to 15 io for control lines.
This level of modest FPGA has a PLL and can support a CLUT, or can pass through Pixel info unchanged.
During Active Scan, the HyperRAM is controlled by the QFN48 part, and during pauses/blanking it can be loaded from Prop.
Prop can be preparing info to load during the scan times, or idling if there is nothing to do.
That pairing of parts would support well above 800x480 LCDs, and with some line buffering, could do VGA too.
EVE2 is another choice, tho that tops out at 800x480, and is LCD only (I think no VGA ?)
The way these touchscreens work is you do a whole lot of setting things up, but then when you send a burst of data it is as simple as
1) put 16 bits of data on the data bus (one word, which is pixel data RRRRRGGGGGGBBBBB)
2) toggle the write pin on the touchscreen
3) ? toggle a pin on the ram to send out the next word.
You set up the touchscreen so it knows in advance how many words you are going to send. If you can set up the ram chip so it also knows how many to send, then it becomes a very tight pasm loop.
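As an illustration of steps 1 and 2, the sketch below packs 8-bit colour channels into the RRRRRGGGGGGBBBBB word and models the burst as "load bus, strobe write" per word. It is a plain Python model, not PASM; the callback names `put_on_bus` and `strobe_write` are hypothetical stand-ins for the real pin operations, and step 3 (advancing the RAM) is left out because it depends on the memory chosen.

```python
def pack_rgb565(r, g, b):
    """Pack 8-bit R, G, B into one 16-bit word laid out RRRRRGGGGGGBBBBB."""
    return ((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3)


def burst_write(words, put_on_bus, strobe_write):
    """Send a pre-announced run of words: load the bus, then pulse write."""
    for word in words:
        put_on_bus(word & 0xFFFF)  # step 1: 16 bits on the data bus
        strobe_write()             # step 2: toggle the display's write pin


# Stand-in "hardware": record what the display would latch on each strobe.
latched = []
bus = {"value": 0}
burst_write(
    [pack_rgb565(255, 0, 0), pack_rgb565(0, 255, 0), pack_rgb565(0, 0, 255)],
    put_on_bus=lambda w: bus.update(value=w),
    strobe_write=lambda: latched.append(bus["value"]),
)
print([hex(w) for w in latched])
```

In the real loop the bus would be driven by the RAM rather than by the Prop, which is what makes the inner loop so short.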
So... maybe you don't even need the fpga? Maybe do this with some of the serial ram chips in parallel to create a 16 bit bus?
And... some of those 16 pins can be shared with other things, eg the SD card. Move a block of data into hub first, then move it out to ram.
Hmm - 16x1mbit srams gives '1 megapixels'. But doing some searches brings up newer 4mbit spi srams. These could be interesting especially if they come with 4 bits. Then could do this with 4 chips. Will do some searching...
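The capacity arithmetic can be checked with a short sketch. It assumes each chip drives its own lanes of the 16-bit pixel bus (1 bit per chip for plain SPI parts, 4 bits per chip for quad parts) and that "1mbit" means 2^20 bits; the function name is invented for the example.

```python
MBIT = 1 << 20  # assuming binary megabits


def pixels_stored(chip_bits, lanes_per_chip, chips, pixel_bits=16):
    """16-bit pixels held by `chips` parts sharing one parallel pixel bus."""
    if chips * lanes_per_chip != pixel_bits:
        raise ValueError("chips * lanes_per_chip must equal the pixel width")
    # Every chip supplies lanes_per_chip bits of each pixel word.
    return chip_bits // lanes_per_chip


print(pixels_stored(1 * MBIT, 1, 16))  # sixteen 1 Mbit parts, 1 bit each
print(pixels_stored(4 * MBIT, 4, 4))   # four 4 Mbit parts, 4 bits each
```

Both arrangements come out at the same pixel count, so the 4-chip version saves parts rather than adding capacity, and either holds an 800x480 frame with room to spare.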
Yes, that's another valid approach. Less total memory, and lower MHz, and cannot do a CLUT output, but uses simple repeating hardware.
(2 SPI RAMS are about the same price as the FPGA.)
Any links ? Are those SRAMs, or the more niche FRAM ?
if you don't need fast dynamic content displayed
there are those cheap ($6-7) SPI QVGA 320x240 displays which only need a few lines, have their own memory, so you don't need a big display buffer in RAM.
I used one with TACHYON and like them.
I'm still in love with this board (after doubling the amount of SRAM) although hindsight is a killer. The REV3 (aug2012) is perfect for using the dual 240 x 320 displays. Unfortunately when you plug a 7" display in it becomes a bit uncomfortable to use. The stacked SRAM chips don't help either. I'm thinking an 8bit transfer mode could save a bunch of pins. Addresses would have to be loaded with 3 transfers but that's not a big deal IMO.
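A minimal sketch of loading an address in 3 transfers over an 8-bit bus is below. The low-byte-first order and the function names are assumptions for illustration; the actual order would be whatever the address latches on the board expect.

```python
def split_address(addr, transfers=3):
    """Split an address into 8-bit chunks, low byte first (assumed order)."""
    if not 0 <= addr < (1 << (8 * transfers)):
        raise ValueError("address does not fit in the given number of transfers")
    return [(addr >> (8 * i)) & 0xFF for i in range(transfers)]


def join_address(parts):
    """Rebuild the address the way the latches would see it."""
    addr = 0
    for i, byte in enumerate(parts):
        addr |= (byte & 0xFF) << (8 * i)
    return addr


parts = split_address(0x0ABCDE)
print([hex(p) for p in parts], hex(join_address(parts)))
```

Three transfers cover up to 24 address bits, which is more than the 16Mbit of SRAM on the board needs.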
P1+CPLD is still my vote. I'm thinking you could get away with using 12-14 propeller pins total. The release of the P1v core makes it VERY tempting to just go the FPGA route...
I've always felt like the second propeller was overkill.
I've already played with it and it comes out TINY! Looks really nice too!
Maybe, but if it was based around the p1v?? The OBVIOUS answer is to stick a memory controller to portB that handles all the display and memory logic.
IIRC, one of your other designs used QPI parts in parallel. I did create a flash QPI module that worked pretty well as a SD card replacement. The SD card was the BIGGEST bottleneck (and uses 1/3 program memory right off the top)
I'll have to take a look at these. The simplest solution I was able to come up with is 2 QPI devices on an 8 bit bus. Leaves you a bunch of pins free.
Need to digest all this again but no one's hit on my biggest question. HOW MANY PINS? (yeah I know, all 64 right? lol) In all seriousness, REV3 has 4 pins FREE.. 31..28. You can get 2 more if you don't need audio. 4 more if you make the SD card use the 16b bus.
I think the last compact design I did used 8b data and 4b control. There was a ton of logic but it fit into a $3.80 XC2C64A - http://www.digikey.com/product-detail/en/xilinx-inc/XC2C64A-7VQG100C/122-1409-ND/949460
Or if you want to completely remove the LCD/MEM bus from the propeller bus the XC2C256 could fit the mess of logic needed. $15.45 http://www.digikey.com/product-detail/en/xilinx-inc/XC2C256-7VQG100C/122-1402-ND/949453
For my "pet project" REV3 is close. 16Mbit sram 4x 4m chips. The MAIN bottleneck was SD card. A FLASH "disk" should solve that problem. I'll have something prototyped next week.
http://www.mouser.com/ProductDetail/Alliance-Memory/AS6C4008-55PCN/?qs=E5c5%2bmu3i3%2bMOyro1Tlhzg==
If I can find a decent replacement for these SRAMs (smt, better price point) I think it would be a start. $5 a chip isn't bad but using 4 to get to 16Mbit seems a bit wasteful.
I've done exposed pad QFN, TQFP and the like with great success. BGA is a little scary but I might be willing to try it.
Yes, but even Std P1 can augment with a more modest FPGA than p1V needs.
The QFN48 parts I spec'd above are similar price to the 2C64A, but are smaller and open some more choices.
They have RAM, so you could do a CLUT to reduce the pixel load.
I think the HyperRAM may also need a pixel buffering scheme, as the spec does not quite cover streaming a full line at fixed clock spacing.
Probably ok for LCD, but not tolerated for VGA.
In that case, small RAM becomes mandatory, and simpler CPLDs no longer quite do it.
My only grumble about the QFN48 parts, is Lattice do not yet seem to have a 48 pin Eval Board. (?!)
HyperRAM and P1 :
I found this in the data :
" The host system may also effectively increase the tCSM value by explicitly taking responsibility for performing all refresh and doing burst refresh reading of multiple sequential rows in order to catch up on distributed refreshes missed by longer transactions. "
I think that opens up some simple arrangements, with some caveats :
* The HyperRAM can be addressed once only per frame, and then Stream Clocked (Linear mode). This Address once is extremely simple to configure.
* This refreshes the used locations, but others are not refreshed.
(ie you have one display, but it can be many sizes)
Looking at some LCD data (800*480), and the HyperRAM, I get numbers like this
This LCD has two modes, and DE mode is simpler = just one framing pin needed.
Shortest time Specs :
((800+88)*480+4*(800+88))/40M = 10.75 ms - rest of HyperRAM MAX of 64 ms is free for Write times. (40M LCD is 20M DDR clock from P1)
(or if meet right most LCD columns MAX time, for frame that gives 240tH times, is just inside HR limit of 64ms and a 10M scan rate gives 32.77% time free for WrBW, or 419136 P1 Opcodes)
My guess is 10.75ms & 64ms will be ok, which gives 83.20% of free time or 1065000 P1 Opcodes, for Display writes.
(for VGA, just the Frame time is available for Display Writes)
Memory used: P1 generates the LCD_DE framing pulse train, and HyperRAM clocks out ((800+88)*480+4*(800+88)) = 429792 locations, of which 800*480 are active, the rest are black.
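The 40M case above can be reproduced with the sketch below. It assumes 888 clocks per line (800 active + 88 blank), 484 lines per frame (480 active + 4 extra), the 64 ms HyperRAM limit, and a P1 running at 80MHz with 4 clocks per opcode (20M opcodes per second). That last figure is an assumption about how the opcode counts were derived, and the constant names are made up for the example.

```python
H_ACTIVE, H_BLANK = 800, 88       # clocks per line: active + blanking
V_ACTIVE, V_EXTRA = 480, 4        # lines per frame: active + extra
LCD_CLOCK_HZ = 40_000_000         # 40M LCD clock (20M DDR clock from P1)
REFRESH_LIMIT_S = 0.064           # HyperRAM MAX of 64 ms
P1_OPCODES_PER_S = 20_000_000     # assumed: 80MHz / 4 clocks per opcode

frame_clocks = (H_ACTIVE + H_BLANK) * (V_ACTIVE + V_EXTRA)
scan_s = frame_clocks / LCD_CLOCK_HZ
free_s = REFRESH_LIMIT_S - scan_s
free_fraction = free_s / REFRESH_LIMIT_S
free_opcodes = round(free_s * P1_OPCODES_PER_S)

print(f"locations clocked per frame: {frame_clocks}")
print(f"scan time: {scan_s * 1000:.2f} ms")
print(f"free for writes: {free_fraction * 100:.2f}% (~{free_opcodes} opcodes)")
```

The unrounded scan time is slightly under 10.75 ms, so the free share and opcode count come out a touch higher than the rounded 83.20% and 1065000.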
During HR Write LCD_DE is LOW, ignoring RWDS from writes.
During HR Read, Clock LCD is from HR_RWDS, fed to AUP1G57 XNOR doubler, (max Fdo 40MHz, generates =\_/= clocks from XNOR ) XNOR Not required for VGA.
P1 generates
LCD_DE
HR_CSN
HR_CLK
16b BUS is shared between LCD and P1 and HR
This would need a PCB with 2 x HyperRAM mounted, mapping to P1 pins & LCD connector, for 16b Pixels.
How would you stream a camera to LCD using the dual HR as the RGB source, since you can read and write at the same time? I am assuming for LCD you are speaking about stored data only, from SD>HR>LCD
I've not looked at cameras, but the DE mode for LCDs allows a single simple framing pulse. (A wide gap implies Frame)
Yes SD or Flash for Icons and Fonts etc
It's a single contiguous read starting at a certain address. During that time, one P1 COG is outputting a clock, and doing the simple DE modulation 800:88.
You need to write to the RAM, during those free time budgets to change areas like progress bars, button effects etc.
http://forums.parallax.com/discussion/161587/p1v-with-2mb-of-hub-visible-ram-and-now-32mb-of-sdram/p1
The cpld logic is pretty simple. 20 bit address counter, a few registers, some 1-4 decoders.. Should be REALLY fast
FLASH - NAND Memory IC 8Gb (512M x 16) Parallel
SRAM - Asynchronous Memory IC 16Mb (1M x 16) Parallel
CPLD - XC2C64A - 64 IO
1v8 reg for CPLD
footprint for SPI / QPI soic8
Looks like a lot of work...
If you wanted to take an easier path, check out the EVE2 chips by FTDI and also the newer version from Bridgetek.
True, but Eve2 do not use Prop's, and they climb in price - latest is $7.15/100.
I like the CPLD approach, and there is already a RaspPi board that does 128MHz SPI CPLD to LCD.
A SPI link can make sense, as sometimes the LCD is remote from the MCU, or you just want to save pins.
Worth checking where that board tops-out, and if there is room for a "P1/P2-SPI" CPLD version - ie low pin count.
P1 & P2 could do DDR SPI and Dual & QuadSPI to bump the bandwidth.
DualSPI is the highest bandwidth, with the common 4 pin SPI connection.
And, I just saw the newest version at Mouser for $4...
Interesting, checking I see BT816
Mouser : 1 $4.7000 25 $4.5700 100 $4.4300 500 $4.3900
Digikey: 1 $9.6800 25 $7.9336 100 $7.1595 520 $5.9985
someone has made a mistake there I think ?
A new part you could consider supporting is the MUX-RAM from ISSI (shared AD & Q pins, internal address latch - sadly, no counter ?! )
IS61WV25616MEBLL x16 2.4~3.6V 10, 12ns -40 to 125°C TSOP2(44), BGA(48) Prod Muxed SRAM
This allows fewer CPLD pins, as you can time-multiplex Address and Data.
That would need a change of footprint, but it does move to the easier pitch of 0.8mm on the long edges.