HyperRam/Flash as VGA screen buffer (Now XGA, 720p & 1080p)


Comments

  • I don't think so. If this doesn't work for HyperRam too, I'll probably try that...
    Prop Info and Apps: http://www.rayslogic.com/
  • RaymanRayman Posts: 9,716
    edited 2019-04-14 - 21:18:00
    Ok, I went back to the HyperRam 8bpp VGA demo and verified that the data was actually starting not at the first byte of each row, but at the fifth byte.
    I was able to switch from waitse1 to waitx #38 on read and get a stable image at a 250 MHz clock. The write latency clocks were adjusted so that the data starts at the first byte of each row.
  • RaymanRayman Posts: 9,716
    edited 2019-04-15 - 20:21:15
    Finally! This took a lot of head scratching, but finally have the VGA bird streaming from HyperFlash.

    There is very little difference between this and the HyperRam version in terms of the reading and display routines.
    Writing is a bit different though. There are two programs here:

    "HyperFlash_Test2d.spin2" loads the bitmap bits into flash. It stores 320 bytes in each 1024-byte row (the easiest way).
    "HyperVGA_640_x_480_8bpp_1p_Flash" is the modified version of the HyperRam VGA code. I've removed the writing portion and changed it to do two reads per line instead of just one. I also moved the location of the latency waitx.
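As a rough illustration of the storage scheme described in the post above (320 bytes per 1024-byte row, two reads per 640-pixel line), the start address of each read could be computed as below. This is only a sketch: the base address of 0, the assumption that rows are used consecutively, and the names `chunk_address` and `flash_bytes_spanned` are hypothetical and are not taken from the posted Spin2 programs.

```python
# Sketch of the "320 bytes per 1024-byte row" layout described in the post.
# ASSUMPTIONS (not from the posted code): rows are used consecutively from
# a base address of 0, and each display line uses two adjacent rows.

ROW_BYTES = 1024        # row size quoted in the post
CHUNK_BYTES = 320       # image bytes stored in each row
WIDTH, HEIGHT = 640, 480
READS_PER_LINE = WIDTH // CHUNK_BYTES   # 640 / 320 = 2 reads per line


def chunk_address(line, half, base=0):
    """Byte address where read number `half` (0 or 1) of `line` starts."""
    if not (0 <= line < HEIGHT and 0 <= half < READS_PER_LINE):
        raise ValueError("line or half out of range")
    return base + (line * READS_PER_LINE + half) * ROW_BYTES


def flash_bytes_spanned():
    """Address range covered by the image, including the unused row tails."""
    return HEIGHT * READS_PER_LINE * ROW_BYTES
```

Under these assumptions the 307,200-byte image spans 960 rows of flash, so simplicity is bought with roughly two thirds of each row left unused.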
  • jmgjmg Posts: 13,920
    edited 2019-04-15 - 19:51:42
    Rayman wrote: »
    Finally! This took a lot of head scratching, but finally have the VGA bird streaming from HyperFlash.
    Good progress :)
    The code appears to drive RWDS from the P2, but the datasheet says it is a Flash output pin?
    "The Read Data Strobe (RWDS) is an output from the HyperFlash device that indicates when data is being transferred from the memory to the host. RWDS is referenced to the rising and falling edges of CK during the data transfer portion of read operations."
    Is this using clocked IO mode in the pin-config yet ?

  • RaymanRayman Posts: 9,716
    edited 2019-04-15 - 20:20:52
    I don't think I'm using RWDS at all any more... Well, I think I am for HyperRam write, to specify latency, but I think it could just be tied to ground.

    Also, I read somewhere that you don't need the Reset pin.

    Not using clocked IO.
  • YanomaniYanomani Posts: 882
    edited 2019-04-15 - 21:29:10
    Hi Rayman

    You are certainly bringing some good news from the HyperThings frontier. Another breakthrough!

    I also tend to agree with kwinn's earlier comment about bringing such a nightmare-like interface to a new product. If I ever have to choose a dance, a polka is way funnier than a minuet, but... :lol:

    But this also adds to your victory, along with the determination you've shown, shaping the steel until the katana is straight and sharp.

    Congratulations
  • Looking closer into the datasheet, the latency crossing a page boundary (page=16 words=32 bytes) is fairly complicated....

    But, I was completely ignoring this and have no gaps in data.
    It appears that if I start a read on a word address with 3 lower bits all zero, then I can read the entire device in one operation with no latency gaps in the data.

    This could make VGA display much simpler than with HyperRam. With HyperRam, I believe there is a latency when crossing a 1024-byte row boundary. And you can't keep CS low for more than around 640 bytes.

    Here, I think I can keep CS low during the whole display period.
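The start-address rule reported in the post above (a word address whose 3 lower bits are all zero) can be written as a small check. This is a sketch of that empirical rule only, not of the datasheet's page-crossing latency table; the function names are hypothetical, and the 2-bytes-per-word figure comes from the "16 words = 32 bytes" page size quoted in the post.

```python
# Sketch of the empirical rule from the post: start a linear read on a
# word address whose 3 lower bits are zero. Names here are hypothetical.

WORD_BYTES = 2          # from "page = 16 words = 32 bytes"
ALIGN_MASK = 0b111      # the 3 lower bits of the word address


def is_aligned_start(word_addr):
    """True if the word address has its 3 lower bits all zero."""
    return (word_addr & ALIGN_MASK) == 0


def align_down(word_addr):
    """Nearest word address at or below word_addr that satisfies the rule."""
    return word_addr & ~ALIGN_MASK


def skipped_bytes(word_addr):
    """Extra bytes read (and discarded) when starting from align_down()."""
    return (word_addr - align_down(word_addr)) * WORD_BYTES
```

If a display routine needs data from an unaligned address, one option under this rule would be to start the burst at `align_down(addr)` and discard the first `skipped_bytes(addr)` bytes.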
  • jmgjmg Posts: 13,920
    Rayman wrote: »
    Looking closer into the datasheet, the latency crossing a page boundary (page=16 words=32 bytes) is fairly complicated....

    But, I was completely ignoring this and have no gaps in data.
    It appears that if I start a read on a word address with 3 lower bits all zero, then I can read the entire device in one operation with no latency gaps in the data.

    This could make VGA display much simpler than with HyperRam...

    Here, I think I can keep CS low during the whole display period.

    I think that's related to fractional page crossings. As your tests indicate, larger blocks are OK, and I think the issue is that there has to be enough time to read the next page.
    The example they give has only 9 clocks on the first page for a 12-clock latency, so it has to add another 3. I think that also means that if you started more than 12 bytes before the page boundary, with a latency setting of 12, you would be OK.
    Word_xx_000 would be a 16-byte lead-in, so that should be OK as a linear page crossing.
    i.e. your rule of a word address with the 3 lower bits all zero is then enough to meet this:


    "When configured in linear burst mode, while a page is being burst out, the device will automatically fetch the next sequential page from the MirrorBit flash memory array. This simultaneous burst output while fetching from the array allows for a linear sequential burst operation that can provide a sustained output of 333 MB/s data rate"

    Rayman wrote: »
    .. With HyperRam, I believe there is a latency when crossing a 1024-byte row boundary. And, you can't keep CS low for more than around 640 bytes.

    That 640 looks to be dictated by refresh, and if you did a manual refresh during frame flyback, that limit may relax?
    Based on that 640, it looks like the refresh counter increments on (or near) CS=\_, then refreshes one row during the set-address part of the cycles, and needs to scan the whole RAM (there is no option to partial-scan to save time), so that would mandate so many CS/clks sets to keep inside the overall refresh milliseconds max. That's simple HW, with no extra clock generation needed.

    Another way to 'hurry-along' the refresh counter, inside that overall milliseconds time ceiling, could be to issue a few CS =\__/== with some minimum clocks contained.
    e.g. 3 short 'advance refresh' bursts in a flyback, followed by a full, real set-address-read, would advance the refresh counter 4 per line, and that might allow you to read 640*4 = 2560 bytes in that 4th CS window?
    A smart pin might be able to do those 'advance refresh' bursts quite quickly, as address info is 'don't care'
  • YanomaniYanomani Posts: 882
    edited 2019-04-16 - 23:45:58
    Hi Rayman

    Have you ever tried to leverage the fact that 640 = 512 + 128?

    That way, each image you are using at 640 x 480 resolution, which occupies 307200 bytes (or exactly 300 x 1024 bytes), can be laid out in two blocks within HyperFlash's address space.

    The first block, starting at, say, address 00000H, will hold 480 x 512 = 245760 bytes (240 kB), right through address 3BFFFH; the second block would start at address 3C000H, holding the remaining 480 x 128 = 61440 bytes (60 kB).

    There would no longer be any address boundary crossing, either during Write Buffer Line (512 bytes) write operations or during reads.

    Since you are using HyperFlash, you don't need to worry about self-refresh limits when accessing the data blocks. Thus the longest period of Hyper_CSn = Low (256 + 24 = 280 clocks) would be limited to 4.480 µs, and the shortest (64 + 24 = 88 clocks) would take 1.408 µs, totaling 5.888 µs, or 147.2 pixel clocks (at 25 MHz per pixel). (The value of 24 should be enough to account for 16 latency clocks, plus the time to do the CA-phase transfer and some spare cycles to ensure sane limits for switching Hyper_CSn and Hyper_CK within the specified timings.)

    Since the HFront Porch (16) + HSync Pulse (96) + HBack Porch (48) totals 160 pixel clocks (6.4 µs), I believe the two read operations needed to transfer each horizontal line under such conditions could fit within those limits, leaving the whole horizontal display period free for any other activities one would intend to do.

    Hope it helps a bit

    Henrique
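The two-block layout and the blanking-time budget from the post above can be checked with a few lines of arithmetic. This is a sketch under stated assumptions: the 16 ns HyperBus clock period is not given explicitly in the post but is implied by its figures (280 clocks = 4.480 µs), and all function and constant names are hypothetical.

```python
# Sketch of the 512 + 128 split and its timing budget, using the figures
# quoted in the post. ASSUMPTION: a 16 ns HyperBus clock period, implied
# by "280 clocks = 4.480 us"; names are hypothetical.

LINES = 480
BLOCK_A_WIDTH, BLOCK_B_WIDTH = 512, 128           # 640 = 512 + 128
BLOCK_A_BASE = 0x00000
BLOCK_B_BASE = BLOCK_A_BASE + LINES * BLOCK_A_WIDTH   # 0x3C000

OVERHEAD_CLOCKS = 24      # latency + CA phase + spare cycles, per the post
HYPER_CLOCK_NS = 16       # assumed: 4480 ns / 280 clocks
PIXEL_CLOCK_NS = 40       # 25 MHz pixel clock
BLANKING_PIXELS = 16 + 96 + 48   # front porch + sync + back porch


def line_reads(line):
    """(start_address, length) of the two bursts needed for one line."""
    return [
        (BLOCK_A_BASE + line * BLOCK_A_WIDTH, BLOCK_A_WIDTH),
        (BLOCK_B_BASE + line * BLOCK_B_WIDTH, BLOCK_B_WIDTH),
    ]


def burst_ns(nbytes):
    """Duration of one burst: 2 bytes per clock (DDR) plus fixed overhead."""
    return (nbytes // 2 + OVERHEAD_CLOCKS) * HYPER_CLOCK_NS


def crosses_512_boundary(start, length):
    """True if a burst spans more than one 512-byte buffer line."""
    return start // 512 != (start + length - 1) // 512


total_ns = burst_ns(BLOCK_A_WIDTH) + burst_ns(BLOCK_B_WIDTH)
total_pixel_clocks = total_ns / PIXEL_CLOCK_NS
fits_in_blanking = total_pixel_clocks <= BLANKING_PIXELS
```

With these numbers both bursts for a line take 5888 ns, which is less than the 6400 ns of horizontal blanking, and no burst for any of the 480 lines crosses a 512-byte boundary.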
