HyperRAM driver for P2

Comments

  • rogloh Posts: 2,707
    edited 2020-05-01 - 10:44:51
    Evanh, you can just use INA or INB for pin input and then test by ANDing with bit values.

    e.g. at startup time, when you hold a pushbutton switch down, you could select the timing you want to try out (switches might be active low, can't recall):
       if ina and $1
          timing := xxx
       elseif ina and $2 
          timing := yyy
    
       initDisplay(...., timing, ...)
    
  • rogloh wrote: »
    Evanh, you can just use INA or INB for pin input and then test by ANDING with bit values

    eg. at startup time when you hold a pushbutton switch down you could select the timing you want to try out: (switches might be active low, can't recall)
       if ina and $1
          timing := xxx
       elseif ina and $2 
          timing := yyy
    
       initDisplay(...., timing, ...)
    

    I think you want "ina & $1" (bit-wise AND) there, rather than "ina and $1" (logical AND).
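
    For reference, a minimal sketch of the same startup check using the bit-wise operator. TIMING_DEFAULT, TIMING_A and TIMING_B are hypothetical placeholders for the xxx/yyy values above, selectTiming is a made-up helper name, and the switches are assumed to read high when pressed (invert the tests if they turn out to be active low).

    ```spin2
    CON
      TIMING_DEFAULT = 0               ' hypothetical timing selectors, not from the driver
      TIMING_A       = 1
      TIMING_B       = 2

    PUB selectTiming() : timing
      timing := TIMING_DEFAULT         ' used when neither switch is held
      if ina & $1                      ' bit-wise AND: non-zero only when P0 reads high
        timing := TIMING_A
      elseif ina & $2                  ' non-zero only when P1 reads high
        timing := TIMING_B
    ```

    With the logical "and", both operands are first treated as true/false, so "ina and $2" would come out true whenever any pin on port A reads high, not just P1.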
  • Oops! Yeah you got me, too many beers drunk tonight already.
  • evanh Posts: 9,877
    edited 2020-05-01 - 11:08:10
    First thing I was wanting is a pause function, in milliseconds or similar.
  • rogloh Posts: 2,707
    edited 2020-05-01 - 11:10:25
    Use waitcnt(cnt + clkfreq) or waitcnt(cnt + _clkfreq), can't recall. This is a one second delay. For milliseconds just divide clkfreq by 1000 and multiply by your desired value.
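
    A minimal sketch of that millisecond pause, written with the Spin1-style waitcnt/cnt names used in this post. It assumes a compiler that accepts those names on the P2, which has not been checked here; pauseMs is a hypothetical helper name.

    ```spin2
    PUB pauseMs(ms) | ticksPerMs
      ticksPerMs := clkfreq / 1000          ' system clock ticks in one millisecond
      waitcnt(cnt + ticksPerMs * ms)        ' block until the counter reaches the target
    ```

    The product ticksPerMs * ms has to fit in 32 bits, so at a few hundred MHz this is only good for delays of a few seconds; longer pauses would need to loop.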
  • Ah, in the demo.spin2, I just worked out that screenbuf has to be big enough to hold the contents of features.txt. Otherwise, the static copy of features.txt gets corrupted and subsequent mode changes look a mess.

  • Some modes need a minimum v-blanking of 8 instead of 6.
  • Oh, suck, the big plasma TV doesn't even notice the prop2 exists. It just sits blank while I cycle the modes. I guess it is waiting for a complex negotiation or something. :(

  • evanh wrote: »
    Oh, suck, the big plasma TV doesn't even notice the prop2 exists. It just sits blank while I cycle the modes. I guess it is waiting for a complex negotiation or something. :(
    Might be the 5V pin on HDMI. Tubular noticed this too on one of his screens I think. He might have worked around it by strapping it through using the extra breakout board's pin, though I think it would be safer to use a resistor. I didn't feel too comfortable driving it directly but my own plasma didn't need it.
  • rogloh Posts: 2,707
    edited 2020-05-01 - 15:03:11
    evanh wrote: »
    Ah, in the demo.spin2, I just worked out that screenbuf has to be big enough to hold the contents of features.txt. Otherwise, the static copy of features.txt gets corrupted and subsequent mode changes look a mess.


    Yeah that demo is a bit lame. I think one of the demos I used to run might have been close to the actual binary limit, and if you exceed it bad things will happen. It's not difficult to overrun. They weren't designed to be especially robust, just something to get some sample test/gfx output displayed on screen.
  • Oh, okay, me goes digging ...
    4.2.7 +5V Power Signal
    The HDMI connector provides a pin allowing the Source to supply +5.0 Volts to the cable and Sink.

    All HDMI Sources shall assert the +5V Power signal whenever the Source is using the DDC or
    TMDS signals. The voltage driven by the Source shall be within the limits specified for TP1 voltage in Table 4-34. An HDMI Source shall have +5V Power signal over-current protection of no more than 0.5A.

    All HDMI Sources shall be able to supply a minimum of 55 mA to the +5V Power pin.

    A Sink shall not draw more than 50 mA of current from the +5V Power pin. When the Sink is powered on, it can draw no more than 10mA of current from the +5V Power signal. A Sink shall assume that any voltage within the range specified for TP2 voltage in Table 4-34 indicates that a Source is connected and applying power to the +5V Power signal.

    A Cable Assembly shall be able to supply a minimum of 50mA to the +5V Power pin to a Sink, even when connected to a Source supplying no more than 55mA.

    The return for the +5V Power signal is DDC/CEC Ground signal.
    Table 4-34 +5V Power Pin Voltage
    Item		Min		Max
    TP1 voltage	4.8 Volts	5.3 Volts
    TP2 voltage	4.7 Volts	5.3 Volts
    

    Okay, so the prop2 board should be supplying 5 V power while transmitting.

  • evanh Posts: 9,877
    edited 2020-05-01 - 15:51:52
    Ah, and the HDMI accessory board brings the 5 V track to J102 pin 2. So all it needs is a pull-up from the Eval board's 5 V on pin 11 of the accessory header ...

    Got a picture now ... Looks like only select modes supported though. :( And the list is small too I think! Massive contrast to the cheapo LCD TV I have in the project room.

  • Good to know that 5v resistor fixed things once again. I need to check whether it worked for @RJSM too
  • Got a picture now ... Looks like only select modes supported though. :( And the list is small too I think! Massive contrast to the cheapo LCD TV I have in the project room.

    Glad it is working. Yes, I've also found that branded consumer electronics gear can be more selective and tends to support a smaller subset of video modes. Probably they don't want a lot of support calls or returns when people ask why it isn't supporting their xyz device etc. Instead they just release something that is tested with a very limited set of input capabilities, which is hopefully at least documented somewhere in the manual.
  • evanh Posts: 9,877
    edited 2020-05-02 - 06:38:41
    That approach is much more likely to fail with user chosen device XYZ. No, it allows them to say supplier chosen UVW works. For everything else, go jump in the lake.

    EDIT: BTW: That behaviour is learned. It happens when big companies get their asses sued for petty issues. Like what happened to Apple not long ago where they added a power reducing feature to make the battery more usable after it naturally degrades. Some dumbass decided that was grounds for complaint. So now we end up with nerf'd products.

  • Continuing discussion here:
    rogloh wrote:
    In my case I am just sorting out the Spin driver at the moment figuring out a simple way to do the mapping of address ranges to a device and bus, which potentially allows management of multiple instances for systems with more than one Hyperbus in the future. The multiple buses capability itself might come later as I don't have the HW to test it, but I just don't want the API to have to fundamentally change...
    Tubular wrote: »
    I think Parallax are just about to build the next hyperram boards, I have an order waiting on that batch

    Perhaps we could persuade them to run a couple of "overruns" with dual Hyperram, as opposed to normal one flash and one hyperram

    That may be useful for some people but actually I was talking about two HyperRAM buses, each bus with its own COG that can manage the multiple shared devices on each bus. The existing board with HyperRAM and HyperFlash can be used to test the multiple devices on the same bus. I just need to play with a second HyperRAM board on the same P2-eval to gain the multiple HyperBuses, it can come later...
    Yanomani wrote:
    This could be really interesting, mainly if it also includes using variable-latency-aware HyperRams, unlike the ones that were assembled during the first batch.

    Besides the obvious capability of being able to improve access timings, it would enable experimenters to gather a better understanding of the self-refresh circuitry/logic operation, embodied into newer devices.

    Things like extending read/write access cycles way over the present 4 µs limit and, at the same time, being able to take full control over the mandatory/necessary refresh operations, could unleash the full potential of having a better integration between P2 and HyperRams.

    I will have the ability to override the 4 µs limit for those willing to try it, plus allow registers to be read/written to set up whatever special options it may allow. The refresh operation is not something that I plan to do in the driver at this point, though others could perhaps modify it or try to manage it independently at a higher level, with a background refresh request list for example, at the lowest COG or Round-Robin priority.
  • I have two Hyper accessory boards if you get an urge to try something. One currently has the 22 pF capacitor on it.
  • rogloh Posts: 2,707
    edited 2020-05-08 - 05:01:21
    evanh wrote: »
    I have two Hyper accessory boards if you get an urge to try something. One currently has the 22 pF capacitor on it.
    Good to know. Also when we are freed to travel/meet up again I can probably work with Lachlan/ozpropdev with trying this. They will have boards too.
  • Of course you might want to try 4....

    Out of interest how many cogs are taken up with what you're doing at the moment?
  • rogloh Posts: 2,707
    edited 2020-05-09 - 02:51:46
    Tubular wrote: »

    Out of interest how many cogs are taken up with what you're doing at the moment?

    It's still only one COG driver per HyperBus which will manage multiple devices on the same bus. The SPIN2 wrapper API is invoked by the client COG, but raw accesses can also be done directly from other non-SPIN COGs running PASM2 or other languages by direct mailbox interaction. These PASM clients will likely need to examine how the SPIN2 API does it and other documentation to understand the lower level formats required.

    Now in terms of this SPIN2 API I am trying to keep it as simple as I can to set up (which is not always easy). The API list is looking a bit like this below for now, but this may change further or be added to if I find it is insufficient or doesn't work out the way I want. Early feedback on this API is welcome and has a strong chance of influencing the outcome now if the suggestions make sense and can improve/simplify things:

    busId:= mem.create(busBasePin, frequency, flags)
    This API initially allocates a new external memory bus containing Hyper devices and returns a bus id, which is the data bus's pin group from 0-7, or -1 (or another negative error code) on error.
    - busBasePin is the base pin of the Hyper data bus (a multiple of 8)
    - frequency is the frequency that the P2 will be clocked at during driver operation - may differ from the current frequency if this gets changed before starting the COG - e.g., after a video driver starts
    - flag bit options may specify sysclk/1 vs sysclk/2 default read operation (or other non-default settings TBD)

    err:= mem.mapAddrDevice(startAddress, busId, deviceType, deviceSize, clkPin, rwdsPin, csPin, resetPin, maxBurst, flags)
    This API maps a 16MB block (out of the 4GB address space) to a particular device/size and HyperBus using the given pins. It requires an already created bus so it knows the frequency and appropriate burst settings to apply for the device. Returns 0 on success or -1 (or another negative error code) on error.
    - startAddress is the address whose upper 8 bits define the 16MB block to be mapped. The lowest 24 bits are ignored.
    - busId defines which HyperBus to add this device to
    - deviceType defines whether HyperFlash or HyperRAM - later may include v2 device HyperRAM if different
    - deviceSize defines size of device in multiples of 16MB
    - clkPin, rwdsPin, csPin, resetPin define control pins for device (-1 for argument when no resetPin control)
    - maxBurst is the maximum transfer burst size allowed in bytes while CS is low, (-1 means let driver automatically decide on the burst based on a known device/streamer limit and operating frequency)
    - flags are options if required (TBD)
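
    To illustrate the address split described above, a small sketch follows. blockIndex and blockOffset are hypothetical helper names for illustration only, not part of the proposed API.

    ```spin2
    PUB blockIndex(addr) : index
      index := addr >> 24                   ' upper 8 bits: which of the 256 16MB blocks

    PUB blockOffset(addr) : offset
      offset := addr & $00FF_FFFF           ' lower 24 bits: byte offset inside that block
    ```

    For an address such as $8000_1234 this gives block $80 and offset $1234, so mapping with startAddress $8000_0000 or $8000_1234 would select the same block.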

    cogId:= mem.start(busId)
    or
    cogId:= mem.startExplicit(busId, cog)
    These APIs start up a Hyper driver COG controlling the HyperBus which has already had some device(s) mapped. It returns the cog ID of driver COG spawned or -1 if an error occurs (or other negative error codes).
    - busId is the HyperBus to start
    - cog is the cog id to use when starting explicitly, otherwise one will be automatically allocated when the driver is spawned

    err:= mem.stop(busId)
    API to shutdown a bus and stop its COG driver.

    err:= mem.enableCog(busId, cog, maxBurstLimit, pollPriority, notifyATN, lockedTransfers)
    This API enables one or more COGs' mailboxes to be polled to let them access the bus devices. Returns 0 on success, -1 on error (or other negative error codes). Limiting the number of COGs that are polled to just the set required reduces polling latency.
    - busId refers to a given Hyper Bus driver
    - cog is id of the COG to enable (or -1 for all COGs)
    - maxBurstLimit - controls the number of bytes a COG can transfer while CS is low (this also controls the latency for the highest priority COGs to get serviced), and -1 means to just use the device's or streamer limit
    - pollPriority, 0-7 priority associated with this COG in the driver's poller (-1 for round-robin COGs)
    - notifyATN - flag set true if COG is to also be notified by COGATN upon request completion
    - lockedTransfers - flag set true to keep the entire COG's transfer requests atomic and not separated by polling other COGs during the broken up bursts. This potentially affects video quality and should be enabled with care. Usually it would only be enabled on the video COG, for example.

    err:= mem.disableCog(busId, cog)
    This API removes a COG from having its mailbox polled by a Hyper driver on a given bus.

    val8 := mem.readByte(addr)
    val16 := mem.readWord(addr)
    val32 := mem.readLong(addr)
    err := mem.readBurst(dstHubAddr, addr, len)
    Primary read APIs - returns data or 0 for success, or error code(s) if unprovisioned memory address

    err := mem.writeByte(addr, val8)
    err := mem.writeWord(addr, val16)
    err := mem.writeLong(addr, val32)
    err := mem.writeBytes(addr, val8, count)
    err := mem.writeWords(addr, val16, count)
    err := mem.writeLongs(addr, val32, count)
    err := mem.writeBurst(addr, srcHubAddr, len)
    Primary write APIs - returns 0 or error code(s) if unprovisioned memory address

    err := mem.copy(dstAddr, srcAddr, totalBytes, hubBufAddr, hubBufSize)
    API to copy bytes from one external memory area to another external memory area (via an intermediate HUB RAM buffer). Does not check for overlapping transfers - it can only increment addresses during the copy. Copying data from one HyperBus device to another device on another bus may eventually also be supported by this API when ready (TBD).
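
    As a rough picture of what "via an intermediate HUB RAM buffer" means, the copy could be built from the burst calls listed above along these lines. This is only a sketch under assumptions, not the driver's implementation: copyChunked is a hypothetical name, errors are assumed to come back as negative values, and the object name "hyperdrv" is borrowed from the usage example further down.

    ```spin2
    OBJ
      mem : "hyperdrv"                       ' driver object name taken from the later example

    PUB copyChunked(dstAddr, srcAddr, totalBytes, hubBufAddr, hubBufSize) : err | chunk
      if hubBufSize < 1                      ' a zero-sized buffer could never make progress
        return -1
      repeat while totalBytes > 0
        chunk := totalBytes <# hubBufSize    ' move at most one hub buffer's worth per pass
        err := mem.readBurst(hubBufAddr, srcAddr, chunk)   ' external memory -> hub buffer
        if err < 0
          return
        err := mem.writeBurst(dstAddr, hubBufAddr, chunk)  ' hub buffer -> external memory
        if err < 0
          return
        srcAddr += chunk                     ' addresses only ever increment
        dstAddr += chunk
        totalBytes -= chunk
    ```

    Because the addresses only increment, a destination that overlaps the source at a higher address would read back data that this loop has already overwritten, which is the overlap caveat mentioned above.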

    val16 := mem.readReg(devaddr, addrlo, addrhi)
    err := mem.writeReg(devaddr, addrlo, addrhi, val16)
    APIs to access internal device registers, bus/device is determined from devAddr

    err:= mem.gfxRead(dstHubAddr, addr, dstPitch, srcPitch, width, height)
    err:= mem.gfxWrite(addr, srcHubAddr, dstPitch, srcPitch, width, height)
    err:= mem.gfxCopy(dstAddr, srcAddr, dstPitch, srcPitch, width, height, hubBufAddr)
    err:= mem.gfxByteFill(addr, val8, pitch, width, height)
    err:= mem.gfxWordFill(addr, val16, pitch, width, height)
    err:= mem.gfxLongFill(addr, val32, pitch, width, height)
    APIs for graphics memory related transfers. The gfxCopy must be between devices on the same bus.
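
    The post does not define pitch. Assuming it means the number of bytes from the start of one row to the start of the next, a byte fill of a rectangle can be pictured as one writeBytes call per row. fillRectBytes is a hypothetical name, negative return values are assumed to signal errors, and the real gfxByteFill may well work differently inside the driver.

    ```spin2
    OBJ
      mem : "hyperdrv"                       ' driver object name taken from the later example

    PUB fillRectBytes(addr, val8, pitch, width, height) : err
      repeat height                          ' one pass per row, zero passes if height is 0
        err := mem.writeBytes(addr, val8, width)   ' fill "width" bytes of this row
        if err < 0
          return
        addr += pitch                        ' step to the same column in the next row
    ```

    With a framebuffer that is wider than the rectangle, pitch stays at the full line length while width covers only the filled part.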

    err:= mem.exec(bus, listPtr)
    API for initiating a request list. The actual raw list item population APIs are TBD or may be up to the client to setup natively for themselves.

    err:= mem.setDeviceLatency(devaddr, latency)
    err:= mem.setDeviceBurst(devaddr, burst, delay)
    APIs to set up a device's Latency & Burst sizes and input read delay timing (TBD)

    val32:= mem.getDeviceLatency(devaddr)
    val32:= mem.getDeviceBurst(devaddr)
    APIs to read a device's current Latency & Burst sizes (TBD)

    Here's an example of what things may look like when using this driver. For simplicity it is not testing for returned errors fully etc, but hopefully still gives the basic idea of how to start to use it.
    CON
       hyperBase= 32   ' base data bus pin of HyperRAM board/device
       hyperMem = $80000000 ' start address where 16MB HyperRAM is mapped to
    
    OBJ
       mem: "hyperdrv"
    
    PUB demo | val, x
        initMemory(hyperMem, hyperBase)
        repeat x from 0 to 1000
           mem.writeByte(hyperMem+x, x) ' only the low 8 bits of x are stored
        repeat x from 0 to 1000
           val:= mem.readByte(hyperMem+x)
           if val <> (x & $FF) ' compare against the stored byte; light an LED on data mismatch
              dirb |= 1<<(56-32) ' outb pin value assumed already 0
        repeat ' loop forever so the LED state stays visible
    
    PUB initMemory(addr, base) | bus, driverCog
       bus := mem.create(base, _clkfreq, 0)  
       if bus < 0 
          return -1
       if mem.mapAddrDevice(addr, bus, mem#HYPERRAM, 1, base+9, base+11, base+13, base+15, -1, 0) < 0
          return -1
       driverCog := mem.start(bus)   
       mem.enableCog(bus, cogid, -1, 0, 0, 0)
       return driverCog 
    
  • VonSzarvas Posts: 2,047
    edited 2020-05-13 - 13:05:50
    @evanh

    Sorry this took so long to get back to. Here's a couple results that look rather different.

    The significantly lower error counts are with the trace-matched layout.

    As it's been a while, are there any other/newer useful tests I could run ?


    Edit: Added P32. Wow, that seems to bite by comparison!
  • rogloh Posts: 2,707
    edited 2020-05-14 - 02:45:17
    I'm surprised there were no working frequencies with the P32 base pin to write to HyperRAM if you have matched trace lengths. If this test with P32 is repeated with data pins unregistered, do you also see the same thing?

    Update: Actually if these writes were always at sysclk/1, perhaps that means the transitions between data and clock are too close (too well matched, not enough setup/hold delay) and so always corrupt the writes. Sysclk/2 should not show the same result or there are other problems. The 22pF capacitor approach evanh tried may help too at sysclk/1.
  • Yep, for HR writes at sysclock/1, the capacitor is needed on the HR clock pin. Once that's in place the bad frequencies, unlike with reads, all go away. I note that all three runs have some bad spots.

    The reason the P32 base pin is worse will be that one or two of the data pins (P32-P39) are slower, making for too small a data setup time before the clock arrives.
  • One of the timing problems to overcome with the accessory board is that the combination of both HyperRAM and HyperFlash chips load the data pins more than the clock pin. This means there is longer slew time on the data pins. Conveniently, having the capacitor on the clock pin actually corrects this discrepancy. It's the main reason I settled on 22 pF. The slope looked best matched on the scope then.

    Von,
    Add the capacitor to the accessory board and run the tests again. Photo - https://forums.parallax.com/discussion/comment/1493539/#Comment_1493539
    And matched slope snapshot - https://forums.parallax.com/discussion/comment/1493280/#Comment_1493280

  • evanh wrote: »
    Yep, for HR writes at sysclock/1, the capacitor is needed on the HR clock pin. Once that's in place the bad frequencies, unlike with reads, all go away. I note that all three runs have some bad spots.

    The reason the P32 base pin is worse will be that one or two of the data pins (P32-P39) are slower, making for too small a data setup time before the clock arrives.

    Do you think this is a problem with the P2 chip's pin to pin timing differences, the P2-EVAL board signal layout, or the HyperRAM board signal layout? i.e. what part of this is potentially under our control?
  • VonSzarvas wrote: »
    As it's been a while, are there any other/newer useful tests I could run ?
    I haven't done anything new since that program. And that looks to be up to date with the extra config info printing.

  • Adding the 22pF now.... posting soon !

    In the meantime, just to be clear that of those 3 test results I posted, only the P0 test was trace-matched. The other 2 (p16 and p32) were on an original Eval board layout which did not have trace matching lengths. P16 seems a lot better (if I understood the results well). There was a header that had better matching than the others, and it may have been P16. But none had deliberate matching on the original Eval board.
  • Also for @rogloh
    The HyperRAM board was trace matched as far as the first Hyper chip. And from memory was very closely matched from the 1st to 2nd Hyper chip, but not exactly matched.
  • VonSzarvas Posts: 2,047
    edited 2020-05-14 - 10:07:52
    Here goes, Base pins 0,16,32 each with an 0603 22pF across IO+8 to GND.

    Same setup as last time, same P2 and HyperRam board.
    0 has trace matching, 16 and 32 nothing special.


    It seems the 22pF compensates P16 and P32 for the lack of trace matching pretty darn well.
    With P0 the 22pF cleans away all the small errors all the way up to 360 MHz. Nice! Above 360 MHz the capacitor impacts the results negatively.

    As I have this set up, I'll go try a smaller cap with the P0 test.
  • rogloh Posts: 2,707
    edited 2020-05-14 - 10:40:14
    Interesting there was one single error observed at 256MHz on P32 even with the capacitor. Maybe some brief random noise spike?