It's happy 3V HyperRAM day - stock at both Digikey and Mouser today

And they seem to have upped their next order to 500 pcs, so there should be proper stock down the track...


Comments

  • 31 Comments
  • Great! I wish we could have built the Prop123-A9 board with those instead of SDRAMs. It would have freed up a whole bunch of I/Os.
  • Yep, they're a good match, with DDR and 3V and low pin count and hidden refresh

    I was revisiting this design with fellow local propheads last week. Because HyperRAM seems to be in its infancy, I think it may make sense to make a twin-HyperRAM to TSSOP-54 adapter PCB, castellated, so it can be used with 16-bit-wide existing hardware designs
  • Tubular wrote: »
    .. Because HyperRAM seems to be in its infancy, I think it may make sense to make a twin-HyperRAM to TSSOP-54 adapter PCB, castellated, so it can be used with 16-bit-wide existing hardware designs
    The TSSOP-54 side could be a challenge...
    There is also talk of using 2 x HyperRAM for 16-bit LCD drive, where it should be possible to clock out an image in one burst, with some usage caveats.
    I also did a SCH for a DDR-to-LCD clock pulse using a SOT23 XNOR gate.

    That HW setup should work with P1 as well as P2.


  • Next stock at Digikey looked to be a couple months away.
    I just ordered 3, just to make sure I had a chance to try one soon...
    Prop Info and Apps: http://www.rayslogic.com/
  • I think I saw in the data sheet that two chips can share the same bus

    not sure when that would make sense though
  • Yanomani
    edited June 2016
    Hi Rayman

    Sure they can share the same bus, provided their chip select inputs (CS#) are connected to different master (processor) pins.

    RWDS pins, being I/Os, should be treated separately.

    As for the RSTO# and INT# outputs, if used, they should at least be combined by means of suitable dual-input gates before routing to any processor pin, or other use elsewhere.

    (In fact there is no need for the former procedure: both outputs are open-drain, so they can be wire-OR'ed. The existence or not of a weak internal pull-up should be determined from each device's datasheet.)


    The CK, CK# and RESET# inputs of each pair can share the same processor pins (CK# is not used in 3V devices).

    But, as clock frequency scales up, the whole effect of bus loading, length, capacitance, and ringing (to name a few) should be taken into account.

    Henrique

    Rayman wrote: »
    I think I saw in the data sheet that two chips can share the same bus

    not sure when that would make sense though
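Yanomani's answer above - separate CS# lines, everything else shared - amounts to using one address bit as the chip-select decoder. A minimal Python sketch of that idea (hypothetical; the 8 MB die size is taken from the ISSI 8M x 8 part discussed in this thread):

```python
# Hypothetical sketch: two HyperRAM dies share one bus; only their chip
# selects (CS0#, CS1#) differ. One flat address space maps across both.
DEVICE_SIZE = 8 * 1024 * 1024  # 8 MB per die (ISSI 8M x 8)

def select_chip(addr):
    """Return (cs_index, local_address) for a flat 16 MB address."""
    if not 0 <= addr < 2 * DEVICE_SIZE:
        raise ValueError("address outside the combined 16 MB space")
    return addr // DEVICE_SIZE, addr % DEVICE_SIZE

# The top address bit effectively becomes the CS# decoder:
assert select_chip(0x000000) == (0, 0x000000)
assert select_chip(0x800000) == (1, 0x000000)
```

Whether the deselected die can usefully self-refresh while its CS# sits high is exactly the question debated further down the thread.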

  • Rayman wrote: »
    I think I saw in the data sheet that two chips can share the same bus

    not sure when that would make sense though

    If you can refresh one, while you are writing to another... all on the same bus, it would make sense, but not sure
    if this is possible.

  • Tubular
    edited June 2016
    I think this is the diagram, from the HyperBus spec. It shows shared RWDS but separate CS lines.

    However I was thinking more of having two to achieve 16 bits, for driving LCDs etc.

    (attached image: HyperBus spec diagram, 730 x 753)
  • rjo__ wrote: »
    Rayman wrote: »
    I think I saw in the data sheet that two chips can share the same bus

    not sure when that would make sense though

    If you can refresh one, while you are writing to another... all on the same bus, it would make sense, but not sure
    if this is possible.

    The answer is Yes, you can access (read or write) one, while the other(s) is(are) being self-refreshed.

    During the time CS# is inactive (high), the not-selected HyperRAM chip(s) can perform its (their) self-refresh operation(s), unattended.


  • Yanomani wrote: »
    During the time CS# is inactive (high), the not-selected HyperRAM chip(s) can perform its (their) self-refresh operation(s), unattended.

    That's what I first thought, but more careful reading of the specs suggests that is not the case :

    The min CS# high time is very short, and it is the CS# low time that has a MAX (not the CS# high time); the timing suggests refresh is only possible in a short window of a few clocks after the address (before data), with a choice of
    a) always use that time (less jitter), or
    b) use RWDS to indicate when that time was actually stolen by refresh (better averages, but a pain to manage)

    Or, I think you can stream-clock the part and use the natural refresh of that area of RAM only, in which case 64 ms is your time budget.

  • Yanomani wrote: »
    rjo__ wrote: »
    Rayman wrote: »
    I think I saw in the data sheet that two chips can share the same bus

    not sure when that would make sense though

    If you can refresh one, while you are writing to another... all on the same bus, it would make sense, but not sure
    if this is possible.

    The answer is Yes, you can access (read or write) one, while the other(s) is(are) being self-refreshed.

    During the time CS# is inactive (high), the not-selected HyperRAM chip(s) can perform its (their) self-refresh operation(s), unattended.

    You gotta love it.

  • So let's go 4x4 and stop messing around.
  • jmg wrote: »
    Yanomani wrote: »
    During the time CS# is inactive (high), the not-selected HyperRAM chip(s) can perform its (their) self-refresh operation(s), unattended.

    That's what I first thought, but more careful reading of the specs suggests that is not the case :

    The min CS# high time is very short, and it is the CS# low time that has a MAX (not the CS# high time); the timing suggests refresh is only possible in a short window of a few clocks after the address (before data), with a choice of
    a) always use that time (less jitter), or
    b) use RWDS to indicate when that time was actually stolen by refresh (better averages, but a pain to manage)

    Or, I think you can stream-clock the part and use the natural refresh of that area of RAM only, in which case 64 ms is your time budget.

    Please, read the specs again.

    The extra latency that is "requested" by the slave (HyperRAM) device, by keeping RWDS high during the Command-Address phase of the interface, is used to warn the master (e.g. P1 or P2) that the internal self-refresh circuit requires an extra number of clocks (programmable, and can be made fixed to ease the design of the master interface protocol) before it can initiate the transaction specified during the Command-Address phase.

    During the time CS# is high, self-refresh can take place, freely.

    I'm using both HyperBus™ Specification Document Number: 001-99253 Rev. *C Revised May 09, 2016 from Cypress and ISSI 8M x 8 HyperRAM™ Advanced Information May 2015 - Rev.00C 05/01/2015 as references.

    Henrique
  • jmg
    edited June 2016
    Yanomani wrote: »
    During the time CS# is high, self-refresh can take place, freely.

    I cannot find that quote in my ISSI data sheet. Do you have an actual page number ?

    I can find this (p32)
    Chip Select High Between Transactions tCSHI >= 10.0ns
    Chip Select Maximum Low Time - Industrial Temperature tCSM <= 4.0 us
    Refresh Time tRFH = 40ns (4 clocks, or 8 edges)

    (p4)
    During the Command-Address (CA) part of a transaction, the memory will indicate whether an additional latency for a required refresh time (tRFH) is added to the initial latency; by driving the RWDS signal to the High state

    (p52)
    CR0.3 Fixed Latency Enable
    0 - Variable Initial Latency
    1 - Fixed Initial Latency (default)
  • jmg wrote: »
    Yanomani wrote: »
    During the time CS# is high, self-refresh can take place, freely.

    I cannot find that quote in my ISSI data sheet. Do you have an actual page number ?

    First reference: page 25


    "7.9 Hardware Reset

    The RESET# input provides a hardware method of returning the device to the standby state. During tRPH the device will draw ICC5 current.
    If RESET# continues to be held Low beyond tRPH, the device draws CMOS standby current (ICC4).
    While RESET# is Low (during tRP), and during tRPH, bus transactions are not allowed.

    A hardware reset will: cause the configuration registers to return to their default values, halt self-refresh operation while RESET# is low, and force the device to exit the Deep Power Down state.

    After RESET# returns High, the self-refresh operation will resume.

    Because self-refresh operation is stopped during RESET# Low, and the self-refresh row counter is reset to its default value, some rows may not be refreshed within the required array refresh interval per Table 5.6, Array Refresh Interval per Temperature on page 13.

    This may result in the loss of DRAM array data during or immediately following a hardware reset.

    The host system should assume DRAM array data is lost after a hardware reset and reload any required data."

  • Yanomani
    edited June 2016
    jmg

    The other reference, whose page number I'm also looking for, is the maximum burst length, to avoid problems with the refresh circuit, which is device- and temperature-dependent.

    Henrique

    P.S. You've got it, page 32.
  • Yanomani
    edited June 2016
    page 13

    However, the host system generally has better things to do than to periodically read every row in memory and keep track that each row is visited within the required refresh interval for the entire memory array.

    The HyperRAM family devices include self-refresh logic that will refresh rows automatically so that the host system is relieved of the need to refresh the memory.

    The automatic refresh of a row can only be done when the memory is not being actively read or written by the host system.

    The refresh logic waits for the end of any active read or write before doing a refresh, if a refresh is needed at that time.
    If a new read or write begins before the refresh is completed, the memory will drive RWDS high during the Command-Address period to indicate that an additional initial latency time is required at the start of the new access in order to allow the refresh operation to complete before starting the new access.

    I believe I've got the very sense of your doubt.

    There is not a specific mention of CS#, but it can be inferred that if you are not using the chip actively, then it is performing self-refresh, unless it is in a Reset state.

    And, in a dual (or even more) device situation, controlling whether each one has its CS# activated or not is a means to let the inactive ones do their self-refresh operations, while performing some operation on the other ones.

    Henrique
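The RWDS behaviour described in that page-13 quote reduces to a simple master-side rule, sketched below in Python (the 6-clock initial latency is an assumed, CR0-configurable figure, not a fixed property of every device):

```python
# Master-side latency rule for variable-latency mode: if the HyperRAM
# drives RWDS high during the Command-Address phase, a refresh is in
# flight and the initial latency must be applied twice.
INITIAL_LATENCY = 6  # clocks; assumed default, set via CR0

def latency_clocks(rwds_high_during_ca):
    """Clocks to wait between the CA phase and the first data transfer."""
    return 2 * INITIAL_LATENCY if rwds_high_during_ca else INITIAL_LATENCY

assert latency_clocks(False) == 6   # no refresh pending
assert latency_clocks(True) == 12   # refresh stole the window
```

In fixed-latency mode (CR0.3 = 1, the default) the master simply always waits the doubled figure and never has to sample RWDS during the CA phase.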
  • Yanomani wrote: »
    After RESET# returns High, the self-refresh operation will resume.

    That is not the same as your claim - that merely says that RESET# defeats any refresh action.

    Following the current envelope of your quote confirms what I said.
    On entry to Reset, ICC5 is 20mA for just 40ns, and then ICC4 is 135uA while CS#=H.
    Read Icc is also 20mA, so that indicates when refresh is occurring.

    Still looks to me like 40ns bites for refresh.

    I think they have chosen this somewhat less intuitive approach to allow CS#=H to power down, and allow clock gating, to save more system power.
  • Yanomani
    edited June 2016
    jmg

    Perhaps it's due to my poor understanding of the English language, but...

    page 14

    "The distributed refresh method requires that the host does not do burst transactions that are so long as to prevent the memory from doing the distributed refreshes when they are needed.
    This sets an upper limit on the length of read and write transactions so that the refresh logic can insert a refresh between transactions.

    This limit is called the CS# low maximum time (tCMS).

    The tCMS value is determined by the array refresh interval divided by the number of rows in the array, then reducing this calculation by half to ensure that a distributed refresh interval cannot be entirely missed by a maximum length host access starting immediately before a distributed refresh is needed.
    Because tCMS is set to half the required distributed refresh interval, any series of maximum length host accesses that delay refresh operations will be catching up on refresh operations at twice the rate required by the refresh interval divided by the number of rows. The host system is required to respect the tCMS value by ending each transaction before violating tCMS.
    This can be done by host memory controller logic splitting long transactions when reaching the tCMS limit, or by host system hardware or software not performing a single read or write transaction that would be longer than tCMS.
    As noted in Table 5.6, Array Refresh Interval per Temperature on page 13 the array refresh interval is longer at lower temperatures such that tCMS could be increased to allow longer transactions.
    The host system can either use the tCMS value from the table for the maximum operating temperature or, may determine the current operating temperature from a temperature sensor in the system in order to set a longer distributed refresh interval. The host system may also effectively increase the tCMS value by explicitly taking responsibility for performing all refresh and doing burst refresh reading of multiple sequential rows in order to catch up on distributed refreshes missed by longer transactions."

    Henrique
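The tCMS arithmetic in that quote is easy to check numerically. The figures below (64 ms array refresh interval, 8192 rows) are illustrative assumptions that happen to be plausible for the parts being discussed; take the real values from your device's datasheet:

```python
# tCSM derivation: array refresh interval divided by the row count gives
# one distributed-refresh slot; it is then halved so a maximum-length
# access cannot make the part miss a refresh slot entirely.
refresh_interval_s = 64e-3   # assumed array refresh interval
rows = 8192                  # assumed row count

per_row_s = refresh_interval_s / rows   # one distributed-refresh slot
t_csm_s = per_row_s / 2                 # the CS# low maximum time

assert abs(per_row_s - 7.8125e-6) < 1e-12   # ~7.8 us between refreshes
assert abs(t_csm_s - 3.90625e-6) < 1e-12    # ~3.9 us, consistent with the
                                            # 4.0 us tCSM quoted from p32
```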

  • Even the clock can be stopped while CS# is high, without causing loss of contents, due to the internally controlled self-refresh.

    page 5

    "The clock may stop in the idle state while CS# is High.

    The clock may also stop in the idle state for short periods while CS# is Low, as long as this does not cause a transaction to exceed the CS# maximum time low (tCSM) limit. This is referred to as Active Clock Stop mode. In some HyperBus devices this mode is used for power reduction. However, due to the relatively short tCSM period for completing each data transfer, the Active Clock Stop mode is generally not useful for power reduction but, may be used for short duration data flow control by the HyperBus master."
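Since tCSM caps how long CS# may stay low, a long transfer has to be split into bounded bursts, as the page-14 quote above says. A rough sketch of that bookkeeping (the 100 MHz CK and 12-clock command/latency overhead are illustrative assumptions, not datasheet figures):

```python
# Split a long transfer so no single CS#-low period can exceed tCSM.
def max_burst_bytes(ck_hz, t_csm_s, overhead_clocks=12):
    """Payload bytes per CS# period: clocks that fit inside tCSM, minus
    an assumed CA-phase + latency overhead, at 2 bytes/clock (DDR)."""
    payload_clocks = round(t_csm_s * ck_hz) - overhead_clocks
    return payload_clocks * 2

def split_burst(total_bytes, chunk_bytes):
    """Chunk lengths covering total_bytes without exceeding chunk_bytes."""
    return [min(chunk_bytes, total_bytes - off)
            for off in range(0, total_bytes, chunk_bytes)]

chunk = max_burst_bytes(100_000_000, 4e-6)    # 4 us tCSM at 100 MHz CK
assert chunk == 776                           # bytes per CS# period
assert sum(split_burst(4096, chunk)) == 4096  # nothing lost by splitting
```

Between those chunks CS# returns high, which is exactly the gap the refresh logic needs.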
  • jmg
    edited June 2016
    Yanomani wrote: »
    Perhaps it's due my poor understanding of the english language, but...

    No, the document is poorly written, and often contradictory. Your English is very good :)

    Taking the marketeeze from above

    "However, the host system generally has better things to do than to periodically read every row in memory and keep track that each row is visited within the required refresh interval for the entire memory array.

    The HyperRAM family devices include self-refresh logic that will refresh rows automatically so that the host system is relieved of the need to refresh the memory."


    Maybe in an ideal world users need do nothing more, but this actually means only that the user need not run a refresh counter.
    They do still need to budget for refresh, which is more obvious from the timing and Icc than from the text.

    Here comes that fish hook, in your paste :

    "page 14

    "The distributed refresh method requires that the host does not do burst transactions that are so long as to prevent the memory from doing the distributed refreshes when they are needed.
    This sets an upper limit on the length of read and write transactions so that the refresh logic can insert a refresh between transactions.

    This limit is called the CS# low maximum time (tCMS).


    The CS# high time is not tRFH, but much shorter. Perhaps that is a typo? I think not.

    There is also no clock shown during CS#=H, and I do not believe these parts are trying to multiplex between an on-chip clock and an external clock; the timing is way too tight to be doing that.

    That leaves needing a window when the CLK is running, to do Refresh, inside that 40ns tRFH.
    The timing diagrams do show exactly this.

    They are strictly correct, that "the refresh logic can insert a refresh between transactions", but it does that insertion after the address and before the data, when it has Active Clock edges that it can use.

  • Yanomani
    edited June 2016
    The CS# high time (tCSHI) is only used to distinguish between two consecutive interface transactions, as a marker. That's why it's so short (10 ns minimum).

    My English is what it is, thanks to Google translate! :smile:
  • jmg
    edited June 2016
    It would be nice if the part magically self-clocked during CS#=H and managed refresh totally invisibly, but that is not how I read the timings.
    I read that tRFH requires user clocks, and borrows/steals those 8 edges to insert refresh during the window between Address and Data.
    This is why their CS# low spec is so short: each iteration allows only 8 edges for refresh.

    When someone has a part connected, it will soon be clear where the refresh budget is needed :)

  • Yanomani
    edited June 2016
    Now from page 10 of the Cypress document (HyperBus spec):

    "Notes:
    1. A Row is a group of words relevant to the internal memory array structure and additional latency may be inserted by RWDS when crossing Row boundaries — this is device dependent behavior. Refer to each HyperBus device data sheet for additional information. Also, the number of Rows may be used in the calculation of a distributed refresh interval for HyperRAM memory.
    2. A Page is a 16-word (32-byte) length and aligned unit of device internal read or write access and additional latency may be inserted by RWDS when crossing Page boundaries — this is device dependent behavior. Refer to each HyperBus device data sheet for additional information. "

    tRFH is only mentioned in the ISSI document, as a means of indicating whether the internal self-refresh circuit is crossing a row or page boundary, so it needs those extra cycles to settle the ongoing internal refresh cycle before the intended command can be performed.

    And Yes, the part self clocks its internal refresh, unless it is put into a RESET# active state.

  • And if you program the device to use fixed latency intervals (ISSI, p12), even the decision of waiting (or not) for the self-refresh to settle can be avoided, but then data throughput will be affected.
  • Yanomani wrote: »
    2. A Page is a 16-word (32-byte) length and aligned unit of device internal read or write access and additional latency may be inserted by RWDS when crossing Page boundaries — this is device dependent behavior. Refer to each HyperBus device data sheet for additional information. "
    Yes, this is another question-mark area; the ISSI data is strangely silent on this issue, but if it does insert latency on boundaries, that will make VGA-type streaming much harder, though it will probably still be OK for LCD streaming.
    It will make P2 streaming more complicated, and may yet necessitate P2 needing DDR Clock In support, so RWDS can connect to control the Streamer.
  • Yanomani
    edited June 2016
    jmg

    Sorry for the late reply, but it was 03:25 a.m. and I was almost asleep, sitting on my chair.

    Since, by device construction specs, RWDS and DQ(7:0) change their values at the same time, during data read interface operations, one should not rely on RWDS as a clock to latch data, at the Master (P2 or P1) internal memory space.

    It (RWDS) should be seen as a byte qualifier signal instead, in its high and low states.

    The CK signal (assuming 3V devices) low-to-high and high-to-low transitions (hence DDR) must be used, inside master's state machine, to sample and hold incoming data.

    Focusing on P2 as a master: if you use one of its 160 MHz clock's edges to sample HyperRAM DQ(7:0) at the input-pin register level (I'm assuming there is one), the same edge must be used to toggle the 80 MHz clock that drives the HyperRAM CK pin. This way, the device (HyperRAM) will notice the completion of the ongoing read operation and start the next one, if any, changing DQ(7:0) and toggling RWDS accordingly.

    I'm not fully aware of the construction of the data path connecting P2's input pins and its internal memory bus, so I'm inferring the existence of at least one latch stage that is clocked by the internal 160 MHz clock; then, at the NEXT 160 MHz active transition, data that was latched on the previous transition will finally be written into P2 memory space.

    Henrique
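Henrique's description of sampling on both CK edges is the essence of DDR: two bytes move per clock period. A toy Python model of that pairing (the high-byte-first convention is purely illustrative, not something the datasheets mandate):

```python
# DDR toy model: one byte is captured on each CK edge, so successive
# edge captures pair up into 16-bit words and bandwidth is 2x the clock.
def words_from_edges(edge_bytes):
    """Pair bytes captured on successive CK edges into 16-bit words."""
    it = iter(edge_bytes)
    return [(hi << 8) | lo for hi, lo in zip(it, it)]

def bytes_per_second(ck_hz):
    """Peak byte rate of the 8-bit DDR bus: two transfers per CK period."""
    return 2 * ck_hz

assert words_from_edges([0x12, 0x34, 0xAB, 0xCD]) == [0x1234, 0xABCD]
assert bytes_per_second(80_000_000) == 160_000_000  # 80 MHz CK -> 160 MB/s
```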
  • Yanomani wrote: »
    Since, by device construction specs, RWDS and DQ(7:0) change their values at the same time, during data read interface operations, one should not rely on RWDS as a clock to latch data, at the Master (P2 or P1) internal memory space.
    Well, not without taking some care.
    However, RWDS does have a correct, and tight, time reference; you just need to skew it to some legal phase, and that depends on the Tsu/Th of your sampling circuit.

    You can use a simple XNOR scheme, to extract a safe clock from RWDS, for LCD Drive, as per my other posts.

    Yanomani wrote: »
    The CK signal (assuming 3V devices) low-to-high and high-to-low transitions (hence DDR) must be used, inside master's state machine, to sample and hold incoming data.
    Not quite; this is where phase matters, and propagation delays start to bite.

    The intention of the HyperRAM is that RWDS is used to qualify/time align the data out.
    Looking at the numbers

    CK transition to RWDS valid tCKDS 1~ 7 ns
    RWDS transition to DQ Valid tDSS -0.8 +0.8 ns
    RWDS transition to DQ Invalid tDSH -0.8 +0.8 ns

    You can see the tight PVT matching on RWDS and DQ, but the clock delay is much more open.
    To that 1~7ns you have to add any Prop pin delays/skews, which will be significant themselves.

    It is quite easy to end up with a sub 50MHz clock limit, maybe even sub 25MHz because of these variable delays.

    With FPGA SysCLKs limited to 80MHz and FPGA fabric being far faster than P2 process, it is not going to be easy to test these delay/skew related issues.
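jmg's skew concern can be put into rough numbers using the tCKDS/tDSS figures quoted above, plus an assumed 3 ns of controller pin skew (an illustrative guess, not a P1/P2 datasheet number):

```python
# Worst-case uncertainty between a CK edge and valid DQ, if sampling is
# timed from CK rather than from RWDS (all figures in ns).
t_ckds_max = 7.0   # CK transition -> RWDS valid, worst case (quoted above)
t_dss_max = 0.8    # RWDS -> DQ valid skew, worst case (quoted above)
pin_skew = 3.0     # ASSUMED controller pin/input-path skew

window_ns = t_ckds_max + t_dss_max + pin_skew   # total uncertainty window

# DDR gives only half a CK period per data eye, so the window roughly
# caps the usable clock frequency at:
max_ck_mhz = 1e3 / (2 * window_ns)

assert abs(window_ns - 10.8) < 1e-9
assert 40 < max_ck_mhz < 50   # consistent with the sub-50 MHz concern above
```

Tracking RWDS (suitably phase-shifted) instead of CK removes the 1~7 ns tCKDS term, which is why the parts provide it as the data strobe.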


  • Yanomani wrote: »
    Look at page 5
    "The clock may stop in the idle state while CS# is High."

    Not sure what point you were making, but I have since found this info, with waveforms:
    https://warmcat.com/embedded/hardware/lattice/hyperbus/hyperram/2016/09/19/hyperbus-implementation-tips-on-ice5.html

    There, the poster makes this observation

    "If you're not operating it at near its fastest specified clock rate (166MHz for my chip, when I am constrained to 64MHz by the FPGA), you can inform the Hyperram to use a smaller number of your slow clocks that meets Tacc. In my case, it only needs 3, the smallest settable number. So we save 3 or 6 clocks depending on if the chip asked for double wait or not.

    You can set this in b7..b4 of configuration register 0."

    i.e. for up to ~83 MHz, a Tacc of 3 CLKs is OK

    It is still going to be a challenge on P1 or P2 to get good bandwidths from HyperRAM.
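The CR0 fields mentioned in this thread (the latency code in b7..b4, Fixed Latency Enable in CR0.3) can be packed as below. The latency-code table and the 0x8F1F power-on default are taken from my reading of the ISSI/Cypress documents being discussed; treat them as assumptions and verify against your device's datasheet:

```python
# Pack HyperRAM Configuration Register 0: b7..b4 initial-latency code,
# b3 Fixed Latency Enable. Codes assumed from the datasheets cited here.
LATENCY_CODE = {3: 0b1110, 4: 0b1111, 5: 0b0000, 6: 0b0001}

def cr0_value(initial_latency_clocks, fixed_latency=True, base=0x8F1F):
    """Return CR0 with the latency fields replaced in the default word."""
    code = LATENCY_CODE[initial_latency_clocks]
    word = base & ~0x00F8                      # clear b7..b3
    word |= code << 4                          # b7..b4: latency code
    word |= (1 << 3) if fixed_latency else 0   # b3: fixed-latency enable
    return word

assert cr0_value(6, fixed_latency=True) == 0x8F1F    # the default word
assert cr0_value(3, fixed_latency=False) == 0x8FE7   # 3 clocks, variable
```

With the 3-clock setting, the saving of 3 (or 6) clocks per transaction described in the warmcat article follows directly.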