@Wuerfel_21 said:
Anyways, single 8MB chip support just got committed to NeoYume. Obviously it only plays a handful of really small games: Crossed Swords, Fatal Fury 1, Mutation Nation and Sengoku work.
So, at least a single-chip 8MB board looks like it will work. May have to scrap the multi-chip plans...
For MegaYume I tried a test with the CS pin of a second chip lifted up and tied to VCC. That didn't work either. So, a second chip just being on the bus breaks it...
My driver has the ability to set different input timing per bank in case of variation there. That may help you out @Rayman, unless your issue is the extra input load on the data bus or clocks affecting things dramatically, or other signal integrity issues with reflections at device inputs etc. Remember these signals are getting clocked at 168MHz or thereabouts, so it's not really low-speed stuff any more. I wonder if some series termination resistors may be needed on your board or if the paths are too long...?
If you have a very good scope maybe you can compare the bus signals with and without the 2nd device fitted to see what it does to the waveform, but probing itself will affect things too.
EDIT: Actually the per-bank timing might only help your board work with my own code in general. If it turns out to be timing differences per device, then Wuerfel_21's PSRAM code would also need to be adjusted to support per-bank timing, because currently it is just a global setting applied to all devices.
@Rayman said:
After some more testing, looks like it only works with one chip installed, and only with these settings:
PSRAM_WAIT = 10
PSRAM_DELAY = 4
So is the problem the very presence of the extra memory chip fitted on the board, even though a single 8MB device is sufficient to run the emulator with small games? Or is it that the emulated game needed to make use of the second device, and it was getting corrupted data in that case?
This is the first time I've really taken a closer look at the 3-chip "nibble" version, and your findings seem to confirm APMemory's datasheet:
Summing any two devices "COUT"s (8 pF each) in parallel is enough to exceed the maximum allowed 15pF Load Capacitance.
Add to this value the load represented by both connector pins (male and female), the involved P2 I/O pin and all the traces/lanes, at both boards... it just makes sense...
P.S. The clock lane can also be suffering from the same load effects, but P2 "Fast" output driver is very forgiving on it...
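The arithmetic behind that argument can be written out as a small sketch. Only the 8 pF per device and the 15 pF limit come from the post; the function names and the `extra_pf` parameter (standing in for connector pins, traces and the P2 pin, whose values are not given in the thread) are hypothetical.

```python
# Figures quoted in the post above; everything else is an illustrative assumption.
C_OUT_PF = 8.0       # per-device pin capacitance quoted from the datasheet
MAX_LOAD_PF = 15.0   # maximum load capacitance quoted from the datasheet


def bus_load_pf(devices_on_bus: int, extra_pf: float = 0.0) -> float:
    """Sum the device capacitances in parallel, plus any extra load.

    extra_pf is a placeholder for connectors, traces and the P2 pin;
    the thread gives no numbers for those, so it defaults to zero.
    """
    return devices_on_bus * C_OUT_PF + extra_pf


def exceeds_limit(devices_on_bus: int, extra_pf: float = 0.0) -> bool:
    """True when the summed load is above the quoted 15 pF maximum."""
    return bus_load_pf(devices_on_bus, extra_pf) > MAX_LOAD_PF


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, "device(s):", bus_load_pf(n), "pF, over limit:", exceeds_limit(n))
```

With no extra load counted, two devices already sum to 16 pF and three to 24 pF, which is the point being made: any board-level capacitance only makes it worse.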
So they built an SPI bus chip that can't actually be bussed together? That can't be right. That'd be too galaxy brain.
@Yanomani said:
P.S. The clock lane can also be suffering from the same load effects, but P2 "Fast" output driver is very forgiving on it...
Also the device doesn't ever drive the clock pin, only the P2 does.
They don't, for sure; I was only mentally tinkering (or "thinkering") and trying to correlate some facts (P2 pins + Edge/Eval lanes and connector + memory board connector + lanes + three memory chips, together, on the same bus)...
Being less than 20 Ohm, the "Fast" drivers do help a lot in supporting such "heavily loaded" conditions...
So they built an SPI bus chip that can't actually be bussed together? That can't be right. That'd be too galaxy brain.
SPI is kind of a "family brand"; it's all about the control-signaling/data-exchange protocol; the "bus" part is not mentioned anywhere in the datasheets.
Interestingly, all the 1.8 V "faster" parts (including the octal ones) seem to have extra provisions, enabling the output strength to be programmed up to 25 Ohm; all the "3.0 V" parts (bolds by me) are limited to 50 Ohm drive strength.
P.S. not exactly OT, but the "achieved-speed-figures" seem to fit so nicely; simply can't resist...
case one: > 80 mph, fast and safe, with "Reading":
@Rayman said:
Well the video test did work with all the chips. Not sure what that means though...
Maybe it works at a slower speed?
It does work at a slightly slower speed than NeoYume, but not by much. I believe the XGA test runs at 325MHz, while VGA uses 252MHz. Did you try to run at XGA as well as VGA, Rayman?
@Rayman said:
@rogloh Does the code change the clock frequency to whatever it needs? Or, do I need to change it in the main file?
So far, only 640x480 works... Is there a place to change PSRAM_WAIT & PSRAM_DELAY somewhere?
Yes, it does change the P2 frequency according to the video mode. To try other delays for XGA (I only need one value), you can just hack the first number in the delayTable array in the DAT section of the psram4.spin2 file (7 below). Alternatively, you can pass an explicit delay value (from 1..15) in the driver's startx call parameter instead of 0; passing 0 makes the driver look the delay up from the profile, depending on which frequency band it falls in. The latter is the preferred way to experiment.
E.g. for the profile below:
if freq < 92MHz, delay = 7,
if 92MHz < freq < 150MHz, delay = 8,
if 150MHz < freq < 206MHz, delay = 9, etc.
' delay profile
delayTable long 7,92_000000,150_000000,206_000000,258_000000,310_000000,333_000000,0
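For anyone who finds the band lookup easier to read as code, here is a rough sketch of how the delay profile above could be interpreted. It is written in Python purely for illustration and is not the driver's actual code: the function names are made up, and the handling of frequencies that land exactly on a band edge, or above the last entry, is an assumption.

```python
# Values copied from the delayTable line in the post: the first entry is the
# base delay, the following entries are band edges in Hz, and 0 ends the list.
DELAY_TABLE = [7, 92_000000, 150_000000, 206_000000,
               258_000000, 310_000000, 333_000000, 0]


def lookup_delay(freq_hz: int, table=DELAY_TABLE) -> int:
    """Return the base delay plus one for every band edge at or below freq_hz.

    Assumption: a frequency exactly on an edge counts as the higher band.
    """
    delay = table[0]
    for edge in table[1:]:
        if edge == 0 or freq_hz < edge:
            break
        delay += 1
    return delay


def choose_delay(freq_hz: int, explicit: int = 0) -> int:
    """Mimic the described startx behaviour: 1..15 overrides, 0 uses the profile."""
    if 1 <= explicit <= 15:
        return explicit
    return lookup_delay(freq_hz)


if __name__ == "__main__":
    print(lookup_delay(80_000_000))        # below the first edge
    print(choose_delay(325_000_000, 4))    # explicit override wins
```

Read this way, the unmodified profile puts the 252MHz VGA clock in the fourth band and the 325MHz XGA clock in the sixth, so overriding with an explicit value is the quickest way to try something like delay = 4.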
There is something flaky about the Prop Tool when using this program... I don't think it likes that you've embedded the .spin2 file inside itself...
Getting all kinds of errors, crashes and failures to load...
Just yank it out then and reference some other text file instead. I used flex and it didn't have that problem.
VGA test works in XGA @325MHz clock for 1-, 2-, and 3-chip configurations.
Can use any of the 3 chips... Don't seem to need pullups on the unused CE pins.
96MB board lower nibble seems to work with VGA test in XGA @325MHz clock for all 6 banks, but something is very strange with the upper nibble bus...
It seems to not want to use the bus at basepin+4 to basepin+7 using the DATABUS value...
Comments
Hmm, for pik's funny wire contraption, DELAY = 4 did it. Double-check pin config.
Delay = 4 works! Now I wish I hadn't unsoldered the other two chips... Have to put them back...
After some more testing, looks like it only works with one chip installed, and only with these settings:
PSRAM_WAIT = 10
PSRAM_DELAY = 4
So, at least a single-chip 8MB board looks like it will work. May have to scrap the multi-chip plans...
So you got NeoYume running one of the small games? Note that the emulators have different timings, so one might work when the other doesn't.
Try adding manual pinh calls for the other chipselects to exmem_start.
I'll try neoYume next.
For MegaYume I tried a test with the CS pin of a second chip lifted up and tied to VCC. That didn't work either. So, a second chip just being on the bus breaks it...
Ok, got Crossed Swords to run with NeoYume and one chip.
Well that's something at least. See if that works with multiple chips, I think the timing in neoyume might be more stable (P_SYNC_IO my beloved).
My driver has the ability to set different input timing per bank in case of variation there. That may help you out @Rayman, unless your issue is the extra input load on the data bus or clocks affecting things dramatically, or other signal integrity issues with reflections at device inputs etc. Remember these signals are getting clocked at 168MHz or thereabouts, so it's not really low-speed stuff any more. I wonder if some series termination resistors may be needed on your board or if the paths are too long...?
If you have a very good scope maybe you can compare the bus signals with and without the 2nd device fitted to see what it does to the waveform, but probing itself will affect things too.
EDIT: Actually the per-bank timing might only help your board work with my own code in general. If it turns out to be timing differences per device, then Wuerfel_21's PSRAM code would also need to be adjusted to support per-bank timing, because currently it is just a global setting applied to all devices.
So is the problem the very presence of the extra memory chip fitted on the board, even though a single 8MB device is sufficient to run the emulator with small games? Or is it that the emulated game needed to make use of the second device, and it was getting corrupted data in that case?
This is the first time I've really taken a closer look at the 3-chip "nibble" version, and your findings seem to confirm APMemory's datasheet:
Summing any two devices "COUT"s (8 pF each) in parallel is enough to exceed the maximum allowed 15pF Load Capacitance.
Add to this value the load represented by both connector pins (male and female), the involved P2 I/O pin and all the traces/lanes, at both boards... it just makes sense...
P.S. The clock lane can also be suffering from the same load effects, but P2 "Fast" output driver is very forgiving on it...
Also the device doesn't ever drive the clock pin, only the P2 does.
I'd break out the delay tester. Maybe modify it a little to check consecutive banks using smaller block size.
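To make the idea concrete, here is a rough sketch of a per-bank delay sweep. It is not the actual delay tester: `test_block` is a hypothetical callback that would write a small block to the given bank and read it back with the given delay, and `fake_test_block` only simulates two banks with made-up passing windows so that the sketch runs without hardware.

```python
from typing import Callable, Dict, Iterable, List


def sweep_delays(test_block: Callable[[int, int], bool],
                 banks: int,
                 delays: Iterable[int] = range(1, 16)) -> Dict[int, List[int]]:
    """For each bank, list the delay values for which test_block(bank, delay) passed."""
    delay_list = list(delays)
    return {bank: [d for d in delay_list if test_block(bank, d)]
            for bank in range(banks)}


def common_delays(results: Dict[int, List[int]]) -> List[int]:
    """Delays that passed on every bank; empty if no single global delay works."""
    sets = [set(passed) for passed in results.values()]
    return sorted(set.intersection(*sets)) if sets else []


def fake_test_block(bank: int, delay: int) -> bool:
    """Simulated hardware with invented passing windows (assumption, not measured data)."""
    windows = {0: (4, 5), 1: (5, 6)}
    low, high = windows[bank]
    return low <= delay <= high


if __name__ == "__main__":
    results = sweep_delays(fake_test_block, banks=2)
    print(results)
    print(common_delays(results))
```

If the common list comes back empty on real hardware, that would point towards needing per-bank timing rather than one global delay.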
So they built an SPI bus chip that can't actually be bussed together? That can't be right. That'd be too galaxy brain.
Wait, you said the video test does work with multiple chips? So it's not a hard hw issue?
They don't, for sure; I was only mentally tinkering (or "thinkering") and trying to correlate some facts (P2 pins + Edge/Eval lanes and connector + memory board connector + lanes + three memory chips, together, on the same bus)...
Being less than 20 Ohm, the "Fast" drivers do help a lot in supporting such "heavily loaded" conditions...
SPI is kind of a "family brand"; it's all about the control-signaling/data-exchange protocol; the "bus" part is not mentioned anywhere in the datasheets.
Interestingly, all the 1.8 V "faster" parts (including the octal ones) seem to have extra provisions, enabling the output strength to be programmed up to 25 Ohm; all the "3.0 V" parts (bolds by me) are limited to 50 Ohm drive strength.
P.S. not exactly OT, but the "achieved-speed-figures" seem to fit so nicely; simply can't resist...
https://getreading.co.uk/news/local-news/reading-buses-sets-new-world-9293952
https://trendhunter.com/trends/the-school-time-jet-powered-school-bus
Well the video test did work with all the chips. Not sure what that means though...
Maybe it works at a slower speed?
Well, tracking says my boards are chilling in Frankfurt rn. Didn't even have to take a detour through the Netherlands.
It does work at a slightly slower speed than NeoYume, but not by much. I believe the XGA test runs at 325MHz, while VGA uses 252MHz. Did you try to run at XGA as well as VGA, Rayman?
@rogloh Does the code change the clock frequency to whatever it needs? Or, do I need to change it in the main file?
So far, only 640x480 works... Is there a place to change PSRAM_WAIT & PSRAM_DELAY somewhere?
Why doesn't strike through work? --> Only one paragraph at a time for strike through to work...
Yes, it does change the P2 frequency according to the video mode. To try other delays for XGA (I only need one value), you can just hack the first number in the delayTable array in the DAT section of the psram4.spin2 file (7 below). Alternatively, you can pass an explicit delay value (from 1..15) in the driver's startx call parameter instead of 0; passing 0 makes the driver look the delay up from the profile, depending on which frequency band it falls in. The latter is the preferred way to experiment.
E.g. for the profile below:
if freq < 92MHz, delay = 7,
if 92MHz < freq < 150MHz, delay = 8,
if 150MHz < freq < 206MHz, delay = 9, etc.
' delay profile
delayTable long 7,92_000000,150_000000,206_000000,258_000000,310_000000,333_000000,0
Actually, got XGA working with 1 chip. Saw the note "this should match the above", and it worked when I did that...
There is something flaky about the Prop Tool when using this program... I don't think it likes that you've embedded the .spin2 file inside itself...
Getting all kinds of errors, crashes and failures to load...
It seems to be limited to single paragraphs at a time.
Might be a good idea to build up multiple RAM boards so I can quickly swap back and forth between different numbers of PSRAMs.
Just yank it out then and reference some other text file instead. I used flex and it didn't have that problem.
VGA test works in XGA @325MHz clock for 1-, 2-, and 3-chip configurations.
Can use any of the 3 chips... Don't seem to need pullups on the unused CE pins.
96MB board lower nibble seems to work with VGA test in XGA @325MHz clock for all 6 banks, but something is very strange with the upper nibble bus...
It seems to not want to use the bus at basepin+4 to basepin+7 using the DATABUS value...
Probably due to this line in the driver...(which used to align us on 8 bit groups for the original 16 bit driver, but is not needed now).
Change it to this:
Got boards in mail. Yep, some tough pickles to get it to work. Config file now has control over sync/async setting for clock and data.
24MB board (on basepin 0):
96MB board (on basepin 0):
Same timing seems to work for both.
I've attached that test version of MegaYume; combine it with the remaining files from current git.
Also, hardware criticism: 96MB board is too wide to fit the acrylic case for P2EVAL, had to take it out.