Prop2Play - audio player for a P2-EC32 [0.33] - 1024x600 - Page 2 — Parallax Forums

245

Comments

  • pik33 Posts: 2,402

    0.23.
    New audio driver with PSRAM cache; SID .dmp files are now loaded into PSRAM and played from there.

    How do I use this new, faster sdmm.cc? I tried replacing the old one in the FlexProp directory, but it didn't work for the player, so I restored the good old working version.

  • evanh Posts: 16,134
    edited 2022-04-03 06:39

    It should just drop in, but if Eric has been making changes to filesystem support then maybe not. I'm working from build 'Version 5.9.10-beta-v5.9.9-93-ge37a63f5 Compiled on: Mar 30 2022'. So my includes are from then.

    Do you get any compile warnings when using it?

  • pik33 Posts: 2,402
    edited 2022-04-03 07:35

    I got warnings about fcache being switched off. Compiled with 5.9.9.

  • evanh Posts: 16,134
    edited 2022-04-03 08:00

    Okay, yep, my code needs the Fcache. Eric fixed that for me (https://forums.parallax.com/discussion/comment/1537325/#Comment_1537325). You'll need a newer build of Flexspin.

  • pik33 Posts: 2,402

    So... I returned to Linux, recompiled spin2cpp, and it works. Loading from SD is much faster now.

  • pik33 Posts: 2,402
    edited 2022-04-04 15:06

    0.24a.

    A strange adventure with PSRAM working at the maximum possible speed (in this wire-connected contraption I made...). While .dmp and .mod files played OK using the PSRAM, the wav player generated clicks. PSRAM read/write errors were the first thing I suspected, so I wrote a few lines of code verifying the PSRAM after each write. Reading and writing were done in 16 kB chunks and verification went OK.

    Then I started searching in the audio driver and discovered that there were errors at addresses xxxxFFC. Not every time, but at random moments. So why did the verification work at all, and why didn't the modules have any clicks?

    The problem was: to get as close to exactly 44100 Hz as possible, I raised the clock from 336 to 338 MHz, and that was too high for this PSRAM. At 336 MHz I got no clicks.

    I then tried 338 MHz with a video driver and didn't notice any bad pixels. The video driver reads in 1k aligned chunks. Also, the verification in the main loop worked at 338 MHz, writing and reading 16 kB aligned chunks with no errors. What failed was the reading of 256 bytes in the audio driver cog. These reads were also aligned while playing a .wav file, so I don't know why this problem exists. I tried shorter reads, and in every attempt the last long read was sometimes bad, and this was always at xxxxFyy.

    After all these experiments I left the .wav play buffer in the hub. Then the new SD code arrived: instead of ~10 ms for 4 kB, it reads 16 kB in ~6 ms. This allows reading bigger chunks of data at once. Now using the user interface in most cases doesn't interfere with .wav playing, while the buffer has been reduced to $20000 bytes. This freed space in the hub, so now I can add all these AY, SPC and other things which can generate sound.

  • rogloh Posts: 5,865
    edited 2022-04-05 01:34

    Strange result, although if you are that close to the top end of PSRAM performance it's going to be hard to know if it is a HW problem or some weird timing issue in SW. Maybe video corruption is not noticeable if a pixel is only very briefly bad or if your colour mode doesn't show a big change. How often do your clicks arrive, and what is the nature of the bad data? If you can capture the corruption somehow via comparison with good data, is it all zeroes, all ones, a copy of the prior read, or random data? Or is that what your xxxxFyy is indicating? Some nibble in the 32 bits is stuck at F? Is this with 4 chips or just one chip?

  • jmg Posts: 15,185

    @pik33 said:

    The problem was: to get as close to exactly 44100 Hz as possible, I raised the clock from 336 to 338 MHz, and that was too high for this PSRAM. At 336 MHz I got no clicks.

    Sounds like this is running things very close.
    Do you want this to work on any of 100 P2s and any of 100 PSRAMs (from various vendors), and in a box/enclosure, which will elevate the temperature?

  • pik33 Posts: 2,402
    edited 2022-04-05 06:33

    @rogloh said:
    Strange result, although if you are that close to the top end of PSRAM performance it's going to be hard to know if it is a HW problem or some weird timing issue in SW. Maybe video corruption is not noticeable if a pixel is only very briefly bad or if your colour mode doesn't show a big change. How often do your clicks arrive, and what is the nature of the bad data? If you can capture the corruption somehow via comparison with good data, is it all zeroes, all ones, a copy of the prior read, or random data? Or is that what your xxxxFyy is indicating? Some nibble in the 32 bits is stuck at F? Is this with 4 chips or just one chip?

    This is the 4-bit version. Clicks arrive randomly, with several seconds between them.

    The testing algorithm looked like this:

    • instead of getting 16 kB from SD, I generated a 16 kB "saw wave" where the higher byte was always 0 and the lower was 1..255
    • the main loop then wrote this to the PSRAM
    • the main loop read the data back from PSRAM and compared it. This step was always OK
    • in the audio driver I added code which, if the sample was >$FF, wrote its address to the hub
    • the address was always xxxxFFC or FFE, depending on the channel #
    • if I shortened the read amount in the driver, the bad address was always the last long read
    • lowering the clock from 338 to 336 caused no clicks at all
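
The test procedure in the list above can be sketched on a PC as follows. This is only an illustration under stated assumptions: the PSRAM is simulated with a `bytearray`, the base address, the fault injection and all function names are hypothetical, and the samples are assumed to be 16-bit little-endian words. The real test ran on the P2 in the main loop and in the audio driver cog.

```python
import struct

CHUNK_BYTES = 16 * 1024  # 16 kB test chunk, as in the post


def make_saw_chunk(n_bytes=CHUNK_BYTES):
    """16-bit little-endian samples: high byte 0, low byte cycling 1..255."""
    n_samples = n_bytes // 2
    samples = [(i % 255) + 1 for i in range(n_samples)]
    return struct.pack("<%dH" % n_samples, *samples)


def verify(psram, base, expected):
    """Main-loop style check: read the chunk back and compare it."""
    return bytes(psram[base:base + len(expected)]) == expected


def find_bad_samples(data, base):
    """Driver-style check: collect the address of every sample above $FF."""
    bad = []
    for offset in range(0, len(data), 2):
        (sample,) = struct.unpack_from("<H", data, offset)
        if sample > 0xFF:
            bad.append(base + offset)
    return bad


psram = bytearray(64 * 1024)  # stand-in for the real PSRAM (assumption)
base = 0x4000                 # hypothetical buffer address
chunk = make_saw_chunk()
psram[base:base + len(chunk)] = chunk

readback_ok = verify(psram, base, chunk)

# Simulate one corrupted read: disturb the high byte of the sample at ...FFC.
corrupted = bytearray(psram[base:base + len(chunk)])
corrupted[0x0FFC + 1] = 0x12
bad_addresses = find_bad_samples(bytes(corrupted), base)
print(readback_ok, [hex(a) for a in bad_addresses])
```

Because the high byte of every valid sample is zero, any value above $FF can only come from a corrupted read, which is why the driver-side check needs no reference copy of the data.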

    I can of course also output the sample value.

    @jmg said:

    @pik33 said:

    The problem was: to get as close to exactly 44100 Hz as possible, I raised the clock from 336 to 338 MHz, and that was too high for this PSRAM. At 336 MHz I got no clicks.

    Sounds like this is running things very close.
    Do you want this to work on any of 100 P2s and any of 100 PSRAMs (from various vendors), and in a box/enclosure, which will elevate the temperature?

    The "edge conditions" are caused by my primitive PSRAM contraption, soldered via several wires to the 12-pin female connector. These PSRAMs can work 20 MHz faster on a P2 Edge with 32 MB. The player, without PSRAMs, worked for many hours on at least 4 P2s at 354 MHz. Reducing this to 336 is nearly 20 MHz lower.

    But then this PSRAM "module" cannot work even at 300 MHz when connected to the Edge breakout board; 290 MHz is the top.

    So I consider all of these experiments as a temporary solution until I can get a real P2-EC32MB. There are 2 problems: they are expensive and they are out of stock.

  • rogloh Posts: 5,865
    edited 2022-04-05 06:54

    So if it is bad data occasionally read on specific addresses, maybe it is a signal integrity problem in your flying lead setup, affected by the number of address bits that are driven high near the data transfer? It might have something to do with drive strength and IO pin data transitions around this time. IIRC the PSRAM address nibbles are sent out highest to lowest, so the lower address nibbles are closer in time to the data being read back after bus turnaround, which could match your observations, although the normal 8 clock latency should help out with the bus turnaround.

  • pik33 Posts: 2,402
    edited 2022-04-05 10:14

    It seems uncommenting this

    drvh datapins 'enable this only if validating actual tri-state time on a scope

    made these clicks disappear.

    Edit: I tested when these clicks appear again and got the first rare clicks at 342 MHz, so about 4 MHz gained.

  • Er, that was designed for my own testing of when exactly the bus is tri-stated so I could tune things... seems weird that it fixes things. Maybe at the speed you are running the pulses are short and we need to keep them driven actively for longer... not sure there. A NOP there instead may do the same thing.

  • pik33 Posts: 2,402
    edited 2022-04-05 11:10

    A NOP instead of drvh doesn't work at all (= it clicks).

  • Huh. Ok. Then something seems to need to pull the bus high at the end of the address in your setup at that speed. Maybe it is drooping low without it.

  • evanh Posts: 16,134

    Pik,
    Try my latest bit-bashed revision - https://forums.parallax.com/discussion/comment/1537576/#Comment_1537576
    It has a few changes incorporated, from the smartpins work, that I'd like others to test out. In particular I think you'll find the slower SD cards you have will now work at high clock rates.

  • pik33 Posts: 2,402
    edited 2022-04-07 14:41

    I have a 32 GB Sandisk Edge here: no speed change at all in the player environment with the P2 at 336 MHz. A 16 kB wave chunk read time is 6 ms in both versions. This gives about 2.73 MB/s (instead of about 400 kB/s with the original driver).

  • evanh Posts: 16,134
    edited 2022-04-07 08:13

    Good. Now try a different, slower, SD card. My previous driver would get errors at such a high clock rate. This one should be fine.

  • pik33 Posts: 2,402
    edited 2022-04-07 09:39

    If I can find any :) As an old raspberry man I managed to kill all bad cards with raspberries and after that I didn't buy anything which is worse than 32 GB Sandisk.

  • evanh Posts: 16,134
    edited 2022-04-07 13:48

    I guess that explains you not having problems with my first edition then.

    EDIT: Updated to smartpins and ready for your testing - https://forums.parallax.com/discussion/comment/1537616/#Comment_1537616

  • pik33 Posts: 2,402

    Now this thing is slightly faster: ~5700 us for a 16 kB block, ~2.87 MB/s. The bad value in the previous post has been corrected.
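
The throughput figures quoted in the posts above can be checked with a few lines of arithmetic. This is a minimal sketch, assuming 1 kB = 1024 bytes and 1 MB/s = 10^6 bytes per second; the function name is made up for illustration.

```python
def throughput_mb_per_s(n_bytes, seconds):
    """Average transfer rate in MB/s (1 MB = 1,000,000 bytes)."""
    return n_bytes / seconds / 1e6


KB = 1024

# (description, bytes transferred, time in seconds), taken from the thread
measurements = [
    ("original driver, 4 kB in ~10 ms", 4 * KB, 10e-3),
    ("bit-bashed driver, 16 kB in ~6 ms", 16 * KB, 6e-3),
    ("smartpin driver, 16 kB in ~5700 us", 16 * KB, 5700e-6),
]

for label, n_bytes, seconds in measurements:
    print(f"{label}: {throughput_mb_per_s(n_bytes, seconds):.2f} MB/s")
```

The first line comes out at roughly 0.41 MB/s, which agrees with the "about 400 kB/s" quoted for the original driver.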

  • evanh Posts: 16,134
    edited 2022-04-07 15:40

    Nice.

    Actually, the smartpins edition shines better at lower sysclock because it adjusts the divider for SPI clock gen.

  • pik33 Posts: 2,402

    These Sandisk SDs can work at high frequencies, so if the divider is adjustable, I will try to go up and check where it fails. The player needs fast reads: in the original version, .wav file playing was at the edge of its possibilities, with 10 ms loading time for 13 ms of playing. Now I am loading 16 kB chunks in <6 ms. The main player loop is vblank driven, so I have to do everything in less than 20 ms. Every millisecond less of loading time is a millisecond more for fancy effects :)

    I am now rewriting the player to use the simpler, graphics-only, PSRAM-based HDMI driver. 1024x576, 8bpp, more space, more colors, but then much more time to print a character on the screen: instead of one long to the hub, I now have to put 32 longs into the PSRAM (in my case, a 4-bit one). Asm has to be used in the time-critical part. A fast character drawing procedure now takes about 40 us for an 8x16 character.
    The newest player version (0.25c) can do nearly all of the effects I had in the text-mode based 0.24. What was left to rewrite is the scrolling module file info, which cannot now be done with a display list (although the lower scrolling help text is still DL-animated).
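
The "32 longs per character" figure follows directly from the glyph size and colour depth mentioned in the post. The sketch below only reproduces that arithmetic; the function names are hypothetical, and the full-screen estimate simply multiplies the quoted ~40 us per character by the number of 8x16 cells at 1024x576, which is an extrapolation and not a measurement from the thread.

```python
def glyph_longs(width_px, height_px, bits_per_pixel):
    """Number of 32-bit longs needed to store one glyph in the framebuffer."""
    total_bits = width_px * height_px * bits_per_pixel
    return total_bits // 32


def cells_on_screen(screen_w, screen_h, glyph_w, glyph_h):
    """How many character cells fit on the screen."""
    return (screen_w // glyph_w) * (screen_h // glyph_h)


longs_per_char = glyph_longs(8, 16, 8)        # 8x16 glyph at 8 bpp
cells = cells_on_screen(1024, 576, 8, 16)     # 128 columns x 36 rows
full_redraw_ms = cells * 40 / 1000            # at ~40 us per character

print(longs_per_char, cells, full_redraw_ms)
```

At about 184 ms for every cell, a full-screen text redraw would span several 20 ms frames, which is consistent with the post's remark that the time-critical part has to be written in asm.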

  • evanh Posts: 16,134

    Well ... in the future, maybe an adventure into Ross's multitasking to manage the streamer concurrently loading alongside processing. That would be a neat win. :)

  • evanh Posts: 16,134

    Oh, if you turn on _DEBUG you get to see single-block (r) vs multi-block (s) reads emitted to terminal. A solid string of s's will indicate all clusters are consecutive and thereby fastest reading.

  • pik33 Posts: 2,402
    edited 2022-04-07 18:46

    Clusters should be consecutive as I formatted the card (this time, at home, it is a 128 GB Sandisk Ultra) and saved the .wav files on the freshly formatted card.

    Meanwhile I tried to make the clock higher, but this card didn't accept anything less than 7 as a divider, so the standard 50 MHz seems to be the real maximum. People on the RPi forums overclocked some cards up to 85 MHz, but I don't know which cards they were. 336/7=48 MHz, so it still fits within the limit; the 16 kB block read time is now about 5300 us.
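
The relation between sysclock, divider and SPI clock in the post above is plain integer arithmetic, sketched below. The 50 MHz limit is the "standard" figure quoted in the post; the function names are illustrative and are not part of the SD driver.

```python
def spi_clock_hz(sysclock_hz, divider):
    """SPI clock produced by dividing the system clock."""
    return sysclock_hz / divider


def min_divider(sysclock_hz, max_spi_hz):
    """Smallest integer divider that keeps the SPI clock at or below the limit."""
    return -(-sysclock_hz // max_spi_hz)  # ceiling division on integers


SYSCLOCK = 336_000_000
SPI_LIMIT = 50_000_000  # limit quoted in the post

print("minimum divider:", min_divider(SYSCLOCK, SPI_LIMIT))
for divider in (6, 7, 8):
    print(f"sysclock/{divider} = {spi_clock_hz(SYSCLOCK, divider) / 1e6:.0f} MHz")
```

With a 336 MHz sysclock this yields 7 as the smallest divider that stays within 50 MHz (48 MHz), while a divider of 6 would give 56 MHz.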

  • evanh Posts: 16,134
    edited 2022-04-07 19:26

    @pik33 said:
    Meanwhile I tried to make the clock higher, but this card didn't accept anything less than 7 as a divider, so the standard 50 MHz seems to be the real maximum. People on the RPi forums overclocked some cards up to 85 MHz, but I don't know which cards they were. 336/7=48 MHz, so it still fits within the limit; the 16 kB block read time is now about 5300 us.

    What you're hitting is the latency of the rx pin input circuit. It gets relatively further and further behind as the sysclock frequency is ramped up. Maybe this is the nature of overclocking any silicon beyond its design. Maybe there's some trade secret Chip hasn't figured out. Dunno.

    An entirely different latency effect impacts the tx smartpin. It appears as a fixed number of sysclock ticks of lag with respect to the clock pin. The minimum is 4 ticks lag but there is a +1 for clock pin registration. The set clock edges are falling for tx and rising for rx. So, although the SPI clocking mode is 3, I preload the tx smartpin as if it were SPI clock mode 0 just to compensate for this lag. For a sysclock/5 SPI clock cycle, the tx data appears a whole SPI clock later on the subsequent falling edge. Just where desired.

    That's the tx sorted. But, in doing that, the options for managing rx latencies are reduced. In theory, at the highest frequencies, rx smartpin could also be clocked on falling edge but this would only be possible for SPI clocking mode 0 where the clock idles low. The tx lag precludes this option.

    There is another way - use the streamer. Like bit-bashing, it's not externally clocked. It can therefore be arbitrarily aligned at sysclock granularity.

  • evanh Posts: 16,134
    edited 2022-04-07 21:31

    It can be pushed to sysclock/6: 0x0002_0006. It's a very lop-sided SPI clock to give extra leeway to the rx latency. That gives you a 56 MHz SPI clock rate.

    A streamer edition would be happy with sysclock/2. Same as the PSRAMs.

    PS: Sounds like Mike (iseries) is also working on streamlining the code layers above the SD driver. That might also save you some microseconds. https://forums.parallax.com/discussion/comment/1537619/#Comment_1537619

  • pik33 Posts: 2,402

    6 doesn't work with these Sandisk cards and 336 MHz. 7 is OK.

  • evanh Posts: 16,134

    Weird, that setting is working at 350 MHz sysclock with four different SD cards for me.

    My slowest is a 16 GB Adata (U1 rated), a similar speed 16 GB Apacer (U1 rated), a 32 GB Sandisk Extreme (V30 rated), and newest is a 64 GB Kingston Select (V10 rated). In the SPI based testing, the Kingston is as fast as the Sandisk.

  • pik33 Posts: 2,402
    edited 2022-04-08 10:04

    Do I need to change anything other than this constant? Meanwhile I returned to 8; 7 was not stable enough.
    I will test this in a clean environment: the player is very busy and there are high-frequency noise sources, HDMI at pin 0 and PSRAM at pin 48.

    Meanwhile, I managed to add the scrolling file info, so all functions from 0.24 now work in 0.25. Everything which can be placed in PSRAM is now there: the main video framebuffer, the auxiliary scrolling text buffer, the wav audio buffers. SID .dmp files and Amiga module files are also loaded into PSRAM before playing, so nearly all hub space is available for the code (and sprites! This is a sprite video driver, so I will add some sprite-based effects in the near future). It now looks like this:
