New p2docs pages: Errata and Optimization Guide — Parallax Forums

I just added two new pages to the p2docs site:

Optimization and Coding for Speed, which should collect advice for fast coding - please tell me if you don't understand something (unless there's already a "TODO" there)

Hardware Bugs & Errata, a list of P2 hardware bugs. I feel like I forgot at least one that was previously discovered. (official doc still doesn't have the RDFAST startup bug)

Now debate this 2AM coldpost while I go to bed.

(I also added and moved stuff around elsewhere)

Comments

  • evanh Posts: 16,496
    edited 2025-05-23 17:12

    Typos: Memory Access section - First sentence contains the misspelt word "obviosu".
    I had seen others but I think you've already fixed them.

    Details: HubRAM RDLONG/WRLONG exec times are 9..16 and 3..10 respectively. Not 9..17 and 3..11.

    "Hub slice alignment" is a case of not worrying about initial alignment and concentrating just on optimising the loop times. I say this because every recompile or shifted instruction will move the absolute location, and therefore the hubRAM slice number, of the referenced data and/or executing code. That makes every edit a timing minefield if you try to predetermine how neatly the eggbeater aligns upon entering the loop.

    Basically, inside or outside the loop, both knowing the relative/cyclic slice order of hubRAM accesses and matching that with associated execution timings is needed. Both are equally important. It's just a lot easier to manage optimising such inside a loop rather than outside.

    When it comes to doing both loads and stores in an order, that can be optimised too. But loads are not the same phase as stores. I did measure it once-upon-a-time ...
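
For anyone who wants to play with the "relative, not absolute" point above, here is a rough Python sketch. It is a toy model, not a cycle-accurate one: the slice mapping (address bits 4:2) reflects how hubRAM is interleaved across eight slices, and the 9 and 3 cycle minimums are the figures quoted in this thread, but the phase formula in `access_cycles` and the names `run_loop` and `shifted` are assumptions made purely for illustration.

```python
# Toy model of P2 hub "eggbeater" timing.
# ASSUMPTION: the phase formula in access_cycles() is illustrative only,
# it is not taken from the silicon documentation.

RD_BASE = 9   # minimum RDLONG time quoted in the thread (range 9..16)
WR_BASE = 3   # minimum WRLONG time quoted in the thread (range 3..10)
SLICES = 8    # hubRAM is interleaved across eight slices of longs


def hub_slice(addr):
    """Slice number holding the long at byte address `addr`."""
    return (addr >> 2) % SLICES


def access_cycles(clock, addr, base, cog=0):
    """Cycles for one long-aligned hub access starting at `clock`."""
    wait = (hub_slice(addr) - clock - cog) % SLICES  # 0..7 cycles of waiting
    return base + wait


def run_loop(ops, iterations, start_clock=0):
    """Run a loop of hub accesses and return the cycle count of each pass.

    ops is a list of (base, addr, gap) tuples, where gap is the number of
    cycles of non-hub code executed after the access.
    """
    clock = start_clock
    times = []
    for _ in range(iterations):
        begin = clock
        for base, addr, gap in ops:
            clock += access_cycles(clock, addr, base) + gap
        times.append(clock - begin)
    return times


def shifted(ops, longs):
    """Same loop with every address moved by `longs` longs (e.g. a recompile)."""
    return [(base, addr + 4 * longs, gap) for base, addr, gap in ops]


if __name__ == "__main__":
    # Hypothetical loop: one RDLONG then one WRLONG with a few ALU ops between.
    loop = [(RD_BASE, 0x1000, 4), (WR_BASE, 0x2004, 2)]
    for k in range(SLICES):
        print(k, run_loop(shifted(loop, k), 3))
```

In this model, moving all the data by a whole number of longs changes only the first pass through the loop. Every later pass costs the same, because only the slice order of the accesses relative to each other matters once the loop has synchronised to the hub rotation.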

  • @evanh said:
    Typos: Memory Access section - First sentence contains the misspelt word "obviosu".
    I had seen others but I think you've already fixed them.

    That's how you know it's my authentic artisan post-midnight screeds.
    There are probably many typos in there, I don't have spellchecking in my editor.

    Details: HubRAM RDLONG/WRLONG exec times are 9..16 and 3..10 respectively. Not 9..17 and 3..11.

    I've included the possible +1 disalignment penalty in the table for brevity.

    "Hub slice alignment" is a case of not worrying about initial alignment and concentrating just on optimising the loop times. I say this because every recompile or shifted instruction will move the absolute location, and therefore the hubRAM slice number, of the referenced data and/or executing code. That makes every edit a timing minefield if you try to predetermine how neatly the eggbeater aligns upon entering the loop.

    Basically, inside or outside the loop, both knowing the relative/cyclic slice order of hubRAM accesses and matching that with associated execution timings is needed. Both are equally important. It's just a lot easier to manage optimising such inside a loop rather than outside.

    Yes, I meant that: the relative alignment of successive ops. Maybe I need to clear up the wording. Though I think absolute alignment comes into play when mixing hub access and CORDIC commands. In pure ASM you can align to exact addresses, so it's possible to use that there.

    Though IME when CORDIC is also involved, there's probably time to use the FIFO trick instead.

    When it comes to doing both loads and stores in an order, that can be optimised too. But loads are not the same phase as stores. I did measure it once-upon-a-time ...

    Yes, I remember... I need to run my own tests so I can write a guide, if just for my own use, because I can never remember how to do it properly. Another under-researched topic is access stalls due to the FIFO in hubexec.
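
To make the two sets of numbers in this exchange concrete (9..16 and 3..10 versus the table's 9..17 and 3..11), here is a small Python sketch. It assumes the +1 penalty applies whenever an access crosses a hub long boundary, which is my reading of the instruction timing footnote. The helper names `crosses_long`, `exec_range` and `table_range` are made up for this example.

```python
# Base ranges are the figures quoted in this thread.
# ASSUMPTION: +1 cycle applies when the access spans two hub longs.
RANGES = {"rd": (9, 16), "wr": (3, 10)}


def crosses_long(addr, size):
    """True if a `size`-byte access at byte address `addr` spans two hub longs."""
    return (addr & 3) + size > 4


def exec_range(kind, addr, size=4):
    """(min, max) cycles for one access, including the crossing penalty."""
    lo, hi = RANGES[kind]
    extra = 1 if crosses_long(addr, size) else 0
    return lo + extra, hi + extra


def table_range(kind):
    """Single range covering both the aligned and the crossing case."""
    lo, hi = RANGES[kind]
    return lo, hi + 1


if __name__ == "__main__":
    print(exec_range("rd", 0x100))     # long-aligned RDLONG
    print(exec_range("rd", 0x101))     # RDLONG crossing into the next long
    print(exec_range("wr", 0x103, 2))  # word write crossing a long boundary
    print(table_range("rd"), table_range("wr"))
```

Folding the penalty into one range, as `table_range` does, is compact but hides the fact that a long-aligned access never pays the extra cycle.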

  • evanh Posts: 16,496
    edited 2025-05-23 17:38

    My attitude is that outside a tight loop, it gets way too hard too quickly.

    CORDIC commands count as hub ops, I believe. It's just that they're on a set phase, like all the Prop1 hub ops. They could be used as a reference to compare multiple cogs. I'm guessing now.

  • evanh Posts: 16,496

    @Wuerfel_21 said:
    I've included the possible +1 disalignment penalty in the table for brevity.

    Cool.

  • More errata: If using the internal crystal oscillator, pins P28-P31 should not be used to output high frequency signals.

    https://forums.parallax.com/discussion/comment/1520712/#Comment_1520712 Took me way too long to find this.

  • Wuerfel_21 Posts: 5,363
    edited 2025-05-26 21:34

    Oh, forgot about that one (because I was thinking more from the software side)

  • evanh Posts: 16,496

    Yep, 'tis a nasty one. Only the 20 Ohm Fast pin drive strength is enough to trip it up though. None of the other output modes affect it. Even the hungry 2.0 volt DAC drive is fine.

  • Rayman Posts: 15,229

    Wasn’t this just a layout issue?

  • Rayman Posts: 15,229

    Yeah, found my post where I checked with my boards here:
    https://forums.parallax.com/discussion/173205/how-to-kill-a-p2-video-driver-and-probably-usb-etc/p3

    This was just a layout error that only affected the P2 Eval board, AFAIK...

  • evanh Posts: 16,496
    edited 2025-05-29 03:56

    @Rayman said:
    This was just a layout error that only affected the P2 Eval board, AFAIK...

    Huh, I'd missed that Cluso had proven his board reliable under all tests.

    I see Vons was trying to come to a resolution, but it doesn't look like he did before Chip just decided to use an external crystal oscillator part in place of a plain crystal.

    Cluso's board has everything on a single common 3.3 Volt rail. Given the excessive number of 100 nF capacitors that Cluso used, I now suspect it's down to those caps performing better at high frequency and probably the sheer number of them makes them so effective.

    The Eval and Edge boards use none at all. For each group of eight I/O pins there is just one 1 uF and one 4.7 uF on the Eval board, and two 4.7 uF on the Edge revA.

  • @Rayman said:
    Yeah, found my post where I checked with my boards here:
    https://forums.parallax.com/discussion/173205/how-to-kill-a-p2-video-driver-and-probably-usb-etc/p3

    This was just a layout error that only affected the P2 Eval board, AFAIK...

    I am really happy to hear this. Since the Edge went to a TCXO I assumed that was to mitigate this issue. I'm designing a Pico style P2 board and really wanted to use a crystal. The eval gerbers are posted in that thread. The eval board has individually shielded IO lines. And then the XI/XO lines are routed like a differential pair. Or a directional coupler :o

    There is still a possibility that the canned oscillators have better phase noise than the P2 oscillator. I might try to check that.

  • evanh Posts: 16,496

    @SaucySoliton said:
    ... And then the XI/XO lines are routed like a differential pair. Or a directional coupler :o

    Wow, yeah, I suppose that is problematic. And I see Rayman also noted it back then too - https://forums.parallax.com/discussion/comment/1521000/#Comment_1521000
