P2-ES crystal inaccuracy? Bad code setting the clock?

DavidZemon Posts: 2,634
edited 2019-01-10 - 06:26:57 in Propeller 2
Is anyone else seeing a relatively significant inaccuracy with the crystal oscillator? Or am I setting the clock incorrectly? The following code is blinking the LED at ~0.86 Hz (13 blinks in 15 seconds) and if I reset the board and run `58 blink` from TAQOZ, I get about 0.6 Hz (9 blinks in 15 seconds).

And the print statements come across the terminal as garbage (surely a baud rate mismatch; I haven't scoped it yet to see what baud rate is actually coming across the wire).

Full code here: https://github.com/DavidZemon/HelloP2GCC/blob/2940554e72ccc921ff2599c95ac99e13cca51658/blinky.c
#include "common.h"
#include <propeller.h>
#include <stdio.h>

static const uint32_t XI             = 20000000;
static const uint32_t INPUT_DIVIDER  = 4;
static const uint32_t VCO_MULTIPLIER = 72;
static const uint32_t FINAL_DIVIDER  = 2;

void main () {
    int            i;
    const uint32_t CLOCK_FREQ = compute_clock(XI, INPUT_DIVIDER, VCO_MULTIPLIER, FINAL_DIVIDER);

    printf("Attempting to set clock for %d Hz\n", CLOCK_FREQ);

    waitx(CLOCK_FREQ);

    const errot_t err = set_clock_pll(INPUT_DIVIDER, VCO_MULTIPLIER, FINAL_DIVIDER);

    if (err)
        printf("Error! %d\n", err);
    else
        printf("Running at %d\n", CLOCK_FREQ);

    while (1) {
        drive_invert(58);
        waitx(CLOCK_FREQ / 2);
    }
}

Full code here: https://github.com/DavidZemon/HelloP2GCC/blob/2940554e72ccc921ff2599c95ac99e13cca51658/common.c
#include "common.h"
#include <propeller.h>

void waitx (const uint32_t clockCycles) {
    __asm__ __volatile ("waitx %0" : : "r" (clockCycles));
}

static errot_t set_clock_mode (const bool enablePll, uint32_t inputDivider, uint32_t vcoMultiplier,
                               uint32_t finalDivider, const xi_status_t xiStatus, const clock_source_t clockSource) {
    __asm__ __volatile("hubset #0");

    uint32_t configuration = 0;

    if (enablePll) {
        configuration = 1 << 24;
    }

    if (64 > inputDivider) {
        inputDivider &= 0b11111;
        configuration |= inputDivider << 18;
    } else {
        return INVALID_INPUT_DIVIDER;
    }

    if (1024 > vcoMultiplier) {
        vcoMultiplier &= 0b1111111111;
        configuration |= vcoMultiplier << 8;
    } else {
        return INVALID_VCO_MULTIPLIER;
    }

    if (30 < finalDivider || finalDivider % 2)
        return INVALID_FINAL_DIVIDER;
    else if (finalDivider) {
        finalDivider = (finalDivider >> 1) - 1;
        configuration |= finalDivider << 4;
    } else {
        configuration |= 0b1111 << 4;
    }

    configuration |= xiStatus << 2;

    // enable crystal+PLL, stay in 20MHz+ mode
    __asm__ __volatile("hubset %[_configuration]" : :[_configuration] "r"(configuration));

    // wait ~10ms for crystal+PLL to stabilize
    waitx(200000);

    // now switch to PLL
    configuration |= clockSource;
    __asm__ __volatile("hubset %[_configuration]" : :[_configuration] "r"(configuration));
}

errot_t set_clock_pll (uint32_t inputDivider, uint32_t vcoMultiplier, uint32_t finalDivider) {
    return set_clock_mode(true, inputDivider, vcoMultiplier, finalDivider, XI_15PF, CLK_SRC_PLL);
}

uint32_t compute_clock (const uint32_t xi, const uint32_t inputDivider, const uint32_t vcoMultiplier,
                        const uint32_t finalDivider) {
    const uint32_t frequency =  xi * vcoMultiplier / inputDivider;
    if (finalDivider) {
        return frequency / finalDivider;
    } else {
        return frequency;
    }
}
David
PropWare: C++ HAL (Hardware Abstraction Layer) for PropGCC; Robust build system using CMake; Integrated Simple Library, libpropeller, and libPropelleruino (Arduino port); Instructions for Eclipse and JetBrain's CLion; Example projects; Doxygen documentation
CI Server: http://david.zemon.name:8111/?guest=1

Comments

  • Does the Clock_Freq print correctly at 180,000,000 ?

    I'm certainly not a C expert, but if vcoMultiplier was higher (it can go to 1023, I think), couldn't it overflow 32-bit boundaries when multiplied by 20 000 000?

  • The code which calculates the PLL settings needs a -1 for inputDivider and vcoMultiplier. Also your code does not handle finalDivider=1. More info here:
    forums.parallax.com/discussion/comment/1452025/#Comment_1452025

    Also if you care about phase noise or spurs, it would be best to avoid input dividers 3-8. forums.parallax.com/discussion/comment/1459802/#Comment_1459802
    However, using inputDivider=4 should not cause any issues visible on the serial port or with a blinking LED.
    James https://github.com/SaucySoliton/

    Invention is the Science of Laziness
  • I get the same results with the blinky code (about 13 blinks in 15 seconds).

    For the print examples, I set the baudrate in the loadp2 statement in your Makefile to 115200. And, I'm getting good text returned (chess, hello, malloctest, dry, etc...), but common.c doesn't return any text on execution (possibly my baud setting of 115200 in loadp2 is not helping in that example). I'm running on macOS, so my port ID was also changed:
    run-%: %.bin
        @echo --- LOADING $< ---
        loadp2 -v -t -p /dev/cu.usbserial-P2EEI8V -m 010c1f08 -b 115200 $<
    

    dgately
    Livermore, CA (50 miles SE of San Francisco)
  • Tubular Posts: 3,389
    edited 2019-01-10 - 09:19:34
    (what saucy said). In pasm, we do
    _SETFREQ      = 1<<24 + (_XDIV-1)<<18 + (_XMUL-1)<<8 + _XPPPP<<4 + _XOSC<<2
    
  • Bear in mind that TAQOZ in ROM could only be tested on an FPGA at 80 MHz, when we didn't yet know the final clock details for silicon. So if we set a pin to 40 MHZ, it will output half the CPU clock, which was assumed to be 80 MHz, but RCFAST is more like 22 MHz or so. This, by the way, reveals the speed of RCFAST rather nicely, since the pin will output RCFAST/2. The new version of TAQOZ knows all about how to set the clock and adjusts its timing accordingly.

    There are also functions in EXTEND which allow you to specify various clock configuration parameters so you can experiment interactively. You can do so safely if you work from the initial RCFAST, type in a line or a function and make sure that the last word to execute is RCFAST. Something like this:
    8 PIN 40 MHZ
    
    so that P8 is outputting half the CPU clock which you can monitor with a scope.
    15PF PLLEN 4 XIDIV 50 VCOMUL 1 PLLDIV USEXTAL 5 s RCFAST
    
    So that selects 15PF loading, XI divider of 4 = 5MHZ, 50 VCO multiplier (effectively) = 250MHZ, no PLL divide, then selects the external crystal as the CPU clock for 5 seconds before switching back to RCFAST and returning to the console.

    Tachyon Forth - compact, fast, forthwright and interactive
    Brisbane, Australia
  • Ah, I didn't realize that TAQOZ is using RCFAST. Makes sense... It can't possibly know the crystal setting.
    David
  • Thank you! I missed the "-1" on those two completely. Makes sense of course. I don't understand the PPPP calculation though... It doesn't seem to match up at all with what I read in the docs. I'll review the docs again later when I'm not on my phone.
    David
  • Excellent! My scope is now reading 50.01ms (when set to CLOCK_FREQ / 20) so I'm much happier. I'll play around with waitct1 tonight and see if that can be dialed in to 50.00 :smiley:
    David
  • It took me a long time to finally understand the "+15" instead of my "-1" for PPPP, but I finally get it. What I still don't get is why my code works perfectly for 16/144/1 but no other combination.

    Here's what I changed: https://github.com/DavidZemon/HelloP2GCC/commit/0ce4d36f1eab78a7cdd44abbdfe364b2af97086c
    David
  • jmg Posts: 12,892
    DavidZemon wrote: »
    Excellent! My scope is now reading 50.01ms (when set to CLOCK_FREQ / 20) so I'm much happier. I'll play around with waitct1 tonight and see if that can be dialed in to 50.00 :smiley:

    Depends on just how much accuracy you are chasing... in the middle setting, the initial xtal error should be under 10 ppm, as below.

    FYI, here is the pull range of the CAP changes, 20 MHz xtal only, no > 100 MHz heating (those numbers move about -15 ppm as the xtal warms with a faster P2):
    %CC     XI status   XO status       XI/XO Z   XI/XO caps        Measures      ppm change  PLL                                   
    %00     ignored     float           Hi-Z      OFF
    %01     input       600-ohm drive   1M-ohm    OFF            -> 1000.1439Hz   +144   ppm      
    %10     input       600-ohm drive   1M-ohm    15pF per pin   ->  999.9933Hz   -6.700 ppm  -> -20ppm as it heats up
    %11     input       600-ohm drive   1M-ohm    30pF per pin   ->  999.9472Hz   -53 ppm       
    

    I'm not sure how safe it is to change cap settings on the fly, be interesting to see if it is tolerated ?
    Given the P2 fails to start if not enough time is allowed in RC-> XTAL, (no PLL) I'd say it is sensitive to noise on the Xtal pins, which would make runtime CL changes risky.

    For higher precision levels, you will need a TCXO or VCTCXO - GPS ones are cheap.
  • I don't actually care about the accuracy of the crystal or the PLL or anything - I assumed it was plenty accurate enough to correctly read "50.00ms" for T/2 of a wave and therefore assumed that the 0.01 ms difference was due to use of waitx instead of waitct1. Fixing my code to be as accurate as possible is of interest.
    David
  • jmg Posts: 12,892
    DavidZemon wrote: »
    I don't actually care about the accuracy of the crystal or the PLL or anything - i assumed it was plenty accurate enough to correctly read "50.00ms" for T/2 of a wave and therefore assumed that the 0.01 ms difference was due to use of waitx instead of waitct1. Fixing my code to be as accurate as possible is of interest.

    Makes sense.
    The smart pins have a mode of 'for X whole periods of a pin, measure SysCLKS' that it would be nice to see some working code for.
    If you set X there to 10, you could capture sysclks per period to 0.1 sysclks average precision. ie confirm any SW loop timing, right down to 1 tick.


  • SaucySoliton Posts: 184
    edited 2019-01-11 - 02:28:20
    DavidZemon wrote: »
    It took me a long time to finally understand the "+15" instead of my "-1" for PPPP, but I finally get it. What I still don't get is why my code works perfectly for 16/144/1 but no other combination.

    Here's what I changed: https://github.com/DavidZemon/HelloP2GCC/commit/0ce4d36f1eab78a7cdd44abbdfe364b2af97086c

    Do you mean the old code, or the new code?

    I have blinky.c running and I can't duplicate the problem of needing 16/144/1. 2/16/2 and 2/32/4 were fine. The main issue I saw was that printf assumes an 80MHz clock. If it deviates, the data is not received properly. The printf before clock setting doesn't work for this reason.

    The initial 1 second delay operates at RCFAST, but uses timing for the PLL frequency. So it could be 4-10 seconds. That long delay made me think it wasn't working.

    A possible hunch on the 16/144/1: 20e6 * 144 = 2.88e9 > 2^31. It would overflow a signed int32. Although I think it would make that case fail, instead of all others.
    James https://github.com/SaucySoliton/

  • I may have typo'd something in my earlier post, or misremembered, or I don't know what. But it does work for 16/144/1. It does not work for 16/288/2, which makes sense now that you point out it is overflowing my 32-bit (unsigned) variable. I tried expanding to 64 bits, but loadp2 hangs (I think I might have seen other people saying the same thing over in the p2gcc forum, regarding "large" binaries?). I won't concern myself with that for the moment... I'm happy that my code seems to be functional now, with the exception of the overflow. And thank you again for pointing out that the serial routine is expecting 80 MHz... makes sense, doesn't it, since it's all based on P1? Anyway, I'm comfortable now that the logic is correct in my latest version of the code and will stick with simply 1/4/1 so that I have working serial output (until I get the smart pins working, that is).

    Now I can move forward with other hardware exploration :smile:
    David
  • jmg Posts: 12,892
    DavidZemon wrote: »
    I may have typo'd something in my earlier post, or mis-remembered, or idk what. But it does work for 16/144/1. It does not work for 16/288/2...

    That may be because the VCO is getting marginal?
    /16 * 288 is asking for a 360 MHz VCO, which I think is more into a grey zone that may need active cooling.
    /2 * 32 is asking for 320 MHz, just a little bit easier?
  • David,
    I know this won't directly apply for you yet, but here's my current smartpin comport config preset:
    'asynconfig	long    (clock_freq * 64 / baud_rate * 1024) | 7          'bitrate format is 16.6<<10, 8N1 framing
    		                                                          'clock_freq max of 31 MHz
    'asynconfig	long    (clock_freq * 8 / (baud_rate / 8) * 1024) | 7     'bitrate format is 16.6<<10, 8N1 framing
    		                                                          'clock_freq max of 255 MHz
    asynconfig	long    (clock_freq * 4 / (baud_rate / 16) * 1024) | 7    'bitrate format is 16.6<<10, 8N1 framing
    		                                                          'clock_freq max of 511 MHz
    
    Note I've had to accommodate 32-bit signed integer maths of the assembler. Only just now readjusted it for >255 MHz sysclock rates. The baud_rate / 16 will impact precision of slower baud rates that don't cleanly divide by 16.

    BTW: I just used P2ES sysclock config of 16/288/2 (180 MHz) with diag reporting on this very smartpin preset without issue.
    "There's no huge amount of massive material
    hidden in the rings that we can't see,
    the rings are almost pure ice."
  • evanh Posts: 6,085
    edited 2019-01-11 - 07:18:43
    Regarding the precision limit of very low baud rates. With high sysclocks, smartpins won't be able to go that low anyway because of the max divider of 65535. It would require either lower sysclock rates and therefore the first code line above can be used, or just bit-bash the low data rate.

  • evanh Posts: 6,085
    edited 2019-01-11 - 11:24:12
    Here's a way to maintain full dynamic range on the calculation, albeit using runtime cordic.
    		setq    #clock_freq>>26           'upper 6 bits
    		qdiv    ##clock_freq<<6, ##baud_rate
    		getqx   asynconfig
    		shl     asynconfig, #10           'baudrate format is 16.6<<10
    		or      asynconfig, #7            '8N1 framing
    
  • I should have been more clear. The 16/288/2 config "wasn't working" because of my compute_clock() function. The local variables in that function were overflowing and producing a bad output. I then used that output in my waitx() function, which therefore gave me a "strange" frequency. I suspect, if I had the equipment to measure the clock directly, it would have been running just fine at 180 MHz.
    David
  • This definitely helps - thank you. I'll come back to it after I replace my bit-banged blinking LED with a smartpin.
    David