USB Testing

Comments

  • evanh Posts: 6,880
    edited 2019-02-01 - 10:50:25
    Garry,
    I recommend testing with the new v33i FPGA files with the corrected SRAM dual-porting of lutRAM.

    EDIT: Added v33i.
    "There's no huge amount of massive material
    hidden in the rings that we can't see,
    the rings are almost pure ice."
  • evanh wrote: »
    Garry,
    I recommend testing with the new v33i FPGA files with the corrected SRAM dual-porting of lutRAM.

    EDIT: Added v33i.

    Yes, and look at the usb_turnaround.spin2 file for guidance on USB mode settings.

    If you are using LUT sharing, you definitely need to try v33i.
  • cgracey wrote: »
    evanh wrote: »
    Garry,
    I recommend testing with the new v33i FPGA files with the corrected SRAM dual-porting of lutRAM.

    EDIT: Added v33i.

    Yes, and look at the usb_turnaround.spin2 file for guidance on USB mode settings.

    If you are using LUT sharing, you definitely need to try v33i.

    Yeah, that's the place to start. The FPGA and silicon code are in separate branches, with the latter getting changes that eliminated the "shortcuts" I had taken due to 80MHz constraints. I've gotten behind in keeping the FPGA branch updated, so I guess I'd best get to it.

    Stupid question time regarding the LUT sharing issue: can the bug ramifications have an effect outside of LUT read/write? If my USB IRP request tokens passed via LUT share were getting mangled, things would be breaking in an entirely different way.
    garryj
  • garryj wrote: »
    cgracey wrote: »
    evanh wrote: »
    Garry,
    I recommend testing with the new v33i FPGA files with the corrected SRAM dual-porting of lutRAM.

    EDIT: Added v33i.

    Yes, and look at the usb_turnaround.spin2 file for guidance on USB mode settings.

    If you are using LUT sharing, you definitely need to try v33i.

    Yeah, that's the place to start. The FPGA and silicon code are in separate branches, with the latter getting changes that eliminated the "shortcuts" I had taken due to 80MHz constraints. I've gotten behind in keeping the FPGA branch updated, so I guess I'd best get to it.

    Stupid question time regarding the LUT sharing issue: can the bug ramifications have an effect outside of LUT read/write? If my USB IRP request tokens passed via LUT share were getting mangled, things would be breaking in an entirely different way.

    The LUT bug only occurred when you read a location that was simultaneously being written.

    If you are not suffering from the LUT bug, to begin with, then keep developing on the existing silicon.

    It would be good to give v33i a quick check, though, to make sure that I didn't break anything in rearranging the USB modes, in order to accommodate the new ADC and scope modes.
  • jmg wrote: »
    garryj wrote: »
    .... There is something very mysterious happening during the rx->tx transition where the device returns the command status and the host sends another data block. If a bad CRC happens, it's always here.
    What is the approximate failure rate you see, and does it change if you warm up the P2?
    What is the overall Bad/Good USB packet ratio, for that ~33.5GB non-stop ?
    The bad CRC issue rate rises as the sysclock frequency is increased. I don't think it's due to heat, as it starts showing up around 160MHz.
    jmg wrote: »
    garryj wrote: »
    .... When a CRC error did happen, it was always in the command block wrapper for a new 32KB of data (see attached image).

    It doesn't happen at every new block -- there could be runs of many consecutive blocks without a single error. Applying the same "fix" used with reads that added more wait cycles between the tx byte routine's AKPIN #DP and WYPIN #DM got the CRC errors to disappear.

    Could that be related to when bit-stuff action is taken ? Your highlighted 0x9d & 0x49 differ by a single inserted bit ?
    I hadn't given it being a bit stuff issue much thought, as the write test includes a sector of all $FF bytes, so if bit stuffing was an issue, I thought it would likely show up there, too?
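
For reference, the bit-stuffing rule being discussed can be sketched in a few lines of Python. This is only an illustration, not code from the USB driver: `bit_stuff` and `byte_to_bits` are hypothetical helper names, and the sketch looks at the payload bits alone, ignoring SYNC, PID and CRC. It shows why a sector of all $FF bytes is the heaviest stuffing case.

```python
def bit_stuff(bits):
    """Insert a 0 after every run of six consecutive 1 bits (USB bit stuffing)."""
    stuffed = []
    ones_run = 0
    for bit in bits:
        stuffed.append(bit)
        if bit == 1:
            ones_run += 1
            if ones_run == 6:
                stuffed.append(0)  # stuffed bit forces a transition on the wire
                ones_run = 0
        else:
            ones_run = 0
    return stuffed


def byte_to_bits(value):
    """USB sends each byte least-significant bit first."""
    return [(value >> i) & 1 for i in range(8)]


# One 512-byte sector of $FF, as in the write test mentioned above.
sector = [bit for _ in range(512) for bit in byte_to_bits(0xFF)]
print(len(bit_stuff(sector)) - len(sector), "stuffed bits")
```

Under these assumptions the all-$FF sector picks up one stuffed bit for every six payload bits, so if stuffing alone were at fault it should show up there far more often than in a short command block.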
    garryj
  • jmg Posts: 13,351
    garryj wrote: »
    Stupid question time regarding the LUT sharing issue: can the bug ramifications have an effect outside of LUT read/write? If my USB IRP request tokens passed via LUT share were getting mangled, things would be breaking in an entirely different way.
    cgracey wrote: »
    The LUT bug only occurred when you read a location that was simultaneously being written.

    If you suspect LUT issues, and you have the cycles to spare, you can "read until two reads are the same", which is a software workaround for a read and a write landing on the same sysclk.
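
The "read until two the same" idea can be modelled on a PC as below. This is a minimal Python sketch of the logic only, not PASM2 for the P2: `read_until_stable` and the `read_once` callback are hypothetical names, and it assumes writes are spaced far enough apart that two back-to-back reads cannot both collide with a write.

```python
def read_until_stable(read_once, max_reads=8):
    """Re-read a shared location until two consecutive reads agree.

    read_once is any zero-argument callable returning the current value.
    A single corrupted read is discarded because the next read differs.
    """
    previous = read_once()
    for _ in range(max_reads - 1):
        current = read_once()
        if current == previous:
            return current
        previous = current
    raise RuntimeError("value did not settle within max_reads reads")


# Simulated reader: the first read coincides with a write and returns a mix.
samples = iter([0x5A, 0x0F, 0x0F])
print(hex(read_until_stable(lambda: next(samples))))
```

The cost is at least one extra read per poll, which is why it only makes sense when there are cycles to spare.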

  • evanh Posts: 6,880
    edited 2019-02-01 - 19:40:05
    garryj wrote: »
    Stupid question time regarding the LUT sharing issue: can the bug ramifications have an effect outside of LUT read/write? If my USB IRP request tokens passed via LUT share were getting mangled, things would be breaking in an entirely different way.

    Yes, those H_EVENT/D_EVENT states will be momentarily scrambled if both read and write coincide. WRLUT writes correct data (I wasn't sure of this detail at the time) but RDLUT doesn't see it correctly if they coincide on the same clock.

    One way to test for the issue is to renumber them so that H_READY/D_READY become non-zero. I believe this will increase the chance of mangling on both state change edges.
    "There's no huge amount of massive material
    hidden in the rings that we can't see,
    the rings are almost pure ice."
  • I found that the LUT bug didn't break my event-counter based buffer test - if you are waiting for a location to increment to a particular value you seem to be safe as the glitch is a mix of old and new bits and will never erroneously match a higher value than has been written during an incrementing sequence.
  • jmg Posts: 13,351
    evanh wrote: »
    garryj wrote: »
    Stupid question time regarding the LUT sharing issue: can the bug ramifications have an effect outside of LUT read/write? If my USB IRP request tokens passed via LUT share were getting mangled, things would be breaking in an entirely different way.

    Yes, those H_EVENT/D_EVENT states will be momentarily scrambled if both read and write coincide. WRLUT writes correct data (I wasn't sure of this detail at the time) but RDLUT doesn't see it correctly if they coincide on the same clock.

    One way to test for the issue is to renumber them so that H_READY/D_READY become non-zero. I believe this will increase the chance of mangling on both state change edges.

    Good idea. If you pass tokens one-hot encoded, you can detect and flag any aperture effects, and if you know the order in which they should occur, the code might even be able to proceed, since it can work out which bit is the new one.
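
A rough Python model of the one-hot suggestion follows. It assumes, as described earlier in the thread, that a colliding read returns some mix of old and new bits. The names `is_one_hot`, `possible_reads` and `decode_token` are made up for this sketch and are not part of the USB code.

```python
def is_one_hot(value):
    """True when exactly one bit is set."""
    return value != 0 and value & (value - 1) == 0


def possible_reads(old, new, width=8):
    """Every value a glitched read could return if each bit comes from old or new."""
    reads = set()
    for select in range(1 << width):
        reads.add((old & ~select) | (new & select))
    return reads


def decode_token(raw):
    """Return the token index for a clean one-hot value, or None for a glitch."""
    return raw.bit_length() - 1 if is_one_hot(raw) else None


old_token, new_token = 0b0001, 0b0010
for raw in sorted(possible_reads(old_token, new_token, width=4)):
    print(f"{raw:04b} -> {decode_token(raw)}")
```

With one-hot tokens every mixed read is either the old token, the new token, or an obviously invalid pattern (no bits or two bits set), so a glitch can be flagged and the read retried. With plain binary numbering, a mix of 0b011 and 0b100 could land on any value from 0 to 7 under the same assumption.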
  • Mark_T wrote: »
    I found that the LUT bug didn't break my event-counter based buffer test - if you are waiting for a location to increment to a particular value you seem to be safe as the glitch is a mix of old and new bits and will never erroneously match a higher value than has been written during an incrementing sequence.

    In your case all you needed was a trigger to say a change has occurred, I think. Corrupted trigger content almost didn't matter, just as long as it changed.
    "There's no huge amount of massive material
    hidden in the rings that we can't see,
    the rings are almost pure ice."
  • garryj wrote: »
    Edit: BTW, I received my ES accessory set today and the serial host add-on works like a charm! But maybe in a future revision, both host and device boards could have test points for D-/D+?

    If you can get some stackable headers, you can use these to extend the pins through the plug-in ES boards, so you can still scope the P2 pins.

    They sell 6-pin headers for Arduino shields, two in a pack, such as
    https://www.freetronics.com.au/products/stackable-arduino-shield-headers

  • Got the silicon branch merged into the USB mass storage FPGA-based code. I started with the FPGA v32i/j build first to set a baseline.
    From the analyzer view, there appeared to be no functional differences between v32i and v33i.

    My three "flaky" v3.x usb parts were still flaky on both builds.
    The rest of the devices all enumerated and the FAT volume access tests all passed, so it's looking good regarding the USB smart pin mode changes :smiley:
    On v32i I hadn't been experiencing any symptoms of the LUT sharing same-clock read/write issue, and it was status quo on v33i. I guess that's good :wink:

    The silicon at 80MHz behaved the same, with one exception. I found that setting bit 16 to enable clocking when configuring the USB smart pins does have a positive effect, as one of my "flaky" devices that refused to enumerate on the FPGA did so nicely, and without error, with clocking enabled. As I expected, when I cranked silicon sysclock up to 180MHz, the OUT CRC fail remains an issue. But maybe, since the clock bit has no effect on the FPGA, what Chip described in this post could still be a factor?
    https://forums.parallax.com/discussion/comment/1462620/#Comment_1462620

    Note that this particular CRC fail that can be reproduced on silicon has never popped up on the FPGA, but we're stuck at 80MHz there...
    garryj
  • Bugger. I thought that would resolve something. Oh well, I guess it's good that the lutRAM sharing flaw wasn't the cause of any headaches for you.
    "There's no huge amount of massive material
    hidden in the rings that we can't see,
    the rings are almost pure ice."
  • @garryj
    Tubular and I ran some USB tests today with your code "USBBootMouseKbdLite.spin" on a P2_ES.
    Using the accessory USB board 7 out of 8 keyboards worked Ok.
    3 mice (1 cordless) all worked Ok.

    Melbourne, Australia
  • Just ran "USBMS.spin2" and had success!
    <Full-Speed device connected.>
    
    Vendor ID: SanDisk
    Product ID: Cruzer Facet
    Version level: 1.00
    Media is removable
    SCSI version is ANSI X3.131:1994 (SCSI-2) or higher
    Highest LBA: 15630335
    Sector size: 512
    Checking media for a FAT file system...
    
    Partition type: 0x0B
    Cluster size: 16384
    Volume base sector: 32
    Reserved sector count: 18
    FSInfo base sector (in reserved): 33
    FAT region base sector: 50
    Sector count of one FAT: 3815
    FAT region sector count: 7630
    RootDir base sector: 7680
    RootDir cluster#: 2
    Dir/file/data base sector: 7680
    Count of data region clusters: 488208
    Count of free clusters: 861
    FSInfo next free cluster: 64677
    Count of data region sectors: 15622656
    Count of volume sectors: 15630304
    FAT32 volume mounted.
    
    A:\>
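
The derived figures in this log can be cross-checked with a little arithmetic. The Python sketch below is not part of USBMS.spin2; `fat32_layout` is a hypothetical helper, and it assumes two FAT copies (which matches "FAT region sector count" being twice "Sector count of one FAT") and that the root directory starts at the first data cluster.

```python
def fat32_layout(volume_base, reserved, sectors_per_fat, volume_sectors,
                 cluster_bytes, sector_bytes=512, fat_count=2):
    """Recompute the derived FAT32 layout values from the raw boot-sector fields."""
    fat_base = volume_base + reserved          # first sector of the FAT region
    fat_region = fat_count * sectors_per_fat   # all FAT copies together
    data_base = fat_base + fat_region          # first sector of the data region
    data_sectors = volume_sectors - reserved - fat_region
    clusters = data_sectors // (cluster_bytes // sector_bytes)
    return {
        "fat_base": fat_base,
        "fat_region": fat_region,
        "data_base": data_base,
        "data_sectors": data_sectors,
        "clusters": clusters,
    }


# Raw values taken from the Cruzer Facet log above.
layout = fat32_layout(volume_base=32, reserved=18, sectors_per_fat=3815,
                      volume_sectors=15630304, cluster_bytes=16384)
for name, value in layout.items():
    print(f"{name}: {value}")
```

The results agree with the logged FAT region base (50), FAT region sector count (7630), RootDir base sector (7680), data region sectors (15622656) and cluster count (488208).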
    
    Melbourne, Australia
  • and here's 3 more. All Ok :)
    <Full-Speed device connected.>
    
    Vendor ID: SanDisk
    Product ID: Cruzer Facet
    Version level: 1.00
    Media is removable
    SCSI version is ANSI X3.131:1994 (SCSI-2) or higher
    Highest LBA: 123174911
    Sector size: 512
    Checking media for a FAT file system...
    
    Partition type: 0x0C
    Cluster size: 32768
    Volume base sector: 32
    Reserved sector count: 14
    FSInfo base sector (in reserved): 33
    FAT region base sector: 46
    Sector count of one FAT: 15033
    FAT region sector count: 30066
    RootDir base sector: 30112
    RootDir cluster#: 2
    Dir/file/data base sector: 30112
    Count of data region clusters: 1924137
    Count of free clusters: 869954
    FSInfo next free cluster: 32287
    Count of data region sectors: 123144800
    Count of volume sectors: 123174880
    FAT32 volume mounted.
    
    A:\>
    <Device disconnected>.
    
    #:\>
    <Full-Speed device connected.>
    
    Vendor ID: SanDisk
    Product ID: Cruzer Switch
    Version level: 1.27
    Media is removable
    SCSI version is ANSI X3.131:1994 (SCSI-2) or higher
    Highest LBA: 15431337
    Sector size: 512
    Checking media for a FAT file system...
    
    Partition type: 0x0B
    Cluster size: 16384
    Volume base sector: 32
    Reserved sector count: 20
    FSInfo base sector (in reserved): 33
    FAT region base sector: 52
    Sector count of one FAT: 3766
    FAT region sector count: 7532
    RootDir base sector: 7584
    RootDir cluster#: 2
    Dir/file/data base sector: 7584
    Count of data region clusters: 481992
    Count of free clusters: 401636
    FSInfo next free cluster: 15
    Count of data region sectors: 15423754
    Count of volume sectors: 15431306
    FAT32 volume mounted.
    
    A:\>
    <Device disconnected>.
    
    #:\>
    <Full-Speed device connected.>
    
    Vendor ID: Verbatim
    Product ID: STORE N GO
    Version level: PMAP
    Media is removable
    Device does not claim conformance to any SPC standard
    No data...
    Bulk-IN endpoint STALL...
    SCSI command error: Failed
    ASC: 0x28, ASCQ: 0x00
    Unit requires attention: not ready to ready change, or medium may have changed
    Command retry...
    Highest LBA: 15653375
    Sector size: 512
    Checking media for a FAT file system...
    
    Partition type: 0x0B
    Cluster size: 4096
    Volume base sector: 8064
    Reserved sector count: 2274
    FSInfo base sector (in reserved): 8065
    FAT region base sector: 10338
    Sector count of one FAT: 15247
    FAT region sector count: 30494
    RootDir base sector: 40832
    RootDir cluster#: 2
    Dir/file/data base sector: 40832
    Count of data region clusters: 1951568
    Count of free clusters: 1443573
    FSInfo next free cluster: 16534
    Count of data region sectors: 15612544
    Count of volume sectors: 15645312
    FAT32 volume mounted.
    
    A:\>
    <Device disconnected>.
    
    #:\>
    
    Melbourne, Australia
  • ooh nice work
  • Thanks so much for taking the time to do this. It's nice to see that the success:fail ratio is tilted to the good side :smiley:
    garryj
  • Yes, it is!

    Are there any lingering mysteries about USB viability? It seems there is maybe a need to insert delays at some junctures at high speeds?
  • Where can I get the test files?
  • cgracey wrote: »
    Are there any lingering mysteries about USB viability? It seems there is maybe a need to insert delays at some junctures at high speeds?
    The inter-packet and turn-around delays are specified in bit times, so the higher timing resolution due to the silicon's speed has made any packet separation issues that could/would pop up at 80MHz disappear. I'm pretty sure that having the extra cycles available in silicon was why my "flaky" 3.x parts began working reliably due to more accurate IP and TAT timings.
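
To put numbers on the timing-resolution point, here is a small Python sketch. It relies only on the full-speed bit rate of 12 Mbit/s and does not model the smart pin hardware. The helper names are invented for the example.

```python
FULL_SPEED_BIT_RATE = 12_000_000  # USB full speed, bits per second


def sysclks_per_bit(sysclk_hz):
    """How many system clocks fit into one full-speed bit time."""
    return sysclk_hz / FULL_SPEED_BIT_RATE


def bit_times_to_sysclks(bit_times, sysclk_hz):
    """Convert a delay given in bit times into system clocks."""
    return bit_times * sysclks_per_bit(sysclk_hz)


for mhz in (80, 160, 180):
    clk = mhz * 1_000_000
    print(f"{mhz} MHz: {sysclks_per_bit(clk):.2f} sysclks per bit, "
          f"2 bit times = {bit_times_to_sysclks(2, clk):.1f} sysclks")
```

At 80MHz a bit time is fewer than seven clocks, so a delay loop can only hit a bit-time target coarsely. At 180MHz there are 15 clocks per bit, which gives more than twice the resolution.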

    I've only been working in the host USB mode, so the internal pull-up resistors for low/full speed device mode have not been directly tested, yet.

    The one big mystery left is the "why" behind the "bad CRC on OUT". A CRC error on transmit was something I never saw during FPGA development. So far, the only thing I know for sure is that it popped up when sysclock was increased. The "fix" of increasing the delay between the AKPIN and WYPIN at byte tx is something that's outside of the realm of "regular" USB bus timings, as the packet spacing when this glitch triggers is within acceptable limits, and it's happening in a place I can't get to via code (other than the "fix"). For media writes, if the CRC error is cleared using the USB transaction retry mechanism, the 512 OUT data packets that follow are transmitted without error via the same routine.

    What is the buffer size of the USB transmitter? I have been assuming that it is at least two bytes? TX is unique in that it's the only USB smartpin action where the upper pin is polled and acknowledged and the tx byte is written to the lower pin. The analyzer shows that bit(s) are definitely getting affected and the device does what it's supposed to do (ignore the corrupted packet). But I'm out of ideas regarding what I could possibly do in code other than the delay bandage?
    garryj
  • jmg Posts: 13,351
    edited 2019-02-06 - 19:56:56
    garryj wrote: »
    What is the buffer size of the USB transmitter? I have been assuming that it is at least two bytes?
    Chip will elaborate, but I don't think the Smart Pins have a FIFO, so you cannot write 2 bytes very closely together.
    It can do continual sends, which means a shifter, and a queue buffer.
    The queue buffer will load into the shifter, usually on a next-baud-clock, so you have almost a whole char time to next load the queue buffer, before underflow occurs.
    You probably could load 2 bytes, spaced at least one baud-clock time apart, if you know the shifter was empty at the start.
    That does suggest a mechanism where faster SysCLKs could spawn failures, as the USB baud-clock is not changing.
  • The CRC error always involves data byte #4, 5, 6 or 7. So the bus sequence would be:
    SOP + OUT packet (3 bytes)
    Inter-packet delay in bit periods (spec is 2.0 to 7.5)
    SOP + DATAx PID + data bytes (zero to 64) + packet CRC (2 bytes)

    Maybe something could be happening at the NCO roll-over point? But if that were the case, you'd think CRC errors would pop up in other OUTs of the transfer, and not just in the first OUT of the transfer. That's the part that's driving me nuts :zombie:
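
For anyone checking analyzer captures by hand, the 2-byte packet CRC in the sequence above is the standard USB CRC16 (polynomial 0x8005, processed LSB first, initial value 0xFFFF, result inverted). The Python below is a reference sketch for use on a PC, not the driver's implementation. The payload bytes are arbitrary example data, not values from the capture.

```python
def usb_crc16(data):
    """CRC16 as used on USB data packets (reflected 0x8005, init and final XOR 0xFFFF)."""
    crc = 0xFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 1:
                crc = (crc >> 1) ^ 0xA001  # 0xA001 is 0x8005 bit-reversed
            else:
                crc >>= 1
    return crc ^ 0xFFFF


payload = bytes([0x55, 0x53, 0x42, 0x43, 0x01, 0x00, 0x00, 0x00])  # example bytes only
good = usb_crc16(payload)

# Flip a single bit in byte #4 and the CRC no longer matches.
corrupted = bytearray(payload)
corrupted[4] ^= 0x08
print(hex(good), hex(usb_crc16(bytes(corrupted))))
```

Any single-bit change in the payload gives a different CRC, which is why the device drops the packet and the host has to fall back on the transaction retry.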
    garryj
  • jmg Posts: 13,351
    edited 2019-02-06 - 22:14:24
    garryj wrote: »
    ..
    Maybe something could be happening at the NCO roll-over point? But if that were the case, you'd think CRC errors would pop up in other OUTS of the transfer, and not just in the first OUT of the transfer. That's the part that's driving me nuts :zombie:

    Using Chip's NCO formula gives these USB sweet spots (NCO error free):

    96M/(2^16/round(12M/96M* 0x10000)) = 12000000
    128M/(2^16/round(12M/128M* 0x10000)) = 12000000
    192M/(2^16/round(12M/192M* 0x10000)) = 12000000

    96 & 192 are clean binary so they have no jitter and no errors.
    They could be useful SysCLK speed check points, as the BUS timings should be identical for both, but the SW path delays will be 2:1 ?


    128M is MHz exact, but achieves that by NCO /11/11/10/11/11/10 repeating, so has jitter.

    180M is 9*20, for low VCO jitter, and has a 12MHz error of 15.25ppm, because one in every 4369 cycles it divides by 16, adding 5.55ns (NCO artifact).
    170M is 17*10M, and has a 12MHz error of 15.26ppm, but here it is /14 ~5/6 of the time, and /15 ~1/6.
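
The figures above can be reproduced with a short Python sketch. It follows the formula as quoted in this post (a 16-bit NCO word, rounded to the nearest integer) and is a model for exploration only, not a description of the smart pin internals. The function names are invented here.

```python
USB_HZ = 12_000_000
NCO_MODULUS = 1 << 16  # 16-bit NCO, per the formula quoted above


def nco_word(sysclk_hz):
    """NCO increment that best approximates 12 MHz from the given sysclk."""
    return round(USB_HZ * NCO_MODULUS / sysclk_hz)


def nco_error_ppm(sysclk_hz):
    """Frequency error of the generated bit clock, in parts per million."""
    actual_hz = sysclk_hz * nco_word(sysclk_hz) / NCO_MODULUS
    return (actual_hz - USB_HZ) / USB_HZ * 1e6


def divider_pattern(sysclk_hz, count):
    """Sysclks between successive NCO overflows, which shows the bit-clock jitter."""
    word, acc, ticks, pattern = nco_word(sysclk_hz), 0, 0, []
    while len(pattern) < count:
        acc += word
        ticks += 1
        if acc >= NCO_MODULUS:
            acc -= NCO_MODULUS
            pattern.append(ticks)
            ticks = 0
    return pattern


for mhz in (96, 128, 170, 180, 192):
    clk = mhz * 1_000_000
    print(f"{mhz} MHz: word={nco_word(clk)}, error={nco_error_ppm(clk):+.2f} ppm, "
          f"dividers={divider_pattern(clk, 6)}")
```

In this model 96M and 192M divide by a constant 8 and 16, 128M repeats /11/11/10 with no long-term error, and 170M and 180M both come out roughly 15 ppm low.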

  • ozpropdev Posts: 2,473
    edited 2019-02-06 - 22:31:24
    BTW @garryj
    I made a few changes to your code to get things going on the P2_ES USB accessory board.

    I added a basepin constant for the desired IO group base pin.
    then a few changes to use the constant
            DM           = basepin+2                       ' DM is "The Brain"
            DP           = basepin+3                       ' DP is passive
    ..and
            HOST_ACTIVE_LED   = basepin ' Blinks while in the host's main processing loop
            DRIVER_ACTIVE_LED = basepin+4 ' Pulses during mouse activity
    
    and added 1 instruction to enable the USB.
    	drvh	#basepin+1	'enable usb
    
    
    Melbourne, Australia
  • Will incorporate. Thanks!
    garryj
  • Does this CRC error also occur on the P2D2 with 12MHz oscillator?
    Formerly known as TonyB
  • garryj Posts: 249
    edited 2019-02-07 - 01:50:38
    TonyB_ wrote: »
    Does this CRC error also occur on the P2D2 with 12MHz oscillator?
    Good question, but I don't have a P2D2 board to test with. If anyone can give it a try, this post explains how you should be able to make it break:
    https://forums.parallax.com/discussion/comment/1462709/#Comment_1462709
    Edit: and a 16GB FAT32 volume should be big enough to trigger it.
    garryj
  • FYI, for those wanting an easier time probing with scope leads, etc - I found the rear shell of the connector housing on the Accessory Set's Serial Host board pops off easily - then, at least the top port's leads can be latched onto.

    Cheers
    --
    WIP Spin drivers for various devices: LSM9DS1 IMU (SPI) | Newhaven 4x20 OLED (I2C) | MLX90621 (I2C) | SHT3x (I2C) | SSD1306 OLED (I2C; P1-SPIN, P2-SPIN2) | TCS3x7x (I2C) | MAX31856 (SPI) | BMP280 (I2C) | TMC2130 (SPI) | nRF24L01+ (SPI) | MLX90614 (I2C) | MAX9744 (I2C) | DS28CM00 (I2C) | TSL2591 (I2C)