Working full-speed (12 Mb/s) bit-banging USB Host controller - Page 2 — Parallax Forums

Working full-speed (12 Mb/s) bit-banging USB Host controller

Comments

  • TonyWaite Posts: 219
    edited 2010-04-03 11:35
    Micah,

    I'm not really qualified to suggest this, but could the scheduler-kernel of Peter Van der Zee's PropRTOS be helpful?

    T o n y

    http://www.parallax.com/PropRTOS/tabid/852/Default.aspx
  • scanlime Posts: 106
    edited 2010-04-04 10:20
    TonyWaite said...
    Micah,

    I'm not really qualified to suggest this, but could the scheduler-kernel of Peter Van der Zee's PropRTOS be helpful?

    T o n y

    http://www.parallax.com/PropRTOS/tabid/852/Default.aspx

    Hmm.. something like this might help make use of idle time on other cogs to accelerate the encode/decode process.. but it wouldn't help reduce the cog count below three.

    The peak of 3 cogs is currently needed for receiving a packet. While receiving, there are a number of tasks that have to take place:
    • Receiving 1 bit every 8 clock cycles, and storing it somewhere. Should be able to store approximately 1 kilobyte total.
    • Noticing the end-of-packet, so we know when to stop receiving
    • Sending an ACK packet if we need to. This must happen with bounded latency after the end-of-packet signal.
    • Watchdog for the receiver cog(s), so we can wake them up if a packet never arrives.
    Currently these tasks are divided between three cogs, which I'm calling TX, RX1, and RX2:
    • TX: Receiver watchdog
    • TX: Poll in a tight loop for the end-of-packet, and send an ACK using the video generator
    • RX1: Store the first 16 bits of each 32-bit word, in hub memory.
    • RX2: Store the second 16 bits of each 32-bit word, in hub memory.
    • RX1/2: Both also have a higher-latency end-of-packet detector, for terminating their own receive loops.
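A rough model of the RX1/RX2 split described in the list above, written in Python rather than Propeller assembly. The function names and the test packet are made up for illustration; the only thing taken from the post is that each cog stores one 16-bit half of every 32-bit word.

```python
# Hypothetical model: RX1 stores the first 16 bits of each 32-bit word,
# RX2 stores the second 16 bits. Bits are kept as a list in arrival order.
WORD_BITS = 32
HALF_BITS = 16

def split_between_cogs(bits):
    """Return (rx1_halves, rx2_halves) for a list of 0/1 bits."""
    rx1, rx2 = [], []
    for start in range(0, len(bits), WORD_BITS):
        word = bits[start:start + WORD_BITS]
        rx1.append(word[:HALF_BITS])   # first half of the word
        rx2.append(word[HALF_BITS:])   # second half (short or empty at the tail)
    return rx1, rx2

def merge_from_hub(rx1, rx2):
    """Recombine the halves in word order, as a reader of hub memory would."""
    out = []
    for first, second in zip(rx1, rx2):
        out.extend(first)
        out.extend(second)
    return out

# Arbitrary 80-bit test pattern (not real USB data).
packet = [(0xA5C3F00D9 >> (i % 36)) & 1 for i in range(80)]
rx1, rx2 = split_between_cogs(packet)
print(len(rx1), len(rx2[2]))
```

While one cog is busy writing its half to hub memory, the other is sampling, which is why neither cog alone has to keep up with every bit.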
    The only way I can think of to optimize this further is to combine the RX1 and RX2 cogs into one- but that requires receiving an entire packet using one fully unrolled loop. So you're limited to receiving pretty small packets. Even if you assume that the entirety of cog memory is used only for the receive loop and receive buffer, you would only have enough memory to receive a 31 byte packet. And that isn't enough for most USB devices.

    So, as best I can tell, there's no way to do better than 3 cogs without either introducing external hardware or severely restricting the kinds of USB devices you can talk to.
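The 31-byte figure can be checked with a back-of-the-envelope budget. This sketch assumes 512 longs of cog RAM, two instruction longs per received bit (8 clocks per bit, 4 clocks per instruction), and a buffer that packs 32 bits per long. Those assumptions are mine, not taken from the driver source.

```python
COG_LONGS = 512            # assumed: all of cog RAM, special registers ignored
LONGS_PER_BIT = 2          # assumed: two unrolled instructions per received bit
BITS_PER_BUFFER_LONG = 32  # assumed: received bits packed 32 to a long

def max_unrolled_bits(cog_longs=COG_LONGS):
    """Largest bit count whose unrolled code plus buffer fits in cog RAM."""
    bits = 0
    while True:
        n = bits + 1
        code_longs = n * LONGS_PER_BIT
        buffer_longs = -(-n // BITS_PER_BUFFER_LONG)  # ceiling division
        if code_longs + buffer_longs > cog_longs:
            return bits
        bits = n

bits = max_unrolled_bits()
print(bits, "bits =", bits // 8, "whole bytes")
```

Under these assumptions the budget comes out at 252 bits, or 31 whole bytes. Reserving the 16 special-purpose registers (496 usable longs) lowers it to 30.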

    --Micah
  • pjv Posts: 1,903
    edited 2010-04-04 18:20
    Hi Tony/Micah;

    Let me jump in to comment here as this kernel is something I created.

    The kernel is based on a selectable tick time, somewhere in the 2 to 5 (or even 10) uSec neighborhood. A faster tick gives higher performance and lower latencies but fewer threads; a slower tick gives the respective opposites. Depending on the number of threads, the task context switch takes just under half of a microsecond (@80 MHz), so it really is not directly suitable for switching events that need performance faster than that.
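To put those numbers next to the USB timing discussed earlier in the thread, here is a small arithmetic sketch. It only converts the quoted times into full-speed USB bit periods; the 500 ns figure is the "just under half of a microsecond" upper bound, not a measured value.

```python
USB_BIT_RATE = 12_000_000  # full-speed USB, bits per second

def usb_bits_in(nanoseconds):
    """Whole full-speed USB bit periods that elapse in the given time."""
    return nanoseconds * USB_BIT_RATE // 1_000_000_000

print(usb_bits_in(500))    # one context switch (upper bound)
print(usb_bits_in(2_000))  # fastest suggested tick
print(usb_bits_in(5_000))  # slower tick
```

A single context switch spans about 6 bit periods, which is consistent with the kernel being too coarse for per-bit work.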

    Cheers,

    Peter (pjv)
  • Cluso99 Posts: 18,069
    edited 2010-04-05 06:58
    micah: This is really neat. Get as much working as possible. We can all chime in later to get it faster and maybe reduce cogs, but we need to know what needs to be present first. This is one great little addition to the prop. I have long wanted to be able to use a USB memory stick and USB Bluetooth on the prop. Both these items are so cheap. I have a miniature USB that has a microSD socket for uSD cards, so it is an easy way to transfer files. Thanks.

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    Links to other interesting threads:

    · Home of the MultiBladeProps: TriBlade,·RamBlade,·SixBlade, website
    · Single Board Computer:·3 Propeller ICs·and a·TriBladeProp board (ZiCog Z80 Emulator)
    · Prop Tools under Development or Completed (Index)
    · Emulators: CPUs Z80 etc; Micros Altair etc;· Terminals·VT100 etc; (Index) ZiCog (Z80) , MoCog (6809)·
    · Prop OS: SphinxOS·, PropDos , PropCmd··· Search the Propeller forums·(uses advanced Google search)
    My cruising website is: ·www.bluemagic.biz·· MultiBlade Props: www.cluso.bluemagic.biz
  • scanlime Posts: 106
    edited 2010-04-06 06:34
    I just posted a new version from today's svn (20100405).

    There's a lot of cleanup and optimization in the host controller driver. Memory usage is 949 longs, and it's down to 3 cogs. The object is a singleton now, so you can declare it in the OBJ section of each USB device class driver, and they're all sharing the same instance of the host controller.

    The other big change is that this includes a basic USB storage class driver. The storage class driver should include enough functionality to be usable, though YMMV. It's seen very little testing so far, and I've seen certain data patterns trigger bugs in the host controller that will manifest as E_CRC errors. I had to change my demo to read sector 1 instead of sector 0 to work around this bug on my disk :)

    Here's some sample output from test-storage.spin:

    Identified as USB storage
    
    Sector Size: 512
    Number of Sectors: 003A9FFF (1875 MB)
    
    SCSI INQUIRY:
    0000: 00 80 00 00 29 00 00 00 47 65 6E 65 72 69 63 20  ....)...Generic
    0010: 53 54 4F 52 41 47 45 20 44 45 56 49 43 45 20 20  STORAGE DEVICE
    0020: 39 34 30 37 00 00 00 00 22 00 00 00 00 00 00 00  9407....".......
    
    Disk sector 00000001:
    0000: 45 46 49 20 50 41 52 54 00 00 01 00 5C 00 00 00  EFI PART....\...
    0010: BE 9A 4B C3 00 00 00 00 01 00 00 00 00 00 00 00  ..K.............
    0020: FF 9F 3A 00 00 00 00 00 22 00 00 00 00 00 00 00  ..:.....".......
    0030: DE 9F 3A 00 00 00 00 00 1A 35 56 98 83 92 28 44  ..:......5V...(D
    0040: BA FC 02 3C 4C 14 8B A5 02 00 00 00 00 00 00 00  ...<L...........
    0050: 80 00 00 00 80 00 00 00 D2 BC 61 D3 00 00 00 00  ..........a.....
    0060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    0070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    0080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    0090: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    00A0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    00B0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    00C0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    00D0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    00E0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    00F0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    0100: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    0110: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    0120: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    0130: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    0140: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    0150: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    0160: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    0170: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    0180: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    0190: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    01A0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    01B0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    01C0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    01D0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    01E0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    01F0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
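The sector dump above starts with the "EFI PART" signature, so it looks like a GPT header. As a sanity check on the received data, the first 0x60 bytes can be decoded with the standard GPT header layout. This is a Python sketch for a PC, not part of the Propeller driver, and the variable names are illustrative.

```python
import struct

SECTOR_SIZE = 512
LAST_LBA = 0x003A9FFF  # value printed as "Number of Sectors" in the output

# First 0x60 bytes of disk sector 1, copied from the dump.
header = bytes.fromhex(
    "45 46 49 20 50 41 52 54 00 00 01 00 5C 00 00 00 "
    "BE 9A 4B C3 00 00 00 00 01 00 00 00 00 00 00 00 "
    "FF 9F 3A 00 00 00 00 00 22 00 00 00 00 00 00 00 "
    "DE 9F 3A 00 00 00 00 00 1A 35 56 98 83 92 28 44 "
    "BA FC 02 3C 4C 14 8B A5 02 00 00 00 00 00 00 00 "
    "80 00 00 00 80 00 00 00 D2 BC 61 D3 00 00 00 00 "
)

# GPT header fields, little-endian, 92 bytes in total.
(signature, revision, header_size, header_crc, _reserved,
 current_lba, backup_lba, first_usable, last_usable,
 disk_guid, entries_lba, num_entries, entry_size,
 entries_crc) = struct.unpack_from("<8sIIIIQQQQ16sQIII", header)

capacity_mib = LAST_LBA * SECTOR_SIZE // 2**20

print(signature, header_size, current_lba, hex(backup_lba))
print(first_usable, hex(last_usable), num_entries, entry_size)
print(capacity_mib, "MiB")
```

The decoded fields are self-consistent: the header claims to live at LBA 1, which is the sector the demo read, and its backup LBA matches the value the driver printed.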
    
    



    I'll keep trying to debug and polish this as time permits, but I figured that what I had was a lot nicer than the original version already :)

    --Micah
  • Oldbitcollector (Jeff) Posts: 8,091
    edited 2010-04-06 14:30
    @Micah

    The implications of this are simply amazing! Three cogs makes this extremely enticing!
    I take it that a USB thumbdrive driver is only a FSRW modification away?

    OBC

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    Are you Propeller Powered? PropellerPowered.com
    Visit the: PROPELLERPOWERED SIG forum kindly hosted by Savage Circuits.
  • scanlime Posts: 106
    edited 2010-04-06 16:33
    Oldbitcollector said...
    @Micah

    The implications of this are simply amazing! Three cogs makes this extremely enticing!
    I take it that a USB thumbdrive driver is only a FSRW modification away?

    OBC

    I hope so :)

    I know how to fix the CRC problems I was hitting last night... I ran some more tests, and it looks like the failing packets were those which happened to have a string of "1" bits at the end of their CRC, and which happened to end with a zero on the D- pin. This confuses the pseudo-end-of-packet detector that I'm using now. But now that I have a real end-of-packet detector for sending ACKs, I can calculate the real packet length by taking timestamps at the beginning and end of the packet. Just need to write and debug that code, and hopefully it'll be a working block-level driver.
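The timestamp idea can be sketched as follows. This is a guess at the arithmetic only: it assumes a free-running 32-bit system counter, 8 clocks per bit at 96 MHz, and timestamps taken at the first and last bit edges. Where the real driver samples the counter, and what overhead it subtracts for SYNC or EOP, is not stated in the thread.

```python
CLOCKS_PER_BIT = 8  # 96 MHz system clock / 12 Mb/s

def packet_bits(start_cnt, end_cnt, clocks_per_bit=CLOCKS_PER_BIT):
    """Estimate the number of bit periods between two counter timestamps."""
    # The counter is assumed to be 32 bits wide and to wrap around.
    elapsed = (end_cnt - start_cnt) & 0xFFFFFFFF
    # Round to the nearest bit so a few clocks of polling jitter do not matter.
    return (elapsed + clocks_per_bit // 2) // clocks_per_bit

print(packet_bits(1000, 1000 + 8 * 64))
print(packet_bits(0xFFFFFFF0, 0x00000010))
```

Because the length comes from elapsed time rather than from the bit pattern, a run of "1" bits at the end of the CRC can no longer be mistaken for the end of the packet.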

    I expect the first working version to be pretty slow, though. The host controller's packet encode/decode steps haven't been optimized at all yet.

    --Micah
  • Microcontrolled Posts: 2,461
    edited 2010-04-06 18:49
    It's a shame I only have a 5 MHz xtal or I could try this. It is absolutely amazing. It also made Hackaday, BTW.

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    Check out my new website!!

    Use the Propeller icon!! Propeller.gif

    Follow me on Twitter! Search "Microcontrolled"
  • Oldbitcollector (Jeff) Posts: 8,091
    edited 2010-04-06 18:57
    Dang it @microcontrolled! If you had said this a day sooner, I would have included the required xtal in the box. :)

    OBC

  • Microcontrolled Posts: 2,461
    edited 2010-04-06 19:25
    6820computerheadbangd.gif

  • hover1 Posts: 1,929
    edited 2010-04-06 19:35
    Is the code locked to 96 MHz? If it will run at 100 MHz I'll send microcontrolled a 6.25 MHz crystal.

    Jim
  • Oldbitcollector (Jeff) Posts: 8,091
    edited 2010-04-06 19:39
    I still have your address... will find a moment to drop one in the mail for when you return.

    OBC

  • scanlime Posts: 106
    edited 2010-04-06 19:41
    hover1 said...
    Is the code locked to 96 MHz? If it will run at 100 MHz I'll send microcontrolled a 6.25 MHz crystal.

    Jim

    It's locked to 96 MHz. I need an integer number of (2 or greater) instructions per USB bit period. So 96 MHz is the slowest speed it would work at, and 144 MHz is the next fastest one.

    So yeah.. sorry, but you do need a 6 MHz crystal :(
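The clock constraint works out as below. The sketch assumes 4 clocks per instruction and a 16x PLL, which are my reading of the usual Propeller setup rather than something stated in the post; the function names are illustrative.

```python
USB_BIT_RATE = 12_000_000   # full-speed USB, bits per second
CLOCKS_PER_INSTRUCTION = 4  # assumed: typical Propeller instruction time
PLL_MULTIPLIER = 16         # assumed: crystal frequency x 16

def is_usable_clock(clk_hz):
    """True if clk_hz gives a whole number (2 or more) of instructions per bit."""
    step = USB_BIT_RATE * CLOCKS_PER_INSTRUCTION
    return clk_hz % step == 0 and clk_hz // step >= 2

def usable_clocks(max_instructions_per_bit):
    """System clocks giving 2..max_instructions_per_bit instructions per bit."""
    return [USB_BIT_RATE * CLOCKS_PER_INSTRUCTION * n
            for n in range(2, max_instructions_per_bit + 1)]

for clk in usable_clocks(3):
    print(clk // 1_000_000, "MHz clock,", clk / PLL_MULTIPLIER / 1e6, "MHz crystal")
print(is_usable_clock(100_000_000))
```

With these assumptions 100 MHz fails the test because it would give 2.08 instructions per bit, so the receive loop would drift out of phase with the data.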

    --Micah
  • hover1 Posts: 1,929
    edited 2010-04-06 19:53
    Time to order some 6 Meggers then!

    I guess Sapieha will be the only one running at 144 MHz. :)

    Jim
    Micah Dowty said...
    hover1 said...
    Is the code locked to 96 MHz? If it will run at 100 MHz I'll send microcontrolled a 6.25 MHz crystal.

    Jim

    It's locked to 96 MHz. I need an integer number of (2 or greater) instructions per USB bit period. So 96 MHz is the slowest speed it would work at, and 144 MHz is the next fastest one.

    So yeah.. sorry, but you do need a 6 MHz crystal :(

    --Micah
  • Microcontrolled Posts: 2,461
    edited 2010-04-06 20:12
    I can order one. I was going to order an Ethernet chip here soon anyway.
    Also, I technically HAVE a 6 MHz xtal that came with a Gadget Gangster kit, but after 30+ minutes of searching I guess it is lost. :-( Now back to looking......

  • Microcontrolled Posts: 2,461
    edited 2010-04-06 20:18
    WHAT?!?!?!?! My Ethernet chip has a 27 week lead time!!!!!!! Let's check DigiKey.....

  • Microcontrolled Posts: 2,461
    edited 2010-04-06 20:27
    OK, I'll be getting this crystal. It should work, right?

    Thanks again for this marvelous USB object!!!

  • scanlime Posts: 106
    edited 2010-04-06 20:29
    microcontrolled said...
    OK, I'll be getting this crystal. It should work, right?

    Thanks again for this marvelous USB object!!!

    Yep, that crystal looks good.

    Glad to provide something that might be useful if it works ;) Good luck!

    --Micah
  • HollyMinkowski Posts: 1,398
    edited 2010-04-06 22:08
    I ordered 10 of these 6.000 xtals
    price is $3.29 including shipping
    www.taydaelectronics.com/servlet/the-95/6.000-MHz-6-MHz/Detail
  • Cluso99 Posts: 18,069
    edited 2010-04-06 22:23
    You need an xtal with about 18-20pF. I use both 6MHz and 6.5MHz (104MHz) xtals from DigiKey (see RamBlade thread for the part number of the 6.5MHz). Note you will require special decoupling and pcb layout on the prop to run at these speeds. I also have 13.5MHz (108MHz, pll=8) but have not completed testing.

  • Microcontrolled Posts: 2,461
    edited 2010-04-07 00:55
    @Holly: NICE!! AND they accept PayPal!! DigiKey, unfortunately, does not, and I have no other online payment method. With the cheap shipping, I'll order a few other things as well. :)

  • Microcontrolled Posts: 2,461
    edited 2010-04-07 00:56
    @Cluso99: Thanks! I was wondering if that value was significant.

  • HollyMinkowski Posts: 1,398
    edited 2010-04-07 03:37
    @Microcontrolled
    They had those nice little NO switches that can mount easily on a breadboard for 4 cents each.
    I ordered 100 of those, 100 1N4007 diodes, a couple of thousand resistors at 1 cent each, 100 PN4401.
    Most small parts like resistors, ceramic caps..etc were one or two cents each.

    I hate to pass up a deal :)
  • scanlime Posts: 106
    edited 2010-04-07 17:33
    Still some nasty bugs to work out, but there's an FSRW port in the svn repository now :)
  • jazzed Posts: 11,803
    edited 2010-04-07 19:54
    If you're lucky the configuration of your 6MHz crystal is of no consequence. If you're not lucky, you have to follow recommended designs (definitely a requirement for high volume production).

    BTW, I found the RALINK 802.11b/g adapter Linux driver source and I have a compatible device, but don't have time for it just yet.

    Good progress Micah :)

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    May the road rise to meet you; may the sun shine on your back.
    May you create something useful, even if it's just a hack.
  • Hanno Posts: 1,130
    edited 2010-04-08 02:47
    Good progress Micah!
    I almost forgot about a thought I had to reduce cog usage (I looked at your code before bedtime, had the thought, and forgot about it the next couple days- see what you think)
    You currently use 2 cogs to receive one bit at a time every 2 instructions. After receiving 16 bits with one cog that cog has some time to write the data to hub.
    Using the "mov x,ina" instruction, you can read multiple bits at the same time- provided that they're waiting for you on the Propeller's IO pins. Using delay lines with multiple pins, you can use this trick to read multiple bits at the same time. I would start with reading data into the cog's ram and spooling it back to hub ram when the receive is finished.
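A toy simulation of the delay-line idea, in Python rather than Propeller assembly. It assumes ideal delay taps of exactly 0, 1, 2, ... bit periods feeding consecutive pins and one sample of all pins every `taps` bit periods. The function name and bit pattern are made up for illustration.

```python
def sample_with_delay_taps(bits, taps):
    """Recover a bit stream by sampling `taps` delayed copies at once.

    Pin i is assumed to see the line delayed by i bit periods, so at
    bit time t it holds bits[t - i]. Trailing bits that do not fill a
    whole group of `taps` are dropped in this sketch.
    """
    recovered = []
    for t in range(taps - 1, len(bits), taps):
        # One simulated "mov x, ina": read every pin in the same instant.
        ina = [bits[t - i] for i in range(taps)]
        # The most-delayed pin carries the oldest bit, so reverse the order.
        recovered.extend(reversed(ina))
    return recovered

stream = [(0x9E3779B9 >> i) & 1 for i in range(32)]  # arbitrary pattern
print(sample_with_delay_taps(stream, 4) == stream)
```

With four taps the cog only needs one read every four bit periods, which is the cogs-for-pins trade the suggestion is aiming at.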
    Good luck!
    Hanno

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    Co-author of the official Propeller Guide- available at Amazon
    Developer of ViewPort, the premier visual debugger for the Propeller (read the review here, thread here),
    12Blocks, the block-based programming environment (thread here)
    and PropScope, the multi-function USB oscilloscope/function generator/logic analyzer
  • scanlime Posts: 106
    edited 2010-04-08 03:11
    Hanno said...
    Good progress Micah!
    I almost forgot about a thought I had to reduce cog usage (I looked at your code before bedtime, had the thought, and forgot about it the next couple days- see what you think)
    You currently use 2 cogs to receive one bit at a time every 2 instructions. After receiving 16 bits with one cog that cog has some time to write the data to hub.
    Using the "mov x,ina" instruction, you can read multiple bits at the same time- provided that they're waiting for you on the Propeller's IO pins. Using delay lines with multiple pins, you can use this trick to read multiple bits at the same time. I would start with reading data into the cog's ram and spooling it back to hub ram when the receive is finished.
    Good luck!
    Hanno

    Thanks!

    Delay lines would definitely help trade cogs for pins. But part of the fun IMHO is to do this with no external active components. If I'm going to buy a delay line chip, might as well make it a USB host controller chip :)
  • scanlime Posts: 106
    edited 2010-04-11 04:54
    It's still really rough, but I just checked in an FT232 driver. I'm still getting CRC errors occasionally (well, maybe a bit more than occasionally) but for the most part you can use this to talk to a Prop Plug over USB :)

    I've been gradually bugfixing the host controller core and making it more robust. Found a couple fairly serious bugs while working on the USB storage driver. At this point, I think the FT232 and storage drivers are pretty much complete, it's just a matter of using those drivers to bugfix, polish, and optimize the host controller itself. Storage was a bit inconvenient for debugging purposes.. the FT232 driver should make it easier to test arbitrary packet lengths and contents.

    The Subversion repository is at:

    http://svn.navi.cx/misc/trunk/propeller/usb-host/

    --Micah
  • scanlime Posts: 106
    edited 2010-04-19 16:42
    I'm making progress on a Bluetooth stack built around this host controller. So far I have support for:

    - HCI (the low-level Host Controller Interface protocol that gets sent over USB)
    - Device discovery (Inquiry, setting local name/class)
    - ACL packets
    - Basic support for L2CAP connections, echo response
    - Work-in-progress SDP server (service discovery)
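For readers unfamiliar with HCI, this is roughly what a command looks like on the wire. The sketch builds a standard HCI Inquiry command (OGF 0x01, OCF 0x0001) using the general inquiry access code. It is a Python illustration, not code from this stack, and the helper name `hci_command` is hypothetical.

```python
import struct

def hci_command(ogf, ocf, params=b""):
    """Build an HCI command: 16-bit opcode (little-endian), length, parameters."""
    opcode = (ogf << 10) | ocf
    return struct.pack("<HB", opcode, len(params)) + params

GIAC_LAP = bytes([0x33, 0x8B, 0x9E])  # general inquiry access code 0x9E8B33
INQUIRY_LENGTH = 0x08                 # in units of 1.28 s
NUM_RESPONSES = 0x00                  # 0 = unlimited

inquiry = hci_command(0x01, 0x0001,
                      GIAC_LAP + bytes([INQUIRY_LENGTH, NUM_RESPONSES]))
print(inquiry.hex(" "))
```

On a USB Bluetooth dongle, command packets like this travel in control transfers, while events and ACL data use the interrupt and bulk endpoints.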

    I've been testing it mostly using tools from Linux's BlueZ stack, especially sdptool and l2ping. The l2ping performance seems reasonable- 10-20ms latency with about 2% packet loss. I suspect all of that packet loss is due to CRC errors on the received USB packets, due to the corners I had to cut in the bit-banging USB receiver.

    So, it's at about the point where I'm thinking about what other protocols to implement after I finish SDP. I think I'll create a sort of low-level socket interface that allows attaching hub-memory buffer lists to L2CAP or RFCOMM connections. I'm interested in hearing from you about what applications this stack might be useful for. My ideas so far:

    - Communicating with other Propellers or small embedded systems (L2CAP + Custom protocols)
    - Talking to the Wiimote (L2CAP + HID)
    - Serial port emulation (RFCOMM)
    - Sending/receiving files and messages? (OBEX)

    What do you think? Is anyone else likely to use this Bluetooth stack? If so, what for?
  • jazzed Posts: 11,803
    edited 2010-04-19 17:10
    @Micah,

    I have some Bluetooth dongles and other devices and can see some interesting possibilities with audio and generic PAN wireless device communications.

    Have you implemented a socket layer interface? Some Linux L2CAP examples use sockets (I haven't researched this much). Sockets would easily enable other devices such as RALINK WIFI. Are there alternatives to sockets?
