Confusing. The Atmel SAM9260 does D- first on both host and device, so using a micro USB connector not meant for that PCB side would give a quick fix.
The NXP LPC1850 has full host features, and it does D+ first.
Or it may be essentially random...
It may just come down to picking a default connector (mini/micro?) and choosing that ordering.
If series resistors are suggested (seems likely, for rise times and impedance as well as ESD), then you can swap on one layer,
so the detail becomes more moot.
I mount my microUSB connectors on the top of the board even though they are meant for the underside.
IMHO the connector is not a valid reason for choosing position of D+ and D-.
The reason I chose for the preferred positioning was from a hardware and software mix as I tried to explain a few posts above. By configuring the circuit in a smart way, it is possible to utilise the microUSB connector and prop pins so they can perform the function of...
* USB
* Serial TX and RX
* PS2 (keyboard or mouse)
* I2C (SDA and SCL)
* 1-pin TV & Keyboard
I2C is used in the Wii controllers, so they can be hooked in via a modified cable.
All devices can be connected to the microUSB using a modified microUSB to xxx cable.
For this to work nicely, the governing factor for me was that the USB keyboard that can be used in PS2 mode has CLK on D+ and DATA on D-. Having settled that DATA has to be on D- and CLK on D+, I looked into the Serial Port P30/31. Since P31 is SI (RXD) and P30 is SO (TXD), I use these in my 1-pin TV & Keyboard. Since P31 is an input, I use it as the 1-pin Keyboard (DATA) pin. P30 is an output, so it's logical to use it as the 1-pin TV output pin.
Anyway, see the diagram in the post below.
Therefore my preference is for the D- to be on the Upper ODD pin and D+ to be on the Lower EVEN pin.
PostEdit FWIW there is probably as much argument to have it the other way around.
Chip,
I am presuming that because you only have a USB receiver for one pin and a USB transmitter for the adjacent pin of the pair, you are not going to be able to configure the pin/+1/+2/+3/-1/-2/-3?
Since we will most likely never require a large number of USB ports, might another way to reduce silicon be to have only 1 USB TX and 1 USB RX per "n" pins (say 4/8?) and use the pin/+1/+2/+3/-1/-2/-3 mechanism to select the appropriate pin?
Another possibility: could the USB RX and TX be inside each cog? That would cut down from 32 sets to 16 sets. Could the clock then be fed back on the extra pin(s) you have for the smart pin comms?
Interesting ideas. Right now, the even pin is the brain and the odd pin is passive and accepts its I/O control from its even companion pin when in USB mode. So, there already are 16 sets of USB circuits. Doing it this way, USB in the smart pins is shaping up to take only half the logic that the serial modes use.
16 Sets ?
Does that mean only some pins (lower 32?) can do USB ?
Chip, maybe when you get a chance, perhaps a freehand scribbled drawing of the logic blocks might help get some suggestions to simplify the blocks.
BTW I think you could utilise the same concept (one Tx and one Rx per pair of pins) for Async. I cannot foresee a design requiring more than 32 serial ports, so provided you can just enable the Rx and Tx separately, this could save some die space.
Meanwhile I have progressed quite a bit with snooping the LS USB keyboard.
Anxiously waiting for a de-nano image to run at 96MHz so I can get USB running.
That could be a useful option, if the die area proves too tight.
Yes, it does save die area, but does force a design that wants eg 1 Rx and 8 or 16 Tx, to use every alternate pin, which is going to compromise byte/word-wide LCD screens & streamer.
Worst case would be the cry "Argh! I have enough pins, but I cannot access them"
It also limits Serial ports to 32, instead of eg 1 Rx 63 Tx
96MHz is not that critical anymore for LS (or even FS), and I'm not sure that overclocking a part you are not sure works is a good approach?
80MHz should be ok, (certainly at LS) or 84MHz for less frequent residual jitter.
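As a rough check on those clock choices, the number of system clocks per USB bit can be worked out directly. This is only arithmetic on the standard USB bit rates (1.5 Mbit/s low speed, 12 Mbit/s full speed); the function name is made up for illustration and says nothing about how the smart pin NCO is actually built.

```python
from fractions import Fraction

# Standard USB signalling rates in bits per second.
USB_RATES = {"LS": 1_500_000, "FS": 12_000_000}

def clocks_per_bit(sysclk_hz: int, rate: str) -> Fraction:
    """Exact number of system clocks in one USB bit period."""
    return Fraction(sysclk_hz, USB_RATES[rate])

for mhz in (80, 84, 96):
    ls = clocks_per_bit(mhz * 1_000_000, "LS")
    fs = clocks_per_bit(mhz * 1_000_000, "FS")
    print(f"{mhz} MHz: LS = {ls} clocks/bit, FS = {fs} clocks/bit")
```

80 MHz gives a fractional ratio at both speeds (160/3 and 20/3), while 84 MHz and 96 MHz divide evenly (56 and 7, 64 and 8), which may be what the jitter remark is getting at.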
An application that required 1 rx and 8/16 tx or vice-versa along with byte/word-wide screen or more than 32 serial ports would be pretty rare so pairing pins like that would be worth it if the die space was tight.
Chip,
96MHz is just easy to manually snoop the bus without smart pins. Let's just forget this for now and I will see how it goes. I have the snooping and some decoding done for LS on P1 now.
When you have a de-nano release with a few smart pins, I will try out the read first, snooping and decoding at LS. Once running I will try FS.
When working I can switch to being a serial port responding as required.
I know this is the longer approach, but I want to fully understand the bus transactions as there are often quirks that specs don't tell you. Back in the 80's I did the low levels of an IBM 3270 emulator for Apple //e and ///, and there were a lot of quirks that we needed to implement.
BTW would a bemicro cv build have more cogs and hub memory than a de-nano? I have both. No point in doing bemicro cv if it doesn't give any more cogs.
BTW I think you could utilise the same concept (one Tx and one Rx per pair of pins) for Async. I cannot foresee a design requiring more than 32 serial ports, so provided you can just enable the Rx and Tx separately, this could save some die space.
Interesting. I suppose you could make the same argument for synchronous serial. In the half-duplex case, I think you could tie both the RX and TX cell to the same pin, then enable only one at a time.
Also, you could get rid of separate RX/TX mode selections, as the LSB of the cell index would provide the same information.
I wonder, though, how much of a difference limiting ASYNC and SYNC serial this way would make in circuit size...
Chip is already planning only one USB Tx and one Rx per pin pair.
And my point was meant for both sync and async to do likewise, although I only mentioned async. Likewise I think we could do only the LSB one way and MSB the other. Or perhaps just do async correctly both ways. Anyway, once we have a good handle on these we can offer suggestions to reduce the silicon.
I just fixed an intermittent bug in the USB smart mode that had been making things feel shaky since the beginning, though I wasn't sure if there was really a problem.
The receiver must resynchronize on any line-state change, so that it can sample the incoming state on the upcoming half-bit period. This part was fine, but there was a subtle problem when the re-sync occurred on the expected sample point. It would wind up reading the same state twice. This was easy to get around once I realized the problem - just sample the line states from the prior clock, instead of the current.
I had set up the test to wait a random amount of clocks before starting, so the NCO's phase relative to the receiver's edge sensor would hit every possible combination on repeated downloads.
I'm still working on getting everything together for a general release.
I had a panic moment the other day when the design wouldn't fit into the -A9! Thankfully, it was possible to let a seam out in the britches. I had the compiler on "performance-aggressive" mode, so it was taking more LE's than needed, so I switched it to "area-aggressive" mode and it came in at the ~same Fmax, but only 97% full. That should be enough breathing room to wrap things up.
In 'normal' mid-streamed data, the NCO should not creep enough between edges, to shift far enough to hit a sample point ?
- but I can see the very first edge has to be tolerant of any phase, so that is a good one to catch.
The plus side of the design only just fitting into the -A9 is that you do have a natural ceiling on features added!
Did you get a chance to see if the nice feature of multi-pin reset-release can apply to other Pin control elements ? (Thinking here of Capture and Arm, but there could be others too)
It was that first edge where the trouble was occurring.
Wouldn't it be arm, then capture? I'm still not getting it.
Yes, for an external capture that is correct, Arm, then Capture.
The Arm part is needed as multi-pin same-edge, but the capture is external event based, so may be some number of edges later.
If 2 or more cells map to the same pin, then that capture is equal-sysclk-sync of the external edge.
For a Core-triggered Capture, (aka timer based) it would be just Capture, so I perhaps should have written
Capture (internal), and/or Arm(internal)+Capture(external)
I just made a little code to help test USB...
It assumes that for a little while, it's OK just to capture low speed data by waiting 53 clocks between samples.
It waits for a pause in the stream and then records to HUB.
Then, it processes a little to say if J or K or 0 and sends this over serial to Prop Serial Terminal.
A modified version of Chip's example is also running with data output to port A lower 16 bits.
This data is also recorded and shown.
Here's a sample of the data and the Spin file. It appears that the first packet is not recognized at all, but maybe one byte in the second one is. (see text file).
Note: I've got DM and DP on Pins 32 and 33 and also jumpered to 34 and 35 so I can see actual states with INB.
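The J / K / 0 classification step can be sketched roughly as below. This is Python rather than the Spin file from the post, and the names are hypothetical. It assumes the wiring from the note above shows up as DM on bit 0 and DP on bit 1 of the sampled port value, and it uses the standard USB convention that a low speed bus idles (J) with D- high, while full speed idles with D+ high.

```python
def line_state(dp: int, dm: int, low_speed: bool = True) -> str:
    """Classify one (D+, D-) sample as 'J', 'K', '0' (SE0) or '1' (SE1)."""
    if dp == 0 and dm == 0:
        return "0"          # single-ended zero: both lines low
    if dp == 1 and dm == 1:
        return "1"          # single-ended one: not a valid data state
    # Low speed idles with D- high, so J is D- high; full speed is the opposite.
    d_minus_high = (dm == 1)
    return "J" if d_minus_high == low_speed else "K"

def classify(samples, dp_bit: int = 1, dm_bit: int = 0, low_speed: bool = True) -> str:
    """Turn raw port samples into a string of line states, one char per sample."""
    return "".join(
        line_state((s >> dp_bit) & 1, (s >> dm_bit) & 1, low_speed)
        for s in samples
    )

# Made-up samples: D- high, D+ high, D- high, both low.
print(classify([0b01, 0b10, 0b01, 0b00]))
```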
Taking a look at this data from a low speed USB joystick, I see the first packet is an IN command, followed by a 7-bit device#, a 4-bit function# (the endpoint number), and a 5-bit CRC.
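For checking a captured token by hand, the 16 bits after the PID can be rebuilt like this. It is a sketch with made-up function names, based on the standard USB token layout (fields sent least significant bit first, CRC5 with generator x^5 + x^2 + 1, register preset to all ones, result inverted and sent most significant bit first). The address and endpoint in the example are arbitrary, not values from the joystick capture.

```python
def lsb_first(value: int, width: int) -> list[int]:
    """Bits of value in the order USB sends them (least significant first)."""
    return [(value >> i) & 1 for i in range(width)]

def crc5_bits(bits: list[int]) -> list[int]:
    """USB token CRC5 over a bit stream, returned in wire order."""
    crc = 0x1F                        # register starts as all ones
    for bit in bits:
        msb = (crc >> 4) & 1
        crc = (crc << 1) & 0x1F
        if bit != msb:
            crc ^= 0x05               # x^5 + x^2 + 1 without the x^5 term
    crc ^= 0x1F                       # the CRC is sent inverted
    return [(crc >> i) & 1 for i in range(4, -1, -1)]  # MSB goes out first

def token_fields(addr: int, endp: int) -> list[int]:
    """The 16 bits after the PID: 7-bit address, 4-bit endpoint, 5-bit CRC."""
    payload = lsb_first(addr, 7) + lsb_first(endp, 4)
    return payload + crc5_bits(payload)

print(token_fields(0x15, 0xE))
```

Comparing the last five captured bits against `crc5_bits` of the first eleven is a quick way to tell a decoding mistake from a sampling mistake.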
Hopefully, I have the USB pins configured wrong, because this is not recognized at all... The GetZ state output stays at $6E..
The next packet is a SETUP command followed by a bunch of data. In the middle somewhere, I do see changes in the state output for each byte, but doesn't seem to match the data at all...
It's a little tough with P123 board and USB device under test being on same PC...
Had to move cable around until was able to connect USB device and not have P123 be reset.
But, now I can capture the very first packet. Looks like a "SETUP" command, which I think is what is supposed to happen first.
But, my smart pin USB is not seeing anything there... Going to have to see what I did wrong.
Might move back to P30&31 so can use Chip's code as is...
You could unroll the loop slightly, to a repeating [53+53+54], to extend that 'for a little while', then handle it like a UART : Wait for SE0 end, delay 27 and sample until next SE0
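To put numbers on the 53+53+54 suggestion, here is a small drift calculation. It assumes an 80 MHz system clock (the post does not say which clock was used; 80 MHz is what makes 53 clocks the nearest whole delay for a 1.5 Mbit/s bit) and the helper name is made up.

```python
from fractions import Fraction
from itertools import cycle, islice

SYSCLK_HZ = 80_000_000                        # assumed clock, not stated in the post
BIT_CLOCKS = Fraction(SYSCLK_HZ, 1_500_000)   # 160/3 clocks per low speed bit

def worst_drift(delays, n_bits: int) -> Fraction:
    """Largest gap (in clocks) between the sample point and the ideal one."""
    elapsed, worst = 0, Fraction(0)
    for i, d in enumerate(islice(cycle(delays), n_bits), start=1):
        elapsed += d
        worst = max(worst, abs(i * BIT_CLOCKS - elapsed))
    return worst

print("fixed 53     :", worst_drift([53], 80))
print("53 + 53 + 54 :", worst_drift([53, 53, 54], 80))
```

With a fixed 53-clock delay the sample point slips by a third of a clock per bit, so after about 80 bits it has moved half a bit period (roughly 27 clocks). With the 53/53/54 pattern the error never exceeds 2/3 of a clock under these assumptions.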
Any chance of a BeMicro build?
I could do it. Once I get it set up, it will be a 15 minute routine.
Are you testing with an HID USB peripheral, or generated waveforms, or something else ?
I like the 53:53:54 idea, jmg. That might allow long term capture...
But, I think we need to start capture on the KJKJ... sync sequence.
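Starting on the sync sequence could look something like this once the samples have been reduced to one J/K character per bit time. It is only a sketch with hypothetical names: it relies on the standard NRZI rule (a change of line state is a 0, no change is a 1), ignores bit unstuffing, and the capture string is made up so that the byte after the sync field decodes to the IN PID ($69).

```python
SYNC = "KJKJKJKK"   # line states of the 8-bit sync field

def nrzi_decode(states: str, idle: str = "J") -> list[int]:
    """NRZI: no change between bit times is a 1, a change is a 0."""
    bits, prev = [], idle
    for s in states:
        bits.append(1 if s == prev else 0)
        prev = s
    return bits

def after_sync(states: str) -> str:
    """Return the line states following the first sync field, or '' if none."""
    pos = states.find(SYNC)
    return "" if pos < 0 else states[pos + len(SYNC):]

def to_byte(bits: list[int]) -> int:
    """Pack eight bits received least significant bit first."""
    return sum(b << i for i, b in enumerate(bits[:8]))

capture = "JJJ" + SYNC + "KJKKJJJK"      # made-up capture, one char per bit
pid_bits = nrzi_decode(after_sync(capture), idle=SYNC[-1])
print(nrzi_decode(SYNC))                 # seven 0s followed by a 1
print(hex(to_byte(pid_bits)))
```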
I often bring my Prop toys with me on travel, but very rarely actually use them. Same with my running gear...
I'm just looking at waveforms on a scope and logic analyzer.