As far as I know, a streamer can't lock onto an arbitrary external clock source, at any speed. Its principle is for the whole Propeller to be locked to the same master clock as the external devices.
I wonder what this means for ULPI for HS USB or GMII for gigabit Ethernet.
The AD7400 should be able to function with the smart pin synchronous serial mode.
/me looks up ULPI .... "Crystal Oscillator and PLL
When a crystal is attached to the PHY, the internal clock(s) and the external 60MHz interface clock are
generated from the internal PLL. When no crystal is attached, the PHY may optionally generate the internal
clock(s) from an input 60MHz clock provided by the Link."
So, the host board (with the Propeller) can provide a common 60 MHz reference clock, so there's no problem keeping that locked. Same story for the 125 MHz example I saw in a 1996 GMII overview pdf.
If wanting both together then it might be tricky, since the Prop has to be sync'd to both. Dunno, the PHYs are likely quite forgiving of out-of-spec reference clocking. I'd probably run the USB ULPI at 62.5 MHz, which is half of 125 MHz.
I've been focusing my analysis on filters constructed from moving averages. Why? Because moving average filters can be implemented cheaply. And because various shapes like a triangle, trapezoid, and Gaussian can be created by running the signals through repeated moving average filters. The sincN family uses all the same length averages, but interesting stuff happens when the lengths are not the same. I'll call them Cascaded Moving Average filters.
First stage averages 31 bits.
Second stage averages 11 numbers from the first stage.
Third stage averages 3 numbers from the second stage.
Convolution is sometimes expressed with the * symbol. We could describe this filter as CMA31*11*3.
I think it could be implemented as follows:
First stage 31 register bits, output 2 bit delta. The full sum would be 5 bits, but the output will only change by 1 count each clock.
Second stage 2x11 register bits, output 3 bit delta. The full sum output of this stage could go up or down by 2 counts or less, depending on the input.
Third stage 3x3 register bits, output 4 bit delta.
Follow it all by 3 accumulators. This compensates for the diffs we added after the sum to reduce the amount of memory needed.
It might be better to just sum the length 3 term directly.
It would lend itself well to a decimate by 3, but it's too late for me to figure out how tonight.
The sum is 1023. It turns out that the sum of the taps is the product of the lengths. So if you want to construct a filter with a certain sum, factor it and decide how to group the factors together.
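A minimal Python sketch of the CMA31*11*3 idea above, not the proposed hardware. It builds the taps by convolving three boxcars, checks that the tap sum equals the product of the lengths, and compares a direct FIR against a cascade of "delta plus accumulator" moving sums. The names `cma_taps` and `running_sum_stage` and the test bit pattern are made up for illustration. The sketch keeps each accumulator inside its own stage rather than moving all three to the end, which gives the same numbers in exact integer arithmetic.

```python
from math import prod

def convolve(a, b):
    """Full discrete convolution of two integer sequences."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def cma_taps(lengths):
    """Taps of a cascade of unnormalized moving averages (boxcars)."""
    taps = [1]
    for n in lengths:
        taps = convolve(taps, [1] * n)
    return taps

def running_sum_stage(samples, n):
    """Moving sum of length n: add the newest sample, subtract the one n steps old."""
    delay = [0] * n              # n-deep shift register, newest at index 0
    acc = 0
    out = []
    for s in samples:
        acc += s - delay[-1]     # delta between sample entering and sample leaving
        delay = [s] + delay[:-1]
        out.append(acc)
    return out

lengths = (31, 11, 3)
taps = cma_taps(lengths)

# Arbitrary deterministic bit pattern standing in for an ADC bitstream (assumption).
bits = [1 if (i * 7) % 13 < 5 else 0 for i in range(200)]

cascade = bits
for n in lengths:
    cascade = running_sum_stage(cascade, n)

direct = convolve(bits, taps)[:len(bits)]

print(len(taps), sum(taps), prod(lengths))
print(cascade == direct)
```

The tap count is 31 + 11 + 3 - 2 = 43, and the cascade never needs the 43 tap values themselves, only the three shift registers.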
Saucy, what have you got that we could implement?
I just achieved a logic reduction with the triple-integrating Tukey 17/32, but your way may even be smaller.
New Tukey 19(15)/34 using same number of taps (45) as now:
1, 2, 4, 6, 8,11,14,17,20,23,26,28,30,32,33 Ramp up total = 255 Tukey/Hann
34,34,34,34,34,34,34,34,34,34,34,34,34,34,34 Plateau total = 510 Tukey
33,32,30,28,26,23,20,17,14,11, 8, 6, 4, 2, 1 Ramp down total = 255 Tukey/Hann
Tukey total = 1020, /4 = 255
Hann total = 510, /2 = 255
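The totals in the table can be checked with a few lines of Python. This is only an arithmetic check of the listed taps. The variable names are mine, and reading the totals as the full-scale output for an all-ones bitstream is my interpretation of the table.

```python
# Tap values copied from the table above.
ramp = [1, 2, 4, 6, 8, 11, 14, 17, 20, 23, 26, 28, 30, 32, 33]
plateau = [34] * 15

tukey = ramp + plateau + ramp[::-1]   # ramp up, plateau, ramp down
hann = ramp + ramp[::-1]              # same ramps with the plateau left out

# With an all-ones bitstream every tap contributes, so the sum is full scale.
print(len(tukey), sum(tukey), sum(tukey) // 4)
print(len(hann), sum(hann), sum(hann) // 2)
```

Both windows scale to exactly 255 with a plain shift, so the result fits in 8 bits.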
Great! I will try this. This will get rid of the $FF clamping circuits. Just need to see how the output looks with our ADC bitstream. The Tukey 17/32 is pretty sweet.
With this triple-integrating method, we are set free from some old strictures. All we do is define the slope-change points along the tap chain. We could have multiple filters for very little logic cost.
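To illustrate why only the slope-change points matter, here is a hedged Python sketch. It does not reproduce the actual smart pin logic or its triple integration. It only shows the underlying identity with two differences and two running sums: the second difference of the Tukey 19(15)/34 taps is nonzero at just a few positions, and filtering with that sparse sequence and then accumulating twice gives the same output as the full 45-tap FIR. The helper names and the test bit pattern are assumptions.

```python
from itertools import accumulate

ramp = [1, 2, 4, 6, 8, 11, 14, 17, 20, 23, 26, 28, 30, 32, 33]
tukey = ramp + [34] * 15 + ramp[::-1]

def diff(seq):
    """First difference, treating the sequence as zero before and after its ends."""
    padded = [0] + list(seq) + [0]
    return [b - a for a, b in zip(padded, padded[1:])]

def convolve(a, b):
    """Full discrete convolution of two integer sequences."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

# Second difference of the window: nonzero only where the slope changes.
slope_changes = diff(diff(tukey))
points = [(k, c) for k, c in enumerate(slope_changes) if c != 0]

# Arbitrary deterministic bit pattern standing in for an ADC bitstream (assumption).
bits = [1 if (i * i + 3 * i) % 7 < 3 else 0 for i in range(300)]

sparse = convolve(bits, slope_changes)        # only a few +/-1 taps contribute
twice = list(accumulate(accumulate(sparse)))  # two running sums rebuild the output
direct = convolve(bits, tukey)                # reference: full 45-tap FIR

print(points)
print(twice[:len(direct)] == direct)
```

Moving a slope-change point or adding one changes the window shape without touching the accumulators, which is one way to read the "multiple filters for very little logic cost" remark.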
My head has been spinning trying to simplify the explanation of what a sigma-delta ADC does. Now I believe I have an explanation so simple that even I can follow it:
If you integrate a function like Y(x) = x, the result is 1/2*x^2. If you now differentiate, (1/2*x^2)' = x. That's easy, but what is it good for?
In calculus, delta x goes to zero, but when using a discrete delta x, we can choose different values of delta x for the integration and the differentiation. So we can integrate the incoming signal (the bit stream) over a given time interval to get a stream of lower-frequency numbers. These numbers can then be integrated further (applying a scaled window) to get a second, now filtered, stream of values. If we like, we can take these values as the result, or we can differentiate the data stream over a certain interval to get the results we want: high frequency at low resolution, or low frequency at high resolution.
Once we have the principle, we are free to complicate the implementation whatever way we want.
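The trade-off between rate and resolution can be sketched in Python. This is a toy, all-digital first-order modulator standing in for the analog front end, with an input of 37/100 of full scale chosen arbitrarily. The function names are hypothetical. Summing the same bitstream over short or long intervals gives many coarse readings or a few fine ones.

```python
def sigma_delta_bits(num, den, count):
    """Toy first-order modulator: emit a 1 each time the accumulator overflows den."""
    acc = 0
    bits = []
    for _ in range(count):
        acc += num               # integrate the (constant) input num/den
        if acc >= den:
            acc -= den           # feedback: subtract one full-scale unit
            bits.append(1)
        else:
            bits.append(0)
    return bits

def integrate_and_dump(bits, window):
    """Sum the bitstream over consecutive, non-overlapping intervals."""
    return [sum(bits[i:i + window])
            for i in range(0, len(bits) - window + 1, window)]

bits = sigma_delta_bits(37, 100, 1000)

fast = integrate_and_dump(bits, 10)    # 100 readings, each out of 10
slow = integrate_and_dump(bits, 100)   # 10 readings, each out of 100

print(fast[:10])
print(slow)
```

The short window can only report 3 or 4 out of 10, while the long window resolves the input to 37 out of 100 at a tenth of the rate.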
Separate from ErNa, but on the theory side of things, I've been thinking that any clumpiness in the bitstream has to be in the closed loop lag of inverters, flip-flop and feedback.
I have yet to see the evidence though.
That is exactly right. By reducing the integrator capacitors to 1/8 their current value, the next silicon's ADCs will not have clumpiness in their bitstreams at high clock frequencies and they will pass higher input frequencies much better.
And I can even understand everything you just said, for the most part.
Yes, I mentioned taps but they no longer exist as separate entities in the logic now and the "same size of shift register" would have been more accurate.
I think it might be difficult to improve on this latest optimization. The objective is to find the area under the Tukey curve every clock and the way Chip has done it is by doing a double differentiation to find the differences between the slopes, then a triple integration. Very clever. It would be a great pity not to have this scope mode in rev B now and I am sure it could be done, in some form or another.
Here is a comparison of Tukeys 17/32 and 19/34, showing the up slopes:
Huh, that seems more like hysteresis. I wasn't thinking of that. But I remember you commenting on success with the sims now. I didn't think you were going to actually make a change to the custom area.
I just thought of two more optimizations in the shower:
First, rather than shift the 8-bit result down for the shorter Hann filter, we can shift the 5-bit 'delta' up before adding it into 'inta'.
Second, we only need to clear every other bit in the tap chain on reset, because no cog can reset and release a pin in less than two clocks. So, the other bits will be cleared on the second clock. This will save 24 AND gates per smart pin.
It's just a wiring change.
I will also decrease the gate lengths of the FETs in the VCO inverters to get the VCO to 450MHz.
Wendy at ON Semi is working on getting us to an official 200MHz.
Getting rid of the cap: imagine you have two current sources connected in series. In this case, the current through both sources is identical.
Now connect a cap at the connection point. The two currents can then differ, since some current can flow into the cap, but the voltage at the center point will change.
In our case the input current source is a resistor, so this resistor creates a current proportional to the input voltage. And here is the pitfall: if the cap's voltage changes, the voltage across the resistor is no longer proportional to the input voltage alone. A small cap affects linearity; a larger cap makes the whole circuit less sensitive, leading to hysteresis. If the input resistor were a real current source, we would be fine, even if the center voltage changed a little. The output current source is a PWM-ed voltage into a resistor, so we face the same problem.
However, the world faces problems more severe than this ADC, let us solve those first ;-)
That's not exactly how it works. The ADC side of the resistor is held at VIO/2 and the resistor voltage is converted to current. So, current, not voltage, is used on both sides of the integrating capacitor, for input and feedback. This gets around that problem you are picturing. So, it doesn't matter what the voltage on the cap is, as long as it is not so close to the rails that the current sinks and sources become non-linear. Converting the input voltage to current made the ADC work as well as it does. It also enabled very high-speed feedback through extremely-low-parasitic-capacitance switches that are actually 1.8V NMOS devices, but biased in a way where they are never stressed in the 3.3V environment.
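The difference between the two pictures comes down to Ohm's law, sketched here in Python. The resistor value and the voltages are hypothetical, chosen only to make the numbers readable, and nothing here models the real front end. With a plain resistor into the cap, the input current moves when the cap voltage moves. With the ADC side of the resistor held at VIO/2, it depends on the input voltage alone.

```python
VIO = 3.3        # supply rail, from the "3.3V environment" mentioned above
R_IN = 100e3     # hypothetical input resistor value in ohms (assumption)

def plain_resistor_current(v_in, v_cap):
    """Resistor tied directly to the integrator cap: current depends on v_cap."""
    return (v_in - v_cap) / R_IN

def held_node_current(v_in):
    """Resistor whose ADC side is held at VIO/2: current depends on v_in only."""
    return (v_in - VIO / 2) / R_IN

v_in = 2.0
for v_cap in (1.55, 1.65, 1.75):
    i_plain = plain_resistor_current(v_in, v_cap)
    i_held = held_node_current(v_in)
    print(f"v_cap={v_cap:.2f} V  plain={i_plain * 1e6:.2f} uA  held={i_held * 1e6:.2f} uA")
```

In this toy case a 0.1 V drift on the cap shifts the plain-resistor current by 1 uA, while the held-node current stays put.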
Look up the "Van der Pol" oscillator on Wikipedia. It is a type of oscillator intermediate between a so-called relaxation oscillator, which is useful for pulse counting, reading potentiometers, etc., and the simple harmonic oscillator, which ideally puts out sine waves. In theory a Van der Pol oscillator can oscillate in either mode, or it can be tricked into outputting pure noise, or what seems like pure noise, when actually it is a form of deterministic chaos, being somewhat less chaotic than Mandelbrot, Julia, etc. This means that, theoretically, it is possible to do successive approximation without an SAR register and its associated DAC, if you could somehow get the "noise" mode to oscillate in a wildly chaotic manner, swinging from rail to rail with a frequency and duty cycle that vary in proportion to the current bias signal. Then ideally you would, as if by magic, get a serialized bitstream at 16 bits of resolution or more in just 16 clock cycles (impossible, but never say never) as the modulator in the analog part of the circuit "hunts" to achieve a balance, which when sampled gives a 1010101010 … 50% duty cycle pattern with a slight bias proportional to the long-term DC average. Now the Kolmogorov complexity of the information entropy contained in the bitstream would therefore suggest that it is possible to construct a trellis in the digital domain that acts on the bitstream to recover latent high-resolution data, of a quality similar to the way early analog video tape worked by converting to FM.
But that would mean that the bitstream has to contain three or four signals, which it does … One is the long-term average, which can be extracted by pulse counting. Then there is the switching back and forth of the modulator, which is an FM-like signal, but it is not FM; still, it is easily removed by decimation. That would seem to imply that there is something else: ideally out-of-band dithering noise and the in-band signals that we want, because the EXACT analog signal is ideally contained in the delta t between zero crossings. Unfortunately, since the signal is not FM going onto ancient video tape, we don't sample the time of successive zero crossings with femtosecond granularity.
O.K., so the mind reels here also, but I just thought of something: there are functions which are considered impossible to integrate with Newton's calculus, like the zeta function (maybe), or the case where f(x)=0 if x is rational, -1 if x is irrational, and +1 if x is transcendental. However, the simplest explanation that I know of for something called Lebesgue integration (or the Lebesgue measure) is that there are methods, analogous to integration by parts, that apply various abstract transformations which in effect "pull out pieces of the puzzle one by one" until nothing is left but the answer, that is.
OK, so you already use current sources. I can imagine they are simpler to implement than resistors in silicon ;-)
Resistors are easy enough. It's just that they have way too much parasitic capacitance at high MHz. These current switches couldn't be lower-parasitic-capacitance than they are, I think.
ErNa, maybe using (zero-capacitance) resistors could actually work because of this: When the integrator cap voltage gets off-center due to the input, causing the voltage-to-current converter resistor on the input to err, might that error be compensated for by the feedback resistor, which passes more or less current, depending on the imbalance at the integrator cap? Maybe the errors on the input and feedback resistors cancel out?
If resistors could work, we could get rid of the analog front end which converts voltage to current, and the input bandwidth would soar! We would need to drop the input impedance, though. That would be okay for an RF mode.
While climbing to the top is always the way to go, we on the other side have to stay on the ground. The rock-solid ground! What we have now seems to me to be by far enough! There is so much to imagine that can be done with the P2 that the next silicon should be prepared to spread more chips to more people. Those people will develop applications.
Only then will the next silicon be designed, and you can do whatever you want. For now: lean back, close the shop on time, spend time with your family and wait for the things to come.
That's my humble advice. I know from my own experience that this is the hardest work to do. But I believe you can!
For my purposes, what we have now is enough! In some respects, I reached the limits of the P1 (mostly memory), and now, for the next few years, I see a lot of headroom. Today I heard a saying that kept me laughing: if I transplant my brain into the P2, it will fly backward!
Besides, the SDRAM is already on the P123 board.
So, instead of the NCO rolling over, we will use a pin's rise or fall, optionally.
We will find a way. The streamer is the last big thing I need to work on. Saving it for last.
Oh, I had misunderstood. I thought you were saying I just had to set it correctly.
Tukey 19/34 has slightly smoother transitions at the start and end of the ramps.