Bit-Banged Ethernet
hippy
Posts: 1,981
Is it possible to bit-bang Ethernet UDP ( transmission only ) with a Propeller Chip ?
It would need banging at 20MHz so usual bit-banging techniques are out of the question, but could the video hardware / WAITVID be utilised in some way to read a pre-built bit-stream 'packet' from Hub and churn them out ?
Obviously an ENC28J60 or similar solution would be much more useful but just UDP on a one-to-one link isn't without its uses.
Comments
en.wikipedia.org/wiki/IPv4#Packet_structure
Though, I know someone on here had a bare-bones TCP/IP implementation that you could probably use parts of.
I don't see a problem in building up the bit-stream to send, just in getting it sent out fast enough. It can be done with an AVR, or with any micro / FPGA that is fast enough ...
http://www.cesko.host.sk/IgorPlugUDP/IgorPlug-UDP (AVR)_eng.htm
http://www.fpga4fun.com/10BASE-T2.html
I did think of doing it with a Cog sequence of just "MOV OUTA,#%10 / MOV OUTA,#%01" but that only gives 494 bits. The FPGA implementation shows 55 bytes for a single byte of UDP data but with Manchester encoding that's 880 bits.
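A quick sanity check of that budget, as a rough sketch only. It assumes an 80MHz system clock, 4 clocks per Cog instruction and 496 general-purpose Cog registers; the function names are made up for illustration.

```python
# Rough timing/space budget for the "one MOV per line state" idea.
# Assumptions (not from the thread): 80 MHz clock, 4 clocks per instruction,
# 496 general-purpose cog registers (512 longs minus 16 special registers).
CLKFREQ_HZ = 80_000_000
CLOCKS_PER_INSTRUCTION = 4
COG_REGISTERS = 496


def instruction_rate_hz(clkfreq_hz=CLKFREQ_HZ):
    """Instructions per second, i.e. the fastest MOV OUTA update rate."""
    return clkfreq_hz // CLOCKS_PER_INSTRUCTION


def manchester_half_bits(frame_bytes):
    """Line states needed: 8 bits per byte, 2 half-bits per Manchester bit."""
    return frame_bytes * 8 * 2


rate = instruction_rate_hz()
needed = manchester_half_bits(55)
print("update rate (Hz):", rate)
print("half-bits needed for 55 bytes:", needed)
print("fits in one cog as unrolled MOVs:", needed <= COG_REGISTERS)
```

So the rate works out, but an unrolled sequence of MOVs runs out of Cog space well before a 55-byte frame is finished.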
Getting NLP/FLP pulses using "MOV OUTA,#" is probably the first place to test anything.
To start a new 32-bit chunk, you can just move from some other register. So if you can afford one instruction per bit, that will get you 20MHz for 400+ bits or about 50 bytes.
If you're careful you can use two cogs to alternate sending a 32-bit chunk; that will give each cog plenty of time to load the next word. You just need to time things carefully.
Since you are doing Manchester encoding, each bit will transition, so you could perhaps set FRQA to 0x8000_0000 to get the in-bit transitions, and then use the ROL PHSA to get the actual data transitions, and then you have every other instruction to do other things like load up the next data value to ship.
With that the shifter outputs %10 for every 0-Bit in a long, or %01 for every 1-Bit in a long that is written to the shifter with WAITVID.
The pre-built bit stream can stay in HubRAM and must be read by the cog at a rate of 20MHz / 32 = 625 kHz, which is no problem !
In this bit stream you have 2 bits for every data bit to send (one for each half of the Manchester coding). So for a 55-byte packet you need 55 words in HubMemory (28 longs).
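A minimal sketch of how such a pre-built bit stream might be generated. It assumes the IEEE 802.3 convention (a 1 is low then high, a 0 is high then low), bytes sent LSB first, and the first half-bit placed in bit 0 of each long on the assumption that the shifter sends the low bit first. The function name and the zero padding are illustrative only; a real transmitter has to idle the line after the frame rather than keep driving it.

```python
def manchester_longs(payload):
    """Expand bytes into Manchester half-bits packed into 32-bit longs.

    Assumed convention: data bit 1 -> half-bits 0,1; data bit 0 -> 1,0.
    Bytes go out LSB first; the first half-bit lands in bit 0 of a long.
    """
    half_bits = []
    for byte in payload:
        for i in range(8):                      # LSB first
            bit = (byte >> i) & 1
            half_bits.extend([0, 1] if bit else [1, 0])

    # Illustrative padding up to a whole long (not a valid line idle state).
    while len(half_bits) % 32:
        half_bits.append(0)

    longs = []
    for start in range(0, len(half_bits), 32):
        value = 0
        for pos, hb in enumerate(half_bits[start:start + 32]):
            value |= hb << pos
        longs.append(value)
    return longs


frame = bytes(55)                               # placeholder 55-byte packet
print(len(manchester_longs(frame)), "longs for", len(frame), "bytes")
```

For 55 bytes this gives 880 half-bits, i.e. 27.5 longs rounded up to 28, which matches the figure above.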
Andy
Unfortunately I'm not very clear on using the Video Shifter and don't have a scope so cannot just throw things at it and see what happens; it's going to have to be done 'theoretically' and by trial and error.
Is there any decent, detailed explanation as to how the Video Shifter, associated registers and commands work ? I know there's the Hydra book but I don't have that, and I don't really want to buy third-party books to get what other manufacturers would include in their datasheets.
www.parallax.com/dl/docs/prod/prop/PropellerDSv0.3.pdf
What I would try:
Set CTRA in PLL Mode to 20 MHz
VCFG Register:
- VideoMode to %01 (VGA)
- VGroup to the desired PinGroup
- VPins = %00000011 to enable only the lowest 2 Pins of the Group
VSCL Register:
- PixelClocks = 1
- FrameClocks = 32
WAITVID:
- The ColorPart (source) is fixed at %xxxxxxxx_xxxxxxxx_xxxxxx10_xxxxxx01
- The PixelPart (destination) holds the data that is shifted out: a 0 bit outputs 'Color 0' %xxxxxx01, a 1 bit outputs 'Color 1' %xxxxxx10
The harder Part is to generate the pre-built Bitstring
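The register settings listed above could be packed roughly like this. It is only a sketch: the bit positions (VMode at bits 30..29, CMode at bit 28, VGroup at bits 11..9, VPins at bits 7..0, PixelClocks at bits 19..12, FrameClocks at bits 11..0) are assumptions to be checked against the datasheet, and the helper names and the pin group number are hypothetical.

```python
def vcfg_value(vmode, cmode, vgroup, vpins):
    """Pack a VCFG value (assumed field positions, see lead-in)."""
    return ((vmode & 0b11) << 29) | ((cmode & 1) << 28) \
        | ((vgroup & 0b111) << 9) | (vpins & 0xFF)


def vscl_value(pixel_clocks, frame_clocks):
    """Pack a VSCL value: clocks per pixel and clocks per 32-pixel frame."""
    return ((pixel_clocks & 0xFF) << 12) | (frame_clocks & 0xFFF)


# VGA mode, two-colour mode, pin group 1 (hypothetical), lowest two pins on.
vcfg = vcfg_value(vmode=0b01, cmode=0, vgroup=1, vpins=0b00000011)
# One PLL clock per pixel, 32 pixels per WAITVID.
vscl = vscl_value(pixel_clocks=1, frame_clocks=32)

print(f"VCFG = ${vcfg:08X}")
print(f"VSCL = ${vscl:08X}")
```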
Andy
"The harder Part is to generate the pre-built Bitstring" -- I'm glad you say that, because I think that's the easy part ( famous last words ! ), so it looks like this could be quite straightforward in total.
Do you have a spare Propeller board? Viewport might provide a solution for you then. It's not an oscilloscope, but it is able to do logic analysis at high speeds.
hope this helps,
Marty
This is a fine application, I think mostly unknown, as the use of the counters is not very popular - or so it seems.
When I was looking for fast input some time ago I - obviously - tried counter mode 50, FRQA set to 1. This would add to PHSA every quarter of an instruction, so shifting PHSA 3 places should lead to a 20 MHz raw input. This was when I discovered the "shadow registers".
So I finally came up with the conservative
@10 Mhz
Now to get the Assembler Cog doing something useful with the 'bit array' it gets given. Then on to working out how to do the hardware interfacing. Moving out of my comfort zone now so progress may be a little slower than for the Spin side of things. When I have the code tidied up I'll publish what I currently have.
@ Marty : Viewport looks like it will be very useful. Thanks for the reminder.
@ deSilva : Your tutorial makes a lot more sense now than when I read it before, so I guess I'm starting to get to grips with it. Still slowly edging into the Assembler side of things, and I have to admit that I don't really understand the CTR, FRQ and PHS register usage yet. The problem with ROL PHSA for each bit looks to be that it will still be too tight to be able to fetch the data from Hub without disrupting the bit-stream timing. Off-loading 32-bits at a time for waitvid to churn out still looks the best bet.
But the loop will not allow incrementing
You can either unroll it (for 512 bits will need 128 loops x 3 instructions = 384 instructions)
As you output 8 bits in parallel you can also think of a PISO shiftregister at the output, shifting 8 @ 20MHz = 160 MHz!!! The clock for the shift register can be generated by the second counter...
@deSilva's Tutorial: I was concentrating on an introduction to the Machine Code; I assumed at that time that everybody was acquainted with Video, Counters, COG operations etc. I have learned in the meantime that this was a total misjudgment. But don't fear the counters, AN001 is excellent, e.g. this sketch gives you 99% of the information you need. The 1% missing is the table of the 64 timer/counter modes
Post Edited (deSilva) : 10/13/2007 2:11:30 PM GMT
I have found the Propeller easier to understand than other devices but the devil is always in the detail. For example, I know I can set a 20MHz clock for CTRA to drive the video shifter, I know that's done by setting PLLDIV in CTRA, but I have to determine what value PLLDIV to use. I know it is to create a division of "VCO", but nowhere in the Propeller Manual can I see anything which says what frequency "VCO" runs at or what "VCO" actually is ( although I know what a generic VCO is ). Reference to "VCO" only appears within the CTRA/CTRB description, same in the Data Sheet. Figure 7 of the Counter App Note shows VCO driven by a "Clock In", but it's not clear where that comes from or what frequency it is either.
It may be very obvious to others and will be to myself once I have done it ( and if I had a scope I could no doubt set PLLDIV, experiment and see what happens to fill in the missing gap of knowledge, but I don't have that option ). Those are the hurdles I find myself coming up against.
Like "How can one eat an entire elephant ?" ... one bite at a time, I'm slowly but surely getting there.
So with the counters: It is BIT 31 of PHSA driving the PLL nothing else -have a look at the sketch above.
This "BIT 31 trick" is also used for the PWM mode, which needs some more attention from the programmer but realizes one (!) PWM pulse just in the same way.
But back to the PLL. When you load FRQA with the value to be added each system clock (80 MHz), you will find that the sign bit 31 does not toggle as regularly as you need. It toggles regularly only when FRQA has just one bit set. This is due to "remainders", as adding to PHSA is equivalent to finding out "how often" FRQA goes into 2^32. A remainder will cause a "jitter" of bit 31.
The PLL is used (and has to be, especially when you have large values for FRQA) to equalize those irregularities. So after the PLL has stabilized you will get a very smooth frequency out of the timer.
Note also that PLL = 1 does not mean "no PLL", but that the PLL output is divided by 16, after being multiplied by 16.
Post Edited (deSilva) : 10/13/2007 2:26:04 PM GMT
With FRQA/PHSA + PLLx16 as the "VCO" that makes more sense, although why "VCO" is used instead of referring to PLLx16 output I don't know; maybe I'm not intuitive enough or just don't like guessing what unclear things may mean.
So ... to get my 50nS bit times ( 20MHz clocking frequency for waitvid ), with 5MHz+PLL16x system clock (80MHz), I set FRQA=$8000_0000>>4 (5MHz), after the PLLx16 that is 80MHz, so I use PLLDIV=%101 ( divide by 4 ) and that gives me the 20MHz I'm after.
Have I got that right ?
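The arithmetic can be checked with a few lines. This is a sketch under assumptions: the NCO output (bit 31 of PHSA) runs at CLKFREQ x FRQA / 2^32, the PLL multiplies that by 16, and PLLDIV %000..%111 divides the result by 128 down to 1. The function names are made up for illustration.

```python
CLKFREQ_HZ = 80_000_000

# Assumed PLLDIV encoding: %000 = VCO/128 ... %111 = VCO/1.
PLLDIV_DIVISOR = {0b000: 128, 0b001: 64, 0b010: 32, 0b011: 16,
                  0b100: 8, 0b101: 4, 0b110: 2, 0b111: 1}


def nco_hz(frqa, clkfreq_hz=CLKFREQ_HZ):
    """Frequency of PHSA bit 31 when FRQA is added every system clock."""
    return clkfreq_hz * frqa / 2**32


def pll_out_hz(frqa, plldiv, clkfreq_hz=CLKFREQ_HZ):
    """PLL output: NCO frequency times 16, divided according to PLLDIV."""
    return nco_hz(frqa, clkfreq_hz) * 16 / PLLDIV_DIVISOR[plldiv]


for frqa in (0x8000_0000 >> 4, 0x1000_0000):
    print(f"FRQA=${frqa:08X}: NCO {nco_hz(frqa) / 1e6} MHz, "
          f"PLL out {pll_out_hz(frqa, 0b101) / 1e6} MHz")
```

If this model is right, $8000_0000>>4 works out to 2.5MHz at the NCO (10MHz after PLLDIV=%101), and $1000_0000 would be the value that gives 5MHz and hence 20MHz out. Worth checking against the datasheet before trusting it.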
Added : I got brave. Connected direct to the hub ( 2 x 68R, no magnetics ) the Hub Link Led comes on, the Rx Led flashes once a second so something is getting through. Collision Led also comes on so it's not quite right. Now it's time for poke 'n' hope. In the meantime, if anyone with a scope wants to take a look at what's coming out it would be very helpful.
Post Edited (hippy) : 10/13/2007 5:14:55 PM GMT
We're sending bytes as bits, and the "waitvid colours,#bits" gives a mapping of one byte per Cog instruction, so potentially allowing a 496-byte packet, less the overhead of setting things up. It's even better than deSilva's un-rolling of the Hub Fetch Loop: just copy/convert the packet bytes at leisure into a sequence of waitvids and then let them run. That's rokicki's idea but using immediates, not registers.
BTW: This is exactly what Ariba suggested...
Post Edited (deSilva) : 10/13/2007 7:05:02 PM GMT
-Phil
The two parallel bits are TX+ and TX-, which must be the inverse of each other. For that, Color0 is %01 and Color1 is %10.
Andy
Post Edited (Ariba) : 10/14/2007 1:59:46 AM GMT
I expand each bit of the raw bit-stream to two bits Manchester encoded ( 16 original bits in a long / 1 word per output byte as you say ), stream them out ( as 32 sequential bits ) and the two colours 0/1 automatically give the bi-phase TX+/TX- lines.
I have to admit that when not thinking straight it is easy to confuse the Manchester encoding with the bi-phase signal production.
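To keep the two ideas apart, here is a small model of what the shifter would do with one long in two-colour mode. It is a sketch that assumes pixels are taken LSB first, a 0 pixel selects byte 0 of the colours and a 1 pixel selects byte 1; the function name is hypothetical.

```python
def waitvid_two_colour(colours, pixels, frame_clocks=32, pin_mask=0b11):
    """Model the pin states produced by one WAITVID in two-colour mode.

    Assumptions: one pixel per clock, pixels shifted out LSB first,
    pixel value 0 -> colour byte 0, pixel value 1 -> colour byte 1.
    """
    states = []
    for i in range(frame_clocks):
        select = (pixels >> i) & 1
        colour = (colours >> (8 * select)) & 0xFF
        states.append(colour & pin_mask)
    return states


COLOURS = 0b00000010_00000001       # Color1 = %10, Color0 = %01

# Manchester half-bits 0,1,1,0 (data bits 1 then 0, first half-bit in bit 0).
for state in waitvid_two_colour(COLOURS, 0b0110, frame_clocks=4):
    print(f"pins = %{state:02b}")
```

The Manchester encoding lives entirely in the pixel bits; the colours only make sure the two pins are always the opposite of each other.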
Yup, I completely misunderstood the intent of those two bits. It didn't occur to me that it was for the biphase lines. When I saw the word "inverted", I was thinking phase instead of polarity!
-Phil
"A Token for Ship-Boys, or Plain-Sailing, made more plain."
The first use of it in a figurative sense, meaning simple 'easy and uncomplicated' comes in Fanny Burney's Camilla, 1796:
"The rudiments, which would no sooner be run over, than the rest would become plain sailing."
From http://www.phrases.org.uk/meanings/plain-sailing.html
I am also thinking about receiving Ethernet with a Propeller (with bit-banging). The question is: how precise is the 10MHz timing of a sending device, and is there a standard for that? If it is precise enough, it should be possible to sample the RX+ signal at only 10MHz, exactly in the middle of the second half of each Manchester coded bit. The Cog has to wait for the preamble and then samples a fixed number of bits into Cog RAM. After the whole packet is received, it can be decoded and transferred to Hub RAM.
Perhaps somebody knows the answers to these questions:
1) Is the source of the Manchester modulation timing always a crystal?
2) What is the minimal Vpp level on the RX+,RX- signals?
3) What is the maximal size of a packet for tcp/ip?
4) How long is the minimal time between 2 packets?
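A rough way to put numbers on the timing question, as a sketch under assumptions: the 10BASE-T bit rate is held to ±0.01% (100 ppm), a maximum untagged frame is 1518 bytes after an 8-byte preamble/SFD, and the minimum gap between frames is 96 bit times. The receiver tolerance of 100 ppm is a guess at the Propeller's own crystal, and the function names are made up.

```python
BIT_RATE_HZ = 10_000_000
PREAMBLE_AND_SFD_BYTES = 8
MIN_FRAME_BYTES = 64
MAX_FRAME_BYTES = 1518          # untagged frame, header and CRC included
INTERFRAME_GAP_BITS = 96


def worst_case_drift_bits(frame_bytes, sender_ppm=100, receiver_ppm=100):
    """Sampling-point drift, in bit times, if the receiver never resyncs."""
    bits_on_wire = (PREAMBLE_AND_SFD_BYTES + frame_bytes) * 8
    return bits_on_wire * (sender_ppm + receiver_ppm) / 1_000_000


def interframe_gap_us():
    """Minimum idle time between two frames, in microseconds."""
    return INTERFRAME_GAP_BITS / BIT_RATE_HZ * 1_000_000


for size in (MIN_FRAME_BYTES, MAX_FRAME_BYTES):
    print(size, "byte frame: drift up to",
          round(worst_case_drift_bits(size), 4), "bit times")
print("interframe gap:", round(interframe_gap_us(), 2), "us")
```

On these numbers a fixed 10MHz sample clock would probably hold for short frames but could slip by more than a bit over a full-size frame, so some resynchronisation on the mid-bit edges would seem to be needed.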
Andy
The modulation timing is always crystal controlled.
Vpp ??
Packet size and timing are variable (a Prop sender can control them)
Sapieha
Maybe it helps
Sapieha