Help parsing this data
Brian Carpenter
Posts: 728
I am using the extended full duplex serial object.
I need to figure out how to parse the following data. The stuff in bold is what I am looking to store.
IF is UP
DHCP=ON
IP=192.168.0.6:2000
NM=255.255.255.0
GW=192.168.0.1
HOST=70.40.212.215:80 (two separate variables in this row: one before the colon and one after)
PROTO=TCP,
MTU=1460
BACKUP=0.0.0.0
Any ideas? Right now I am using the = as the delimiter, but even that doesn't seem to be working.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
It's Only A Stupid Question If You Have Not Googled It First!!
Comments
What's wrong with using the "=" to delimit the first part of each line that you don't want to keep?
Why not use CR to delimit the last part of the line that you want to keep?
If you know which lines contain a ":", why can't you use that to delimit the IP address, then use CR to delimit the port?
You can switch the delimiter character every time you call rxDec, so you could pick apart the IP addresses once you've gotten past the "=".
You didn't say what you wanted to do with the information, if you wanted them as strings or as numbers. The code above makes them strings.
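As a rough illustration of the delimiter-switching idea, here is a sketch of the logic in Python rather than Spin (it is not the listing referred to above). `read_until` and `parse_host_line` are hypothetical helpers standing in for repeated serial reads with a different delimiter each time; a `StringIO` stands in for the serial port.

```python
import io

def read_until(stream, delimiters):
    """Read one character at a time until a delimiter (or end of input) is seen.

    Returns (text_before_delimiter, delimiter). The delimiter is "" at end of input.
    """
    chars = []
    while True:
        ch = stream.read(1)
        if ch == "" or ch in delimiters:
            return "".join(chars), ch
        chars.append(ch)

def parse_host_line(stream):
    """Pick apart a line such as HOST=70.40.212.215:80<CR>."""
    read_until(stream, "=")              # skip the tag we don't want to keep
    ip, _ = read_until(stream, ":")      # the colon ends the IP address
    port, _ = read_until(stream, "\r")   # the carriage return ends the port
    return ip, int(port)

print(parse_host_line(io.StringIO("HOST=70.40.212.215:80\r")))
```

This version only works if you already know which line is coming next, which is the limitation the order-independent approach later in the thread avoids.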
This has the advantage of not caring which order the arguments arrive in. Note that, as written, it's not protected from buffer overruns; you'd probably want to guard the buffer++ statement at the end of getstring to prevent this, either by limiting the length to 20 or by aborting if it goes over.
Post Edited (localroger) : 8/30/2009 11:52:34 PM GMT
I have looked this over several times. I am sorry I have not responded, but I just don't understand it.
What section(s) of the code are you having trouble with? The case statement, or what's in the getstring() repeat loop?
If you break it down, we can clarify it.
- Howard
1. The CON block is a little known but very useful SPIN trick to generate arbitrary state values; as written next_dhcp will take the value 1, next_ip the value 2, etc. This "enum" facility is used to create values that have to be different but you don't care just what they are. We'll use them shortly.
2. Skipping to the end, getstring does what SRLM's getstring does, but without his 'ignore' facility. It accepts characters until it finds one that is in the endstr parameter; characters that aren't in endstr get stuffed into the buffer (byte[buffer] := rxchar / buffer++). To tell whether each char is in endstr, we must scan endstr until we reach its null terminator, comparing each byte stored within to the rxchar we just received. If we find a match, we stick a null terminator at the end of buffer so the prop's string handlers will know where it ends, and we return the terminating char so the calling code will know what it is.
3. Back in main, we call getstring with an endstr consisting of the characters "=", ":", and 13 (carriage return). This will terminate input NO MATTER WHAT if we're on a string we find interesting.
4. Once we have a string, we do CASE by tc (the terminating char).
5. If the tc was "=" then what came just before was the data tag -- "DHCP", "IP", etc. We check for various interesting values and set nextis appropriately. Note that at this point we don't have a value yet, but we know what should be coming on the rest of the line; that's what nextis determines.
6. If the tc was ":" or CR, we do CASE by nextis. For example, if nextis is next_dhcp it means we just detected DHCP= at the start of the line, so we know the string sitting in BUFFER is the string ON; we would want to then check its value or strcopy it somewhere useful.
7. The big wrinkle here is HOST, which has two arguments; once NEXTIS is set to next_host (there is an error there in the listing, sorry, it's not _hostip), if we get a string with termchar ":" that means we just got the IP, but if we get termchar=CR it means we just got the port. Process BUFFER appropriately.
If we receive a tag string we don't care about such as PROTO or MTU, nextis will get set to next_undefined which won't match anything when we check for following strings.
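The steps above can be sketched roughly as follows. This is a Python illustration of the logic only, not the Spin listing from the post: the `Next` enum plays the role of the CON block, `get_string` mirrors getstring, and the result dictionary and its key names are made up for the example. Two things here are assumptions beyond what the post describes: IP= is treated like HOST= (address before the colon, port after), and nextis is reset after each carriage return so a line with no "=" is never mistaken for a value.

```python
import io
from enum import IntEnum

class Next(IntEnum):
    """Arbitrary distinct state values, like the CON-block enum."""
    UNDEFINED = 0
    DHCP = 1
    IP = 2
    NM = 3
    GW = 4
    HOST = 5

# Tags we care about; anything else maps to Next.UNDEFINED.
TAGS = {"DHCP": Next.DHCP, "IP": Next.IP, "NM": Next.NM,
        "GW": Next.GW, "HOST": Next.HOST}

def get_string(stream, endstr):
    """Collect characters until one found in endstr; return (text, terminator)."""
    buf = []
    while True:
        ch = stream.read(1)
        if ch == "" or ch in endstr:
            return "".join(buf), ch   # terminator is "" at end of input
        buf.append(ch)

def parse(stream):
    result = {}
    nextis = Next.UNDEFINED
    while True:
        text, tc = get_string(stream, "=:\r")
        if tc == "":
            break                                  # no more input
        if tc == "=":
            # text is the tag; remember what the rest of the line will be
            nextis = TAGS.get(text, Next.UNDEFINED)
        elif nextis != Next.UNDEFINED:
            name = nextis.name.lower()
            if nextis in (Next.IP, Next.HOST):
                # ":" means we just got the address, CR means the port
                name += "_addr" if tc == ":" else "_port"
            result[name] = text
        if tc == "\r":
            nextis = Next.UNDEFINED                # assumption: reset per line
    return result

data = ("IF is UP\rDHCP=ON\rIP=192.168.0.6:2000\rNM=255.255.255.0\r"
        "GW=192.168.0.1\rHOST=70.40.212.215:80\rPROTO=TCP,\rMTU=1460\r"
        "BACKUP=0.0.0.0\r")
print(parse(io.StringIO(data)))
```

Because each value is identified by the tag that preceded it, the lines can arrive in any order, and tags such as PROTO or MTU fall through without being stored.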
If your transmitting device sends LF after CR, you might also need to add a line to eliminate those, because they will mess up your detection; this is most profitably done before putting the byte in the buffer by replacing byte[buffer] := rxchar with if rxchar <> 10 / byte[buffer] := rxchar. The other logic will still work with that in place. It would also be better practice to use a counter and limit the growth of buffer; I put c in the local vars and didn't use it, so you can start before the initial repeat with c := 0 (you need this since local vars aren't initialized to zero), then replace buffer++ with c++ / if c < 20 / buffer++. This will keep you from overwriting your program if you connect the input to a source of garbage data that doesn't have terminating characters.
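In the same spirit, here is a Python sketch of those two safeguards (again only the logic, not the Spin code). `MAX_LEN` is a name made up for the example, with 20 taken from the suggestion above. One difference to be aware of: this version simply drops characters past the limit, whereas the Spin change described above keeps writing to the last buffer position without advancing.

```python
import io

MAX_LEN = 20  # hypothetical name; 20 is the limit suggested in the post

def get_string(stream, endstr, max_len=MAX_LEN):
    """Like getstring, but skips line feeds and caps the stored length."""
    buf = []
    while True:
        ch = stream.read(1)
        if ch == "" or ch in endstr:
            return "".join(buf), ch      # terminator is "" at end of input
        if ch == "\n":
            continue                     # ignore LF (byte value 10) after CR
        if len(buf) < max_len:
            buf.append(ch)               # silently drop anything past the cap

# A leading LF left over from the previous line's CR/LF is ignored:
print(get_string(io.StringIO("\nDHCP=ON\r"), "=:\r"))
# Garbage with no terminator in sight cannot grow the buffer past max_len:
print(len(get_string(io.StringIO("x" * 50 + "\r"), "\r")[0]))
```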
Let me know if I can clarify anything else...
on edit -- edited the example to add some of these bells and whistles
Post Edited (localroger) : 8/30/2009 11:53:07 PM GMT