SNTP suddenly returning 2/1/2014 00:01:58 all the time
xanatos
Posts: 1,120
It may be an issue with only the Smile network I am on with these things, but after months of running perfectly, the SNTP time thing is suddenly only returning 2/1/2014 as the date, and something like 00:58:08 (it varies). I've tried MANY different servers, all the same.
I'm guessing that somebody loaded some firewall filter or something... maybe blocked UDP?
Has anyone seen this 2/1/2014 thing before?
I'm in a location where I do not have access to a different network, so I can't check from outside their network.
All the units I have running started exhibiting the same behavior at the same time. All on the same network.
Thanks for any thoughts.....
Dave
I'm guessing that somebody loaded some firewall filter or something... maybe blocked UDP?
Has anyone seen this 2/1/2014 thing before?
I'm in a location where I do not have access to a different network, so I can't check from outside their network.
All the units I have running started exhibiting the same behavior at the same time. All on the same network.
Thanks for any thoughts.....
Dave
Comments
Not sure what you mean by not being able to "check from outside their network."
Are you using a time server they have deployed inside their firewall? And if you have multiple spinnerets using SNTP, are they all trying to connect to the same time server? If so, it's probably the time server that has issues, but that doesn't seem very likely.
If the RTC loses it's time for some reason (like an extended power outage that lasts long enough for the super-capacitor to die off) it will start up at some default date/time. If you're not successfully connecting to the ntp server and getting the current date/time, then that could explain it.
By "can't access from an outside network", I mean:
All of my "boxes" are on an intranet (192.168.0.xx) within a large corporate network that spans a few states.
When I develop my boxes, I do so from my home internet connection which has a regular "real" IP address that isn't in the 192.168..... range. That would be what I mean by an "outside network".
I suspect that the SNTP stuff would work just fine from my home network. I believe the issue I am having may be specific to the corporate intranet my boxes are installed on.
Unfortunately, where I am physically located is a 5 hour drive from my home network, and I know of no other network upon which I can place my box with its dedicated IP address, in order to test.
I am assuming that anyone else running SNTP 2.01 is NOT seeing 2/1/2014 as a returned date from the time servers...
Mainly, I am wondering where that specific date is coming from. What is populating those fields with that data? I do not see it in the httpserver.spin code, nor in the SNTP spin code, so it has to be originating externally. I'm thinking if I can find out where that 2014 date is originating, I can fix the issue....... or not.
Help appreciated.
D
I even tried a few different SNTP servers from the link below with the same results. I do know for a fact that the original Perl code did work for the Month of December and didn't 'jump' to January reporting negative date values. ...So something has changed it seems.
http://tf.nist.gov/tf-cgi/servers.cgi
Original Perl Script: USED to DEBUG
Latest Perl Script: USED to DEBUG
Output of latest Perl Code:
I'm not sure if there is something goofy with the receive buffer or just a matter of formatting the SNTP packet.
Off to work. I'll continue troubleshooting later.
The Day 229 and Year 222 values from Mike G's debug results ring a bell for some reason. I had a problem like that way back... and I'm having trouble remembering what the deal was. I'm thinking it had something to do with padding values, but I'm not sure. I'll dig through notes and see what I come up with.
I'm travelling now in fact, so I can't yet see if anything's changed at the NIST site's time server output - but I have some units that send me emails and are scheduled to update their time settings at 2:05am, so I'll see what they do at that time.
Right now, the time seems correct on them, just the date is now 1/31/2014. The date is going backward!
Surely The Doctor is afoot somewhere in all this!
Dave
I see from Beaus' Perl output image that it got 3595111077 as the ntp time value. That corresponds to 2013,dec,04, at 01:57:57 GMT. Ok. is GMT - 6 so looks like he ran that at 19:57 on dec 3rd (local time, not adjusting for DST) - Dunno when he ran the code, but it's probably the correct time value. It's something in the display human time method.
I'll take a look at the SNTP and DisplayHumanTime methods and see if anything stands out to me - but it's not been obvious so far.
Dave
The ntp value at
Dec 02, 2013 @ 23:59:59 is 3595017599 or %1101_0110_0100_0111_1001_1101_0111_1111
and
Dec 03,2013 @ 00:00:00 is 3595017600 or %1101_0110_0100_0111_1001_1101_1000_0000
The LSB is all that changed.
I see that the "transmit timestamp" begins at byte 44 thru 51 of the udp response packet from the server.
I don't seem to be having any issues with time or data (re-synchronized several times today to check) and my SNTP object uses byte 44 thru 51 of the received packet to do it's magic.
But the SNTP version 2.01 on the google repository is using bytes 48 thru 55. That's confusing me.
"... It's something in the display human time method. " - Agreed, but I am not sure at the moment to get my head around it... perhaps this weekend I can iron it out.
No worries at all. Errr.. for me, at least! (Just teasin' xanatos!)
@Dave,
Wait- scratch my previous response. The transmit time stamp should be in bytes 40 thru 47 of the udp response packet from the server. I didn't look at the RFC 4330 close enough. And if that's the case, I'm REALLY confused because my Smile shouldn't be working and yours shouldn't either.
Here's the rfc: http://tools.ietf.org/html/rfc4330
Here's my DisplayHumanTime method:
And my SNTP 2.01 is attached...
Oh, and this is in my VAR section of httpserver.spin:
And this is in my initialization section:
Scratching my head still...
I am in awe, is all I can say.
I'm home from my tour and back in a development environment where I can play with this stuff again. For now, all but one of the systems I have in the field have been converted to manual time setting. I have code from two of you right now to check out, but until I understand better how the SNTP packet works and how "Human Time" is derived from the packet, I am hesitant to re-enable SNTP on client devices. Of course when I do, it'll be setting the clock once a month, with an end-user control for them to select manual or automatic time setting, just in case the implemented solution fails under some as-yet unforeseen condition again.
If I find anything useful (unlikely) I'll post it here.
Thanks all for your help so far, especially in verifying I was NOT the only one who was seeing this. I spent HOURS debugging my own program changes assuming I had messed something up, since the time stamping had been working for MONTHS prior to the day the timestamping stopped working... :-)
If anyone suddenly has a brilliant thought on these codes - it'd sure be nice!
Dave
Here's the link to the RFC editor website- every RFC you can think of is there. http://www.rfc-editor.org
And here's RFC 5905. http://www.rfc-editor.org/info/rfc5905
In the code I sent you, check the comments in each of the methods starting with method CreateUDPtimeheader. It's called before ANY of the other methods are called in the object.
The SNTP object (all of our versions) has two purposes.
First- the CreateUDPtimeheader method is the method you call to "build" a UDP packet with the appropriate stuff in it to SEND TO a time server. That's ALL this method does. Elsewhere in your program, you're using another method to actually transmit this packet through the wiznet chip on the spinneret.
Second- ALL the remaining methods except HumanTime in this object pull data out of the UDP packet response that we get back from the time server once we contacted it. You're program calls GetTransmitTimestamp to pull the appropriate bytes of data out of the UDP response packet and "do" something with it. You could call GetReceiveTimestamp if you'd prefer, and the idea behind it is the same- to extract specific bytes from the UDP response packet from the server and "do" something with it.
So... once you call GetTransmitTimestamp, what do you "do" with the data it pulls out of the response packet from the time server? Well, the bytes it pulls out of the UDP response packet from the time server have a specific format to the data. It's 64 bits (8 bytes) of data that represent the NTP TIME value at the time the server responded to the initial packet you sent to it. Basically, bytes 40 thru 47 contain the 64 bits of data. The first 32 bits (4 bytes) represent the SECONDS value. The last 32 bits (4 bytes) is the FRACTION value. We're mainly interested in the SECONDS value. It's the number of seconds that has elapsed since Jan 01, 1900 at 00:00:00.
If you look at your GetTransmitTimestamp method, you'll see it's doing some stuff with bytes 40 thru 47...........
Once we get those bytes, we still have to "do" something with them.
The GetTransmitTimestamp does something with those bytes by calling the Humantime method.
The sole purpose of the Humantime method is to take that seconds value and "decode" it to several bytes. Those bytes represent the year, day, month, etc. etc... and their values are numbers we can identify with.
The problem you're having with this object is in the Humantime method or the GetTransmitTimeStamp method. Either the GetTransmitTimeStamp method is using the wrong bytes, or Humantime isn't decoding those bytes correctly, or both.
The version I sent you has thus far worked flawlessly for me, including the daylight savings time feature. I re-synchronize with 1 of 3 timer servers every 24 hours (If the first one doesn't respond, try the next, then the next if needed.) But, it's not as savvy as the other versions in the way it decodes the time- it's a little slower and uses brute force all the way through. It's reliable, but not elegant.
Keep these "broad based concepts" in mind while you're studying the various versions of the sntp object. When in doubt, just ask yourself what the goal is, and these concepts will help keep you straight.
(Look at page 18 of RFC 5905 to see what bits and bytes represent what in the packet we get from the time server.)
Oh, and please accept my sincerest apologies if I've written anything incorrectly in this response. I hope Mike or Beau (or anyone else!) will quickly correct me if I've gotten something wrong here. For your sake AND mine!
The new method is currently running on the Spinneret manual site. We'll see how it performs over the next week. If all goes well, I'll refactor and update the repo.
That's made me want to make a demo program to pull the bytes out of the whole packet, just to be able to comprehend stuff a little better.
First, in SNTP 2.01, am I misreading something in the GetTransmitTimeStamp Method: Long1 and Long2 get written to by bytes 48-51 and 52-55 respectively, but only Long1 gets passed to HumanTime. Yes/no?
And lastly, I think I found what amounts to probably an utterly meaningless bug, but in SNTP 2.01,
PUB GetPoll uses byte[BufferAddress][10]
and
PUB GetPrecision ALSO uses byte[BufferAddress][10]
From what I read at the head of SNTP 2.01 - GetPrecision should be using byte 11.
I changed this in my code, but of course it made no difference to anything that I could see, anyway. Am I correct in this, or is there a reason that they should both use byte 10?
Thanks,
Dave
So after trying a bunch of different stuff and seeing how it changes the output date string (the time string has been good), I found that just adding a -1 to the line that calculates years to get a line of Years := (Days / 365) - 1 results not only in the year going from 2014 back to 2013, but also the month and date are now good!
Now I in no way believe that this is a "fix" - because I suspect that this will work until some future date in which the timestamp returned by the server will calculate out wrong with that -1 in there. But it does make the whole string good - for now - so I feel, perhaps wrongly, that I am getting closer to understanding where stuff is happening in these codes.
In previous posts I have seen actual return data specified as it would appear in the Transmit Time Stamp, such as:
Dec 02, 2013 @ 23:59:59 is 3595017599 or %1101_0110_0100_0111_1001_1101_0111_1111
and
Dec 03,2013 @ 00:00:00 is 3595017600 or %1101_0110_0100_0111_1001_1101_1000_0000
If I were to manually assign/overwrite these values into the values received for TransmitTimeStamp in SNTP 2.01, would I be able to manually simulate various times and dates to test code for proper function at various random future dates?
Thanks.
Here's the full HumanTime Method from SNTP 2.01:
long1 = yes. Only long1 gets passed to humantime.
GetPrecision= yes, it should be using byte[11]. I sent you an email earlier about this.
If you're using that method, it matters. If you're just using GetTransmitTimestamp, it doesn't. The method has the same goal as the GetTransmitTimestamp: to pull specific data out of the udp packet response from the time server and let you do something with it.
From what you've said about your application, you are NOT using that method. The only three methods you use at all from the sntp object are the CreateUDPTimeHeader, GetTransmitTimestamp and Humantime.
The rest of the methods can be commented out- they aren't needed at all to get the rtc clock updated.
Lastly- you can easily create a test program for your HumanTime method.
I'm trying to bull my way through this code... I'm determined to figure out HOW it works, rather than rely on someone else to fix it for me :-) So as yet I haven't yet really looked at the other examples - I'm saving those for when I run out of options.
What I did is as I described above - I manually substituted the TransmitTimestamp value for the date code of 3595017600. I checked the binary you had given for that date/time and verified it as correct, so I replaced the BufferAddress values with the binary as follows:
And the date returned by HumanTime is 12/02/2013 18:59:59 - which should be Dec 03,2013 @ 00:00:00, correct?
I verified the date as 12/3 by my own calculations as shown below. So the issue must be in the HumanTime method - would you concur?
My math (numbers in parentheses are actual calculator values, shown as truncated since the Propeller doesn't pay any attention to anything after the decimal point):
Dec 03,2013 @ 00:00:00 is 3595017600 or %1101_0110_0100_0111_1001_1101_1000_0000
3595017600 / 31,536,000 = 113(.9972602739726) years = 113 years in Propeller Terms
LYrs = Years/4 = 28(.49931506849315) = 28 in Propeller Terms
Days = seconds / 86400 secs/day = 41,609
Days THIS year = Days - (years * 365) = 113 * 365 = 41,609 - 41,245 = 364,
Less Leap Year day count = 364 - 28 = 336
Day 336.
January 31
F 28
M 31
A 30
M 31
J 30
J 31
A 30
S 30
O 31
N 30 333
D 31
336 is Dec 3
Add years to 1900 = 1900 + 113 = 2013
3595017600 is December 3 2013.
The date and time you got of 12/02/2013 18:59:59 is almost correct. It should have returned 12/02/2103 @ 19:00:00 because I'm pretty sure you have your time zone of -5 which is the offset parameter passed to the humantime method. That's why it didn't return Dec 03. It subtracts 5 hours for you.
Looking through your math stuff now. Going over and over the various revisions and trials is getting old
If you add something like
you'll have your remaining seconds that are still unaccounted for. That value can be used to figure out the hour, minute, second.
Thanks, that's where I was looking to head next, your timing is perfect.
On that 86400 issue, this line:
Is shifting right 7 bits and dividing by 675 the exact same as dividing by 86,400? When I tried dividing by 86400 straight, I got a date of 1899. I'm assuming this is because 86,400 is greater than a 16 bit word, or...???
Thanks. I appreciate your watching this thread. I've been going through this pretty much all day now. I know it gets OLD fast! :-)
Dave
And uhhh... no. They're not quite the same.
Suppose seconds = 20_000_000
seconds / 86400 = 231
(seconds >>= 7) / 675 = 231 .... and +1 = 232