Description
Users have occasionally reported spurious crashes caused by Kudu believing that another node has a timestamp from the future. After some debugging, I realized that the issue is that we currently capture the 'STA_NANO' flag from the kernel only at startup. This flag indicates whether the kernel's sub-second timestamp is in nanoseconds or microseconds. We initially assumed this was a static property of the kernel. However, it turns out that ntp can toggle this flag at runtime in certain circumstances. Given this, it was possible for us to interpret a number of nanoseconds as if it were microseconds. Since the sub-second field can be as large as 999,999,999 ns, reading it as microseconds yields a timestamp up to 1000 seconds in the future.
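
One possible direction for a fix, sketched under assumptions: read the status word on every clock read and interpret the sub-second field according to the flag returned by that same call, rather than a value cached at startup. The helper names `SubSecondToMicros` and `NowMicros` are hypothetical and are not taken from the Kudu source; the sketch uses the Linux `adjtimex()` call in read-only mode.

```cpp
#include <sys/timex.h>

#include <cstdint>
#include <cstring>

// Hypothetical helper: convert the kernel's sub-second field to microseconds.
// 'status' must come from the same adjtimex() call that produced
// 'sub_second', so a runtime toggle of STA_NANO is always honored.
inline int64_t SubSecondToMicros(int64_t sub_second, int status) {
  return (status & STA_NANO) ? sub_second / 1000 : sub_second;
}

// Hypothetical helper: current wall-clock time in microseconds since the
// epoch, or -1 if adjtimex() fails.
inline int64_t NowMicros() {
  struct timex tx;
  std::memset(&tx, 0, sizeof(tx));  // modes == 0: read-only query
  if (adjtimex(&tx) == -1) {
    return -1;
  }
  // tx.time.tv_usec holds nanoseconds when STA_NANO is set in tx.status,
  // and microseconds otherwise.
  return static_cast<int64_t>(tx.time.tv_sec) * 1000000 +
         SubSecondToMicros(tx.time.tv_usec, tx.status);
}
```

With a stale cached flag, a sub-second value of 999,999,999 ns would be added as 999,999,999 us, which is the roughly 1000-second jump described above. Checking `tx.status` on each call avoids relying on the flag staying constant.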