Clock synchronization quality on Windows Azure?

2023-09-07 16:21:36 Author: 年華已逝ァ

I am looking for quantitative estimates of clock offsets between VMs on Windows Azure, assuming that all VMs are hosted in the same datacenter. My guess is that the average clock offset between one VM and another is below 10 seconds, but I am not even sure that this is a guaranteed property of the Azure cloud.

Does anybody have quantitative measurements on this?

Recommended answer

I finally decided to run some experiments of my own.

A few facts concerning the experiment protocol:

Instead of looking for the offset to a reference clock, I simply checked clock differences between Azure VMs and Azure Storage. The clock time of Azure Storage was retrieved using the HTTP hack pasted below. Measurements were done within the North Europe datacenter of Azure with 250 small VMs. Latency between storage and VMs, measured with Stopwatch, was always lower than 1ms for minimalistic unauthenticated requests (basically, the HTTP requests came back with 400 errors, but still with Date: available in the HTTP headers).
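To make the protocol concrete, here is a minimal sketch of how a single offset could be computed from one request. It is an illustration and not the code used for the experiment: the class name `ClockOffset`, the method `Estimate`, and the midpoint assumption (the remote timestamp is taken halfway through the round trip) are all hypothetical. Note also that an HTTP Date header carries whole seconds only, so a single estimate cannot be finer than about one second.

```csharp
using System;

public static class ClockOffset
{
    // Estimates (remote clock - local clock) from one request/response exchange.
    // localSendUtc / localReceiveUtc: local UTC clock readings taken just before
    // sending the request and just after receiving the response.
    // remoteUtc: the time reported by the remote side (e.g. the 'Date:' header).
    public static TimeSpan Estimate(DateTime localSendUtc, DateTime localReceiveUtc, DateTime remoteUtc)
    {
        // Assumption: the remote timestamp was taken halfway through the round trip.
        long midTicks = localSendUtc.Ticks + (localReceiveUtc.Ticks - localSendUtc.Ticks) / 2;
        var localMidpointUtc = new DateTime(midTicks, DateTimeKind.Utc);
        return remoteUtc - localMidpointUtc;
    }
}
```

With sub-millisecond latency, as reported above, the midpoint correction is negligible next to the one-second resolution of the header; it matters more when the round trip is slow.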

Results:

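Roughly 50% of the VMs have a clock offset to the storage greater than 1s. Roughly 5% of the VMs have a clock offset to the storage greater than 2s. Fewer than 1% of the clock offset observations are close to 3s. A handful of outliers are close to 4s. The clock offset between a single VM and the storage typically varies by +1/-1s from one request to the next.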
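Percentages like these can be derived from a list of per-request offsets by counting how many exceed a given threshold in absolute value. The following is a small sketch of that summary step; `OffsetStats` and `FractionAbove` are hypothetical names, and the raw measurements from the experiment are not reproduced here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class OffsetStats
{
    // Fraction of observations whose absolute offset exceeds the threshold.
    public static double FractionAbove(IEnumerable<TimeSpan> offsets, TimeSpan threshold)
    {
        var list = offsets.ToList();
        if (list.Count == 0)
            throw new ArgumentException("No observations.", nameof(offsets));

        // TimeSpan.Duration() is the absolute value, so the sign of the offset is ignored.
        return list.Count(o => o.Duration() > threshold) / (double)list.Count;
    }
}
```

Calling `FractionAbove` with thresholds of 1s, 2s and 3s over the same set of observations would give the kind of breakdown listed above.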

So technically, we are not too far from the 2s tolerance target, although for intra-datacenter sync you don't have to push the experiment far to observe offsets close to 4s. If we assume a normal (aka Gaussian) distribution for the clock offsets, then I would say that relying on any clock threshold lower than 6s is bound to lead to scheduling issues.
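Under a normality assumption, one common way to turn observed offsets into a tolerance is to take the mean plus k standard deviations. The sketch below shows that calculation only. The names `OffsetTolerance` and `KSigmaBound` and the choice of k are assumptions, and the 6s figure above is a judgment call that this code does not derive.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class OffsetTolerance
{
    // Returns |mean| + k * (sample standard deviation), in seconds.
    public static double KSigmaBound(IEnumerable<double> offsetsSeconds, double k)
    {
        var xs = offsetsSeconds.ToArray();
        if (xs.Length < 2)
            throw new ArgumentException("Need at least two observations.", nameof(offsetsSeconds));

        double mean = xs.Average();
        // Unbiased sample variance (divides by n - 1).
        double variance = xs.Sum(x => (x - mean) * (x - mean)) / (xs.Length - 1);
        return Math.Abs(mean) + k * Math.Sqrt(variance);
    }
}
```

A larger k gives a more conservative threshold. Because the observed outliers near 4s suggest heavier tails than a Gaussian would predict, a bound computed this way is probably best treated as a lower limit on a safe tolerance.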

using System;
using System.Globalization;
using System.IO;
using System.Net.Sockets;

/// <summary>
/// Substitute for proper NTP (Network Time Protocol)
/// when UDP is not available, as on Windows Azure.
/// </summary>
public class HttpTimeChecker
{
    public static DateTime GetUtcNetworkTime(string server)
    {
        // HACK: we can't use WebClient here, because we get a faulty HTTP response.
        // We don't care about the HTTP error; the only thing that matters is the
        // presence of the 'Date:' HTTP header.
        string response;
        using (var tc = new TcpClient())
        {
            tc.Connect(server, 80);

            using (var ns = tc.GetStream())
            {
                var sw = new StreamWriter(ns);
                var sr = new StreamReader(ns);

                // Minimal HTTP/1.0 request: each header line ends with CRLF,
                // and an empty line terminates the request.
                string req = "";
                req += "GET / HTTP/1.0\r\n";
                req += "Host: " + server + "\r\n";
                req += "\r\n";

                sw.Write(req);
                sw.Flush();

                // ReadToEnd returns once the server closes the connection.
                response = sr.ReadToEnd();
            }
        }

        foreach (var line in response.Split(new[] { '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries))
        {
            if (line.StartsWith("Date: ", StringComparison.OrdinalIgnoreCase))
            {
                // Parse independently of the machine's culture and normalize to UTC.
                return DateTime.Parse(line.Substring(6), CultureInfo.InvariantCulture,
                    DateTimeStyles.AdjustToUniversal);
            }
        }

        throw new ArgumentException("No date to be retrieved among HTTP headers.", "server");
    }
}
 