Extending the range of time

Monday May 02, 2022 by Eddy

Until Qt 6.2, QDateTime's ability to take time-zone adjustments – both seasonal daylight-saving time and occasional changes (on the whims of politicians) to a zone's standard offset from UTC – into account was limited to the years 1970 through 2037, with some kludges in place to extrapolate beyond 2037. As the default assignee for time-related bugs, I'd long wanted to fix this, to use such information as we do have available. So – finally, once Qt 6 had been released – I made that change and discovered what it broke.

I was, of course, expecting things to break: the restriction wouldn't have been there if there weren't problems with accessing time-zone data outside that range and, in any case, extending the range meant some code would be exercised in ways it never had been before. So I was ready for breakage and fixed various things before unleashing the change on unsuspecting colleagues and users. None the less, of course, there were subtler problems that eluded my testing and ended up surfacing as bugs once this change was released.

I discuss all of the following in terms of local time (using Qt::LocalTime as the spec parameter to QDateTime's constructor; this is the default spec), as that's more commonly used, but some of it applies also to times specified with respect to a particular time-zone (passing a QTimeZone parameter to the constructor, instead of optional spec and offsetSeconds). In particular, as some local time code paths now exercise time-zone code paths as fall-backs, bugs in the time-zone code have also shown up (and I've fixed them).

Why was it limited?

Before I embark on describing the consequences of the change, I'd better explain why the range used to be limited.

There's a type called time_t used by system APIs for mapping between local time and UTC (Coordinated Universal Time, the modern replacement for Greenwich Mean Time); it counts the number of seconds since the start of 1970 in UTC. There's another type, called struct tm, that describes a broken-down time, with various fields for the year, month, day, hour, minute and second, along with a flag to indicate whether the daylight-saving status of the time is known and, if so, whether it's in daylight-saving time or in standard time. There are various standard functions to convert between these two representations of a point in time. While time_t is always a UTC time, a struct tm can represent either local time or UTC.
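
As a minimal sketch of those two representations, using only the standard C library as exposed through <ctime>: the helper names (toUtcFields, toLocalFields, fromLocalFields) are my own for illustration, not anything in Qt, and the sketch assumes the system functions accept the values passed to them.

```cpp
#include <cassert>
#include <ctime>

// time_t -> struct tm, interpreted as UTC. std::gmtime() returns a pointer to
// shared static storage, so copy the result out straight away.
inline std::tm toUtcFields(std::time_t when)
{
    const std::tm *fields = std::gmtime(&when);
    assert(fields); // null if the system cannot represent this time
    return *fields;
}

// time_t -> struct tm, interpreted in the system's local time zone. On return,
// tm_isdst says whether daylight-saving time was in effect at that moment.
inline std::tm toLocalFields(std::time_t when)
{
    const std::tm *fields = std::localtime(&when);
    assert(fields);
    return *fields;
}

// struct tm (always read as local time) -> time_t. The caller's tm_isdst is
// honoured: positive for daylight-saving time, zero for standard time,
// negative to let the system work it out.
inline std::time_t fromLocalFields(std::tm fields)
{
    return std::mktime(&fields);
}
```

Note the asymmetry: there are standard functions to break a time_t down as either UTC or local time, but std::mktime() only goes back from local time. Fields are also offset: tm_year counts from 1900 and tm_mon from zero, so the start of 1970 has tm_year == 70 and tm_mon == 0.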

Back in olden times, when time_t was a 32-bit signed integer, it could represent any value from −2^31 to 2^31−1. It counts seconds. Now, 2^31 seconds is 68 years, 18 days, 3 hours, 14 minutes and 8 seconds. So the old 32-bit time_t could only represent date-times in the range from 1901-12-13 20:45:52 UTC to 2038-01-19 03:14:07 UTC (in the only sane time format). Thus, back when time_t was a 32-bit type, system APIs couldn't tell us about local time's offset from UTC (much) beyond the end of 2037.
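
Those bounds can be checked without trusting any system function. The following is a sketch, assuming the proleptic Gregorian calendar and a well-known days-to-date conversion; the names UtcDateTime and fromSecondsSinceEpoch are made up for this illustration.

```cpp
#include <cstdint>
#include <limits>

struct UtcDateTime { int year, month, day, hour, minute, second; };

// Convert a count of seconds since 1970-01-01 00:00:00 UTC to calendar fields,
// using only integer arithmetic (so no system API's range limits apply).
constexpr UtcDateTime fromSecondsSinceEpoch(std::int64_t secs)
{
    // Floor division, so that times before 1970 land on the right day.
    std::int64_t days = secs / 86400;
    std::int64_t rem = secs % 86400;
    if (rem < 0) {
        rem += 86400;
        --days;
    }
    // Shift the origin to 0000-03-01, so leap days fall at the end of a year,
    // then split into 400-year eras of exactly 146097 days each.
    const std::int64_t z = days + 719468;
    const std::int64_t era = (z >= 0 ? z : z - 146096) / 146097;
    const std::int64_t doe = z - era * 146097;                                  // day of era
    const std::int64_t yoe = (doe - doe / 1460 + doe / 36524 - doe / 146096) / 365;
    const std::int64_t doy = doe - (365 * yoe + yoe / 4 - yoe / 100);           // from 1 March
    const std::int64_t mp = (5 * doy + 2) / 153;                                // 0 = March
    const int day = static_cast<int>(doy - (153 * mp + 2) / 5 + 1);
    const int month = static_cast<int>(mp < 10 ? mp + 3 : mp - 9);
    const int year = static_cast<int>(yoe + era * 400 + (month <= 2 ? 1 : 0));
    return UtcDateTime{year, month, day,
                       static_cast<int>(rem / 3600),
                       static_cast<int>(rem % 3600 / 60),
                       static_cast<int>(rem % 60)};
}

// The two ends of a 32-bit signed time_t's range:
constexpr UtcDateTime first32 =
    fromSecondsSinceEpoch(std::numeric_limits<std::int32_t>::min()); // 1901-12-13 20:45:52
constexpr UtcDateTime last32 =
    fromSecondsSinceEpoch(std::numeric_limits<std::int32_t>::max()); // 2038-01-19 03:14:07
```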

These days, most computer systems support 64-bit integral types and an obvious solution to the so-called 2038 problem was for them to use such a type as time_t, thereby extending its range. Every major operating system has by now made that change, so the only thing holding Qt back from reaching beyond it was simply that the code assumed it was a limitation and didn't call the system functions for times beyond the end of 2037. Instead, it tried to extrapolate what would happen beyond that date by looking at the time-zone rules in 2037. That's in principle a reasonable thing to do (still), given that no time-zone has yet committed to changing its daylight-saving rules after 2037; indeed, the politicians responsible for such changes seldom give those of us who maintain software even one year's advance notice of such changes, much less fifteen or more. I'll come back later to how that extrapolation didn't quite work robustly.

In any case, the relevant system functions' specifications don't require them to work for times before the start of 1970: indeed, some of them are specified to return a time_t value of −1 as their way of reporting an error, which makes it tricky to represent the last second of 1969. Most Unix implementations of these functions do, in fact, cope with times before 1970, but Microsoft has always held to the 1970 start, so we couldn't have cross-platform support for correct handling of local time's offset from UTC before 1970.
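
To illustrate the −1 ambiguity, here is one commonly used way to tell an error return from a genuine result of −1. It is a sketch, not Qt's actual code, and it relies on the assumption that std::mktime() sets tm_wday on success and leaves the structure alone on failure; the function name localFieldsToTime is hypothetical.

```cpp
#include <cassert>
#include <ctime>

// Convert local-time fields to time_t, reporting failure separately from the
// result, so that a genuine result of -1 (the last second of 1969 in UTC) is
// not mistaken for an error.
inline bool localFieldsToTime(std::tm fields, std::time_t *result)
{
    assert(result);
    fields.tm_isdst = -1; // let the system decide whether DST applies
    fields.tm_wday = -1;  // sentinel: a successful mktime() overwrites this
    const std::time_t when = std::mktime(&fields);
    if (when == std::time_t(-1) && fields.tm_wday == -1)
        return false;     // sentinel untouched, so this -1 reports an error
    *result = when;
    return true;
}
```

On a platform whose mktime() refuses times before 1970, this returns false for such times, which is the cue to fall back on some other source of zone information.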

The problem with extrapolating

As mentioned above, the prior code extrapolated forward beyond 2037; lifting the 1970 boundary was going to require similar extrapolation into the past, at least on Microsoft.

The code that extrapolated forward was just using 2037's data: it took the same time and date within the year and found what offset 2037 was using on that date. Unfortunately, there's a problem with that: a given date's day of the week shifts from one year to the next and many zones' daylight-saving rules put the transitions on Sundays (for example, the last Sundays of March and October). That means that a given date near the transition may be before or after it, depending on what day of the week it falls on.
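
A small sketch makes the shift concrete, taking the "last Sunday of March" rule as the example. The function names are mine and the day-of-week formula assumes Gregorian dates in positive years.

```cpp
// Day of the week (0 = Sunday ... 6 = Saturday) of a Gregorian date, by
// Sakamoto's method. Only valid for positive years.
constexpr int dayOfWeek(int year, int month, int day)
{
    constexpr int monthOffset[] = {0, 3, 2, 5, 0, 3, 5, 1, 4, 6, 2, 4};
    const int y = month < 3 ? year - 1 : year; // count Jan and Feb with the prior year
    return (y + y / 4 - y / 100 + y / 400 + monthOffset[month - 1] + day) % 7;
}

// Day of the month on which the last Sunday of March falls in the given year.
constexpr int lastSundayOfMarch(int year)
{
    return 31 - dayOfWeek(year, 3, 31); // step back from the 31st to a Sunday
}

// True if noon on the given day of March is after that year's transition,
// for a zone that changes early in the morning of the last Sunday of March.
constexpr bool isAfterMarchTransition(int year, int day)
{
    return day >= lastSundayOfMarch(year);
}
```

Under that rule, 28 March is the day before the transition in 2037 but is the transition day itself in 2038, so borrowing 2037's answer for noon on 2038-03-28 gets the offset wrong.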

So using the same date in 2037 was wrong for at least some dates. Likewise, using a date in 1970 wouldn't be a robust way to determine the time-zone's state on the corresponding date in earlier years (even if the rule in effect in 1970 did date back all the way to the year in question).

Fortunately, the day of the week for any given date can be determined entirely from the day of the week of the first day of its year and how many days later it is, which at most depends on whether it's in a leap year. It is easy enough to construct simple tables mapping day of the week to a year that starts with that day of the week, one for leap years and one for ordinary years. We can then use those to select a year in which (assuming the same rule is in effect) the system functions will give the right offset to apply for a date-time in a year outside their supported range.
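
The selection can be sketched as follows. This is an illustration under assumptions, not Qt's implementation: it searches the supported range each time rather than consulting precomputed tables (the tables would simply cache these answers), the range 1970 through 2037 is just a default for the example, and all the names are hypothetical.

```cpp
constexpr bool isLeapYear(int year)
{
    return year % 4 == 0 && (year % 100 != 0 || year % 400 == 0);
}

// Day of the week (0 = Sunday ... 6 = Saturday) of 1 January, Gregorian
// calendar, positive years only.
constexpr int firstWeekday(int year)
{
    const int y = year - 1;
    return (y + y / 4 - y / 100 + y / 400 + 1) % 7;
}

// Pick a year within [firstYear, lastYear] whose calendar matches the given
// year's: same leap status and same day of the week on 1 January, hence the
// same day of the week on every date. Returns 0 if the range has no match.
constexpr int proxyYearFor(int year, int firstYear = 1970, int lastYear = 2037)
{
    if (year >= firstYear && year <= lastYear)
        return year; // already supported, no proxy needed
    const bool leap = isLeapYear(year);
    const int weekday = firstWeekday(year);
    // Prefer the end of the range nearest to the year being asked about, as
    // its rules are the likeliest to match.
    const int step = year > lastYear ? -1 : 1;
    for (int y = step < 0 ? lastYear : firstYear; y >= firstYear && y <= lastYear; y += step) {
        if (isLeapYear(y) == leap && firstWeekday(y) == weekday)
            return y;
    }
    return 0;
}
```

One would then ask the system functions about the same month, day and time in the proxy year, and apply the offset they report to the original date-time.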

The problem with that, in turn, is that the range of leap years needed to find one starting on each day of the week is rather long – at least 24 years between the first and last in the table – which increases the risk that the zone had a change of rules in that interval.

Further complications

In principle, a 64-bit time_t reaches about 292 gigayears, or about 21 times the current age of the universe, to either side of 1970. Since struct tm represents the year with a 32-bit int field, it can't represent the whole of that range. So even with a 64-bit time_t, the relevant system functions can only represent times within a little over 2 gigayears either side of 1900 – the offset that's applied to that year field. (At the start of this range, the oxygen produced by cyanobacteria was causing iron dissolved in Earth's oceans to settle out as iron oxide, almost wiping out life, which was still all single-cell.) That would be plenty wide enough for QDateTime, though – it represents time by a 64-bit count of milliseconds since the start of 1970, so only reaches 292 megayears either side of that moment (a range that starts before the rise of the first dinosaurs), less than a seventh of the range struct tm can represent with its 32-bit year field.
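
For anyone wanting to check that arithmetic, here is a rough sketch. It assumes a mean Gregorian year of 365.2425 days and takes the age of the universe as roughly 13.8 billion years; both figures, and all the constant names, are my additions for the purpose of illustration.

```cpp
#include <cstdint>
#include <limits>

constexpr double secondsPerYear = 365.2425 * 86400; // mean Gregorian year
constexpr double maxInt64 = static_cast<double>(std::numeric_limits<std::int64_t>::max());

// How far a signed 64-bit count of seconds reaches either side of its origin.
constexpr double timeTReachInYears = maxInt64 / secondsPerYear;
// The same for a signed 64-bit count of milliseconds, as QDateTime uses.
constexpr double msecsReachInYears = maxInt64 / 1000 / secondsPerYear;
// How far a signed 32-bit year field reaches either side of its origin.
constexpr double yearFieldReach = static_cast<double>(std::numeric_limits<std::int32_t>::max());
// Assumed approximate age of the universe, in years.
constexpr double ageOfUniverseInYears = 13.8e9;

constexpr double timeTReachInUniverseAges = timeTReachInYears / ageOfUniverseInYears;
constexpr double msecsShareOfYearField = msecsReachInYears / yearFieldReach;
```

The first two come out at a little over 292 gigayears and 292 megayears respectively, and the last ratio at roughly 0.136, just under one seventh.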

So system types can now represent the full range of QDateTime values and we could naïvely hope our problems would be solved. However, as we've already seen, operating systems don't necessarily arrange for relevant system functions to cover the full range of their types. Just as Microsoft cuts off at the start of 1970 (when time_t is zero), Apple cuts off at the start of 1900 (when the year field of struct tm is zero); and Microsoft also has a cut-off in year 3000, beyond which its system functions decline to provide useful results.

So it remains necessary to extrapolate outside the system-supported range. In the case of Apple, this is fairly straightforward, as I happen to know that no zone actually used daylight-saving time until 1908, so we can just use the standard time offset at some point between 1900 and 1908, trusting that it was in force for all prior times. (That isn't strictly accurate, as most places used local solar mean time until the arrival of the railways, which is what prompted the invention of time zones in the first place, but it's good enough for most practical purposes and certainly the best we can hope to do under the circumstances.)

In the case of Microsoft, the situation is trickier. As well as the system functions excluding times before 1970, the system only comes with time-zone information reaching back a few years, so what it'll tell you for the offsets even in the 1970s isn't reliable – it's extrapolated backwards from the rule in effect at some later date, regardless of the historical reality. None the less, when that's the data we have, the best we can do is use it and, likewise, extrapolate backwards from it to all of history before 1970. We also have to do the same for the future beyond 3000, but at least that forward extrapolation is reasonable (as noted above).

One other thing we can do, aside from extrapolating backwards based on what the system functions tell us, is to use QTimeZone. As long as this is able to discover and represent the system time-zone, we can bypass the system time_t / struct tm functions. That saves us the need to extrapolate in most cases. Then again, when QTimeZone is obliged to use Microsoft's system APIs for time-zone information (it prefers the ICU library when available), it's subject to the same limitation on past zone information as the time_t functions. (If Microsoft's data claims a recurrent daylight-saving rule dating back forever, we cut it off at 1900, using its standard time before then, because no zone used DST before 1908.) We do have plans (see QTBUG-68812) to add a new C++20 time-zone backend, based on <tz.h>, that'll give us consistent cross-platform time-zone information – and let me delete a lot of code – but Qt 6 currently only relies on C++17 features.

Expanding the range

So there was a lot of preparation to be done, before the old limitation to the years from 1970 through 2037 could be lifted, but once we understood the problems it was time to set about fixing them. The 2037 boundary was the one we had the most pressing need to lift, since that's closer to the present than 1970 and getting closer with every passing year: that was QTBUG-73225 and, all things considered, it was the easier boundary to fix, since all major platforms already do in fact have time_t functions that work beyond that boundary.

Somewhat more challenging was QTBUG-80421, the lifting of the 1970 restriction on Windows (and, in the process, the previously unnoticed 1900 boundary on Apple). For this I had to extrapolate available time-zone information to before 1970. When the system time-zone information was available via QTimeZone, this was easy enough; but the fall-back from that required extrapolation from the time_t functions. As noted above, this is tricky and involves some risk for leap years. Fortunately, we only have to exercise it when we can't represent the system zone as a QTimeZone. On the bright side, removing the logic to detect whether to take time-zone information into account did make the code simpler, allowing some clean-ups.

Those both made it into 6.2 in good time for its Feature Freeze (after all, they were significant behaviour changes), along with fixes for the first few bugs they exposed. Fortunately, we have quite robust testing of the date-time and time-zone code these days, which found most of the problems nice and quickly (and we now have more testing, for the bugs found later). None the less, we've had a few bugs reported by those who use Qt and I can guess we'll see one or two more. The QTimeZone code is now being exercised a lot more than it was before, due to use as a fall-back, which has been where most of the long-hidden bugs have been lurking. There have also been some bugs with arithmetic overflow, since we do have test-cases close to the boundaries of the range of values QDateTime can represent. Some bugs remain (of course) – QML's Date type has issues on some platforms, getting the wrong time-zone offset for local time, for example – which shall get attention as time permits.

The net upshot of it all is that both QDateTime and QTimeZone are now more robust – and ready for the Epochalypse.
