Identifying leap years
One of my colleagues was recently looking at some date-based calculations that needed to know whether a year is a leap year or not, on the hot path, and he found an 8% speed improvement by reworking the leap year predicate a bit. The JVM compiles this down to a few fast integer operations, avoiding slow things like division entirely. I thought it’d be interesting to dig into how to calculate the leap year predicate without division.
Note that your compiler may already do these optimisation steps for you, so I don’t recommend obfuscating your leap-year predicate implementation like this without some careful contemplation of its worth. This is really just an exercise in understanding what a sufficiently clever compiler might do here.
I looked at what GCC does with the naive calculation when compiling to AVR
assembler (as used by Arduino microcontrollers) and it came up with something
much slower than this even with -O2
. So if you need a high-performance leap
year calculation on an Arduino for some reason then maybe this’ll be what you
need.
Version 0
A year is a leap year (in the Gregorian calendar) if it is divisible by 4 and is not divisible by 100 unless it is also divisible by 400. So 1996 is a leap year (it’s divisible by 4 and not by 100) as is 2000 (it’s divisible by 4 and 400) but 1900 is not (it’s divisible by 4 and 100 but not 400).
In symbols, where a dvd b
means that a
divides exactly into b
:
isLeapYear y == 4 dvd y ∧ (¬ 100 dvd y ∨ 400 dvd y)
Version 1
It’s pretty easy for a computer to tell whether an integer is divisible by a
power of two since this only needs bitwise operations. For instance y
is
divisible by 4 iff its last two bits are zero, i.e. y & 3 == 0
. Checking for
divisibility by 100 and 400 isn’t so easy however.
It’s hopefully clear that y
is a multiple of 100 iff it is a multiple of 4
and of 25, since 4 and 25 are coprime and their product is 100. Similarly y
is a multiple of 400 iff it is a multiple of 16 and of 25. This helps because
rather than needing to calculate divisibility by 4, 100 and 400 it reduces the
problem to needing to calculate divisibility by 4, 16 and 25, only one of which
isn’t a power of two.
Naively substituting into the definition above gives a rather complicated formula which pleasingly simplifies to either of these:
isLeapYear y == 16 dvd y ∨ (4 dvd y ∧ ¬ 25 dvd y)
== 4 dvd y ∧ (16 dvd y ∨ ¬ 25 dvd y)
It’s not clear which one of these is better. I think the first one can be
implemented with fewer instructions since 16 dvd y
is implemented with a
bitmask as y & 15 == 0
and if that test fails then you can compute
divisibility by 4 using the intermediate value, i.e. (y & 15) & 3 == 0
saving
a second load of y
. On the other hand this involves more checks, on average,
since most numbers aren’t divisible by 16.
Version 2
Computers typically use fixed-width (i.e. bounded) integers for calculations.
If an operation exceeds the bound then the most-significant bits are discarded,
so integer calculations are effectively performed modulo 2ⁿ
where n
is the
width of the integer type. We’ll only use unsigned integers here, and in this
case the width n
is often 64, 32 or 16 bits.
Since 25 is odd it is coprime with 2ⁿ
. This means it has a multiplicative
inverse, i.e. there is a number a
such that a * 25 = 1 (mod 2ⁿ)
. You can
find this using Euclid’s algorithm. For instance, if you’re working with 16-bit
integers:
2^16 = 65536 = 2621 * 25 + 11
25 = 2 * 11 + 3
11 = 3 * 3 + 2
3 = 1 * 2 + 1
2 = 2 * 1 + 0 => gcd(65536,25) = 1
1 = 3 - 1 * 2
= 3 - 1 * (11 - 3 * 3)
= -1 * 11 + 4 * 3
= -1 * 11 + 4 * (25 - 2 * 11)
= 4 * 25 - 9 * 11
= 4 * 25 - 9 * (65536 - 2621 * 25)
= -9 * 65536 + 23593 * 25
Therefore 23593 * 25 = 1 (mod 2^16)
.
Now consider the effects of multiplying by a
mod 2ⁿ
on the multiples of 25:
a * 0 = 0 (mod 2ⁿ)
a * 25 = 1 (mod 2ⁿ)
a * 50 = 2 (mod 2ⁿ)
a * 75 = 3 (mod 2ⁿ)
...
Since our integers are bounded we can only have finitely many multiples of 25.
Define b = 2ⁿ div 25
, then 25b
is the largest multiple of 25 less than 2ⁿ
and therefore there are b+1
multiples of 25 in total, and multiplying them by
a
(mod 2ⁿ
) maps them onto the numbers [0..b]
.
However a
itself has a multiplicative inverse, namely 25, and therefore
multiplying by a
is an injective operation, which means that no other numbers
can map onto [0..b]
. This means we can use the result after multiplying by
a
as a cute way to test for divisibility by 25:
y dvd 25 ⟷ (y * a) mod 2ⁿ ≤ b
Note that a
and b
depend only on the width of our integers n
and not on
the input y
so you (or the compiler) can compute them in advance. This means
that for fixed-width integers:
isLeapYear y == 16 dvd y ∨ (4 dvd y ∧ b < (y * a) mod 2ⁿ)
Note that now this calculation only uses bitwise operations and a single multiplication so is pretty cheap.
Prove it!
The divisibility check mentioned above is quite a general thing that really only needs the divisor to be coprime to the modulus. However there’s a few corner cases that I didn’t cover in the sketch above. Here’s an Isabelle/HOL formalisation of its full correctness proof:
theorem dvd_mod_mult_equiv:
fixes m d :: nat
fixes a assumes a_prop: "(a * d) mod m = 1"
fixes b assumes b_prop: "b = m div d"
fixes n assumes n: "n < m"
shows "(n * a) mod m ≤ b ⟷ d dvd n"
proof -
consider (m0) "m = 0"
| (bdm) "m ≠ 0" "b * d = m"
| (general) "m ≠ 0" "b * d ≠ m"
by linarith
thus ?thesis
using n a_prop
proof cases
case bdm
from a_prop have "(a * d mod m * b mod m) mod m = b mod m" by simp
with bdm have "b mod m = 0" by auto
with bdm show ?thesis by auto
next
case general
from a_prop have d_pos: "0 < d" using neq0_conv by fastforce
have "{ n. n < m ∧ d dvd n } = { n. n < m ∧ (n * a) mod m ≤ b }"
proof (intro card_seteq)
show "{ n. n < m ∧ d dvd n } ⊆ { n. n < m ∧ (n * a) mod m ≤ b }"
proof (intro subsetI CollectI conjI, simp, elim CollectE conjE)
fix n assume n: "n < m" and "d dvd n"
then obtain k where k: "n = d * k" unfolding dvd_def by blast
from k have "k ≤ n" using d_pos by auto
with n have km: "k < m" by simp
have "(n * a) mod m = (a * d * k) mod m" by (metis k mult.assoc mult.commute)
also have "… = ((a * d) mod m * k) mod m" by (simp add: mod_mult_left_eq)
also have "… = k" using km a_prop by simp
also from k n have "d * k ≤ m" by simp
hence "(d * k) div d ≤ m div d" using div_le_mono by blast
hence kb: "k ≤ b" using d_pos b_prop by simp
finally show "(n * a) mod m ≤ b" .
qed
have "card {n. n < m ∧ n * a mod m ≤ b} ≤ card {0..b}"
proof (intro card_inj_on_le inj_onI, elim CollectE conjE)
fix n1 n2
assume "(n1 * a) mod m = (n2 * a) mod m"
hence "(d * (a * n1)) mod m = (d * (a * n2)) mod m" by (metis mod_mult_eq mult.commute)
hence "((a * d) mod m * n1) mod m = ((a * d) mod m * n2) mod m"
by (simp add: mod_mult_left_eq mult.assoc mult.left_commute)
with a_prop have "n1 mod m = n2 mod m" by simp
moreover assume "n1 < m" "n2 < m"
ultimately show "n1 = n2" by auto
qed auto
also have "card {0..b} = card ((*) d ` {0..b})"
apply (intro sym [OF card_image] inj_onI) using d_pos by auto
also have "((*) d) ` {0..b} ⊆ {n. n < m ∧ d dvd n}"
proof (intro subsetI CollectI conjI, elim imageE, simp)
fix n assume "n ≤ b" hence "d * n ≤ b * d" by simp
also have le: "b * d ≤ m" unfolding b_prop using div_times_less_eq_dividend by blast
with general have "b * d < m" by simp
finally show "d * n < m" using a_prop general by auto
qed auto
hence "card ((*) d ` {0..b}) ≤ card {n. n < m ∧ d dvd n}" by (intro card_mono, simp_all)
finally show "card {n. n < m ∧ n * a mod m ≤ b} ≤ card {n. n < m ∧ d dvd n}" .
qed simp
with n show ?thesis by blast
qed auto
qed
Examples
Here are the relevant constants for the common bit widths, in hexadecimal:
Width (bits) a b
16 23593 or 0x5c29 0x0a3d
32 0xc28f5c29 0x0a3d70a3
64 0x8f5c28f5c28f5c29 0x0a3d70a3d70a3d70
Signed integers
The story is similar with signed integers, except that now half of the
multiples of 25 are negative, so the largest multiple of 25 is now 25b
where
b = 2ⁿ⁻¹ div 25
. This means there are 2b+1
multiples of 25 in total, and
multiplication by a
maps them injectively onto [-b..b]
, so the divisibility
predicate becomes:
y dvd 25 ⟷ |(y * a) mod 2ⁿ| ≤ b
That said, the Gregorian calendar has only existed for a few hundred years anyway so leap years in BC don’t really work this way. Maybe signed integers aren’t the right choice for this application.