Monday, August 1, 2011

Hey Unix, How About a Date?

The Problem.

I needed a way to warn users that their passwords were about to expire. To give them a sporting chance, I wanted to give them two weeks' warning.

I could get the current date with no problem. Getting the password expiration date was a bit of a pain, but I could figure it out. The hard part was figuring out how close to the two week warning date they were.

In a nutshell, doing math with dates is a royal pain. You have abstract concepts like “July 4th” permanently glued to a large spinning rock which is whizzing around the Sun.

Also, certain concepts that we're used to, such as 1 + 1 = 2, don't always hold up when doing date math. January 31st + 2 months = March 31st. No surprise. How about January 31st + 1 month? Would that be February 28th? Or would it be March 3rd (31 days after January 31st)? Take February 28th, add another month and you get March 28th. March 31st doesn't equal March 28th. Oh boy!
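Here's a quick sketch of that weirdness in action. It assumes GNU date (the -d option is a GNU extension), and add_months is a made-up helper name for illustration only. GNU date answers the "January 31st + 1 month" question by rolling the nonexistent February 31st over into March.

```shell
#!/bin/sh
# Assumes GNU date; -d is not available in most older Unix date commands.
# add_months is a hypothetical helper, not part of the script in this post.
add_months()
{
  # $1 = YYYY-MM-DD, $2 = number of months to add
  date -u -d "$1 +$2 month" '+%m/%d/%Y'
}

add_months 2011-01-31 2   # 03/31/2011
add_months 2011-01-31 1   # 03/03/2011 (February 31st rolls over into March)
add_months 2011-02-28 1   # 03/28/2011
```

Other tools may well pick February 28th instead, which is exactly why "add a month" is a risky thing to build a warning on.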

The Simple (Linux) Solution.

The most common way to deal with dates is to convert them to a number, do some math, and then convert the number back to a date. Unix (and Linux) system time is based on this concept. Unix dates are implemented as the number of seconds since midnight UTC on January 1st, 1970. In Unix parlance 1/1/1970 is called “the epoch”.

If you have a version of Unix that uses the GNU version of the date command (almost all versions of Linux do), then date math becomes trivial. The date command can convert to and from the epoch with relative ease.

To convert “August 1st 2011” to seconds from the epoch use:

date -u -d "8/1/2011" +%s

You should get 1312156800.

To convert it back use:

date -u -d "1970-01-01 1312156800 seconds" '+%m/%d/%Y'

You should get “08/01/2011”.
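With those two conversions, the two week warning turns into plain subtraction. Below is a minimal sketch, assuming GNU date. The days_until helper and the two example dates are mine, made up for illustration; a real script would pull the expiration date from wherever your system keeps it.

```shell
#!/bin/sh
# Assumes GNU date. days_until is a hypothetical helper; dates are m/d/yyyy.
days_until()
{
  du_from=`date -u -d "$1" +%s`
  du_to=`date -u -d "$2" +%s`
  # 86400 = 24 * 60 * 60 seconds per day.
  echo "$du_from $du_to" | awk '{ print int(($2 - $1) / 86400) }'
}

today="8/1/2011"      # example value; a real script would use the current date
expires="8/10/2011"   # example value; would come from the password database
left=`days_until "$today" "$expires"`
if [ "$left" -le 14 ]; then
  echo "Your password expires in $left days."
fi
```

With the example values this warns that the password expires in 9 days.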

The problem here is twofold:

One, you're limited to dates between 1970 and 2038 on 32 bit computers. If you have a 64 bit computer then you're good until somewhere around the year 292,277,026,596 so it's not really a restriction.

The second problem is that most older OSes aren't running GNU date. They have their own proprietary versions of date that won't let you work with arbitrary dates.
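To see where the 2038 limit from the first problem comes from, you can feed GNU date the largest value a signed 32 bit counter can hold. This is just a sketch: it assumes GNU date and its "@seconds" input form, and epoch_to_date is a name I made up.

```shell
#!/bin/sh
# Assumes GNU date; the "@seconds" form is a GNU extension.
# epoch_to_date is a hypothetical helper, not part of the script in this post.
epoch_to_date()
{
  date -u -d "@$1" '+%m/%d/%Y %H:%M:%S'
}

epoch_to_date 2147483647   # 01/19/2038 03:14:07, the last second that fits
epoch_to_date 1312156800   # 08/01/2011 00:00:00
```

2147483647 is 2^31 - 1. One second later a signed 32 bit counter wraps around.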

What I needed was a date converter that would work on many versions of *old* Unix. Things like Solaris 5 and HP-UX 10. These are nasty little beasts that barely have Bourne shell. I also wasn't allowed to add more advanced scripting languages to the system so Perl and Python solutions were both out.

I poked around on the Internet tubes and found a dearth of solutions. Most of them used other languages. Some gave example code that didn't handle leap years properly. Others were built around precomputed tables.

Time to step up to the plate.

My Solution.

Below is my solution. Its date format is the number of days after January 1st, 1582. That's the year the Gregorian calendar was introduced, and very few of my users are that old.

Internally it's mostly AWK scripts glued together by Bourne shell. It can handle dates up to 1/1/9794 and can probably go higher. I've tested it and think it's pretty bulletproof.

#!/bin/sh -

# Convert a date to/from the number of days after 1/1/1582 using only
# basic Unix commands. By "basic" I mean commands available on an
# HP-UX 10 box.
#
# 1582 is the year the Gregorian calendar was introduced.

# Note: The Gregorian rules for leap years are:
#
# If the year is a multiple of 400
#   It's a leap year.
# Else if the year is a multiple of 100
#   It's not a leap year.
# Else if the year is a multiple of 4
#   It's a leap year.
# Else
#   It's not a leap year.

# This mostly uses awk because awk is much faster than using raw
# Bourne shell.

# To get a day number from the Unix "seconds from the epoch" time use
# int($utime / (24 * 60 * 60)) + date_as_days(1970 1 1)
#
# date_as_days(1970 1 1) = 141714 by the way.

#
# Return the number of days in the previous months.
#
# For example the second entry is the number of days in January. The
# third entry is the combined number of days in January and
# February.
#
# The only parameter is the 4 digit year.
#
days_prev_month()
{
  echo $1 | awk '{
    year = $1

    # Pick the table of days depending on whether this is a leap year.
    if ((year % 4 == 0) && (year % 100 != 0 || year % 400 == 0)) {
      print "0 31 60 91 121 152 182 213 244 274 305 335 366"
    } else {
      print "0 31 59 90 120 151 181 212 243 273 304 334 365"
    }
  }'
}

#
# Convert a date into the number of days after 1/1/1582.
#
# The parameters are YYYY MM DD.
#
date_as_days()
{
  dad_month=$2; dad_day=$3

  # Get the number of days consumed by the years and the number of
  # days remaining in the current year.
  set - `echo $1 | awk '{
    year = $1

    # The modern calendar started in 1582.
    year_days = int((year - 1581) * 365.25) - 365

    cents = int((year - 1501) / 100)
    year_days -= cents

    cents_400 = int((cents + 3) / 4)
    year_days += cents_400

    print year, year_days
  }'`
  dad_year=$1; dad_year_days=$2

  # Now add the month and day contributions.
  days_prev_month $dad_year | awk "
    BEGIN { day=$dad_day; month=$dad_month; year_days=$dad_year_days }"'
    { whole_month_days=$month
      print year_days + whole_month_days + day - 1 }'
}

#
# Take the number of days since 1/1/1582 and convert it to
# year, month, day
#
days_as_date()
{
  df_days=$1

  # This awk script computes the year contributions to the date
  # and removes the effects of those years from df_days.
  set - `
  echo $df_days | awk '{
    df_days = $1;

    # The first 400 year leap year in the Gregorian calendar is
    # 1600 so we count from the start of the 400 year block that
    # ends with 1600. That block starts in 1201.
    # There are 139157 days between 1/1/1201 and 1/1/1582
    #
    # Note: We use 1201, not 1200 because we want the leap year to be the
    # *last* year of the 400, 100 or 4 year block.
    n_days = df_days + 139157

    # There is one leap year every 4 years.
    days_per_quad_year = (365 * 4) + 1
    # Years that end in 00 are not leap years.
    days_per_cent = (days_per_quad_year * 25) - 1
    # Unless the year is evenly divisible by 400.
    days_per_quad_cent = (days_per_cent * 4) + 1

    # Calculate the contributions of each year block.
    quad_cents = int(n_days / days_per_quad_cent)
    n_days -= quad_cents * days_per_quad_cent

    cents = int(n_days / days_per_cent)
    if (cents == 4) { cents = 3 }
    n_days -= cents * days_per_cent

    quad_years = int(n_days / days_per_quad_year)
    n_days -= quad_years * days_per_quad_year

    years = int(n_days / 365)
    if (years == 4) { years = 3 }
    n_days -= years * 365

    df_year = 1201 + (400 * quad_cents) + (100 * cents) \
      + (4 * quad_years) + years

    print n_days, df_year
  }'`
  df_n_days=$1; df_year=$2

  # Get the day and month from the given year.
  set - `days_prev_month $df_year | awk "
    BEGIN{n_days=$df_n_days}"'
    { df_month = 1
      while (n_days >= $df_month) {
        df_whole_month_days = $df_month
        df_month++
      }
      df_day = 1 + n_days - df_whole_month_days
      print (df_month - 1), df_day }'`

  df_month=$1; df_day=$2

  echo $df_year $df_month $df_day
}

# Some test code. Feed it a number and get back a date.
# Feed it a m/d/yyyy date and get back a number.
#
# Note the complete lack of error checks.
if echo "$1" | egrep '/' >/dev/null 2>&1; then
  set - `echo $1 | tr '/' ' '`
  date_as_days $3 $1 $2
else
  days_as_date $1 | awk '{ printf "%02d/%02d/%02d\n", $2, $3, $1 }'
fi
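
If you want to double check the magic numbers in the script, one option is a slow but obvious brute force count. The sketch below is not part of my solution: brute_days is a made-up helper that walks forward one year at a time from January 1st of a starting year, using the same leap year rule as the script.

```shell
#!/bin/sh
# brute_days is a hypothetical cross-check, not part of the script above.
# It counts the days from 1/1/START_YEAR to the given date the slow way.
# Usage: brute_days START_YEAR YYYY MM DD
brute_days()
{
  echo "$1 $2 $3 $4" | awk '{
    start = $1 + 0; year = $2 + 0; month = $3 + 0; day = $4 + 0
    split("31 28 31 30 31 30 31 31 30 31 30 31", mdays, " ")
    days = 0

    # Whole years before the target year.
    for (y = start; y < year; y++) {
      days += 365
      if ((y % 4 == 0) && (y % 100 != 0 || y % 400 == 0)) days++
    }

    # Whole months before the target month.
    leap = ((year % 4 == 0) && (year % 100 != 0 || year % 400 == 0))
    for (m = 1; m < month; m++) {
      days += mdays[m]
      if (m == 2 && leap) days++
    }

    print days + day - 1
  }'
}

brute_days 1582 1970 1 1   # 141714, the constant quoted in the script header
brute_days 1201 1582 1 1   # 139157, the offset used in days_as_date
brute_days 1582 2011 8 1   # 156901
```

The last number should match what the script prints when you feed it 8/1/2011. Subtract two of these day numbers and you have the days left before a password expires.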

2 comments:

Anonymous said...

Very impressive. My, but you're smart, then again, I knew that. - Adam (yes, that Adam... how's life treating you?)

Dale Wiles said...

Hey Adam!

I'm surprised that I couldn't find this code out there. It seems to be useful so I hath proclaimed it.