Debating tradeoffs

A few weeks ago, we fielded a potential request to validate Social Security Numbers.  The Social Security Administration has a few rules for numbers that it deems invalid and will never assign:

  1. The first three digits are 000, 666, or anything from 900 through 999
  2. The fourth & fifth digits are 00
  3. The last four digits are 0000

Being one for code reuse, I wanted to do this in as general a way as possible.  This was my initial solution (in Mumps/Cache):

isValidSSN(ssn)   ;
  s ssn=$tr(ssn,"-","")    ; remove dashes
  q:ssn'?9n 0              ; ssn is not nine numbers
  q:$e(ssn,1,3)="000" 0    ; 000-**-****
  q:$e(ssn,4,5)="00" 0     ; ***-00-****
  q:$e(ssn,6,9)="0000" 0   ; ***-**-0000
  q:$e(ssn,1,3)="666" 0    ; 666-**-****
  q:$e(ssn,1)=9 0          ; starts with 900-* through 999-*
  q 1

Since Mumps/Cache makes it easy to write small functions, I enjoy the pattern of using short functions to encapsulate specific business logic.  However, one of my co-workers raised the idea of using a regular expression instead, since regular expressions are close to universal and would make it easy to copy/paste the rule from our database environment to our web code written in ColdFusion.
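To make the regular-expression option concrete, here is a minimal sketch of what such a pattern might look like, written in Python purely for illustration.  It is not the code we used; the names SSN_PATTERN and is_valid_ssn are made up for this example, and the pattern assumes the target regex engine supports negative lookahead.

```python
import re

# Hypothetical pattern mirroring the checks in isValidSSN above:
#   (?!000|666|9)  area is not 000, 666, or 900-999
#   (?!00)         group is not 00
#   (?!0000)       serial is not 0000
SSN_PATTERN = r"^(?!000|666|9)[0-9]{3}(?!00)[0-9]{2}(?!0000)[0-9]{4}$"


def is_valid_ssn(ssn: str) -> bool:
    """Return True if ssn passes the three never-assigned rules."""
    digits = ssn.replace("-", "")  # remove dashes, like $tr in the original
    return re.fullmatch(SSN_PATTERN, digits) is not None
```

The whole rule fits in one string that could be pasted into another language, but it is noticeably harder to read than the one-check-per-line function.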

This made me stop and think about the full stack.  Sure, anyone in the DB environment could just call my function, but what about the web side, should this logic be needed there too?  Normally, I don’t like copy/paste, especially for something that can be as unwieldy as a regular expression.  However, if my function can’t be called directly, is there a benefit to using something that can simply be copied and pasted into another language?  What if a customer wants additional rules, such as rejecting numbers where all digits are the same?  Either way, there are still two places to update.
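As a rough illustration of what such an extra rule costs in the regular-expression version, the sketch below (again in Python, with hypothetical names BASE, NOT_ALL_SAME, and is_valid_ssn_strict) adds an "all digits the same" check as one more negative lookahead with a backreference.  It assumes the regex engine supports both features.

```python
import re

# Hypothetical base rule: the three never-assigned checks.
BASE = r"(?!000|666|9)[0-9]{3}(?!00)[0-9]{2}(?!0000)[0-9]{4}"

# Hypothetical customer rule: reject nine identical digits.
# ([0-9]) captures the first digit; \1{8} requires eight repeats of it.
NOT_ALL_SAME = r"(?!([0-9])\1{8}$)"

STRICT_SSN_PATTERN = "^" + NOT_ALL_SAME + BASE + "$"


def is_valid_ssn_strict(ssn: str) -> bool:
    """Return True if ssn passes the base rules and is not one repeated digit."""
    digits = ssn.replace("-", "")  # remove dashes before matching
    return re.fullmatch(STRICT_SSN_PATTERN, digits) is not None
```

The change is small, but the same edit would still have to be made by hand in every place the pattern was pasted, plus the database function if it is kept.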

I guess that in my ideal world, one could have a rule engine that was flexible and accessible from the entire application, or at the very least, one could call between languages so that there is a single authoritative copy.  In this situation, I don’t think that relying on copy/paste would be too bad a solution, provided all the locations were adequately commented and cross-referenced.  However, this still leaves you open to manual errors, updates getting out of sync, and duplicate test code (if you even have test coverage in both places).

In the end, the client decided to suppress SSNs altogether, but I’m glad I had the chance to stop and think about this kind of thing.

