
Custom function to validate Danish CPR-number

Question asked by Fimano on Apr 17, 2015
Latest reply on Apr 23, 2015 by Fimano

I have distilled my question down to one of 'non-numeric' recursion:

In Denmark we have ID numbers called CPR numbers (Det Centrale Personregister), which have been in use since my birth year, 1968. Back then, bytes were expensive, so the year is stored with only two digits instead of four, and that is the biggest problem.

 

There are validation criteria that are well described:

http://da.wikipedia.org/wiki/CPR-nummer#Kontrol_af_personnummer

 

A smart person, Peter, has built this online validator:

https://regis.ter.dk/

If you enter 0707771119, it confirms that this is a man (odd last digit) born on 7 July 1977.

I trust that this produces correct results.

 

I have written a custom function (CF) myself, though not a recursive one:

Function name: ValidateCPR

Parameter: theNumber

Availability: All accounts


ValidateCPR ( theNumber ) =

// source: http://da.wikipedia.org/wiki/CPR-nummer

 

// parameters:

// theNumber: not really a number, but a text string in the form ddmmyy-cccc

// method: considered as 2nd parameter, but possibly it is more useful to have a return-separated result

 

// result

// 1st line: validation of date part: an error code if the format is not valid, otherwise 1

// - validation of full date will not fail, as the 31st in a 30-day month is the 1st of the next month

// 2nd line: modulus validation: 0 or 1

// 3rd line: the full date, if it passed validation

// 4th line: gender: male or female

 

Let (

// variables

[

  digits = "1234567890" ;

  // to avoid typing many times

  numAsTxt = GetAsText ( theNumber ) ;

  // to get the input as text

  error = Case ( Length ( numAsTxt ) ≠ 11 ; "Forkert længde" ) ;

  // first error type: wrong length

  datePart = Left ( numAsTxt ; 6 ) ;

  controlPart = Right ( numAsTxt ; 4 ) ;

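  error = Case ( not IsEmpty ( error ) ; error ; Filter ( datePart ; digits ) ≠ datePart ; "Ikke-numerisk dato" ) ;

  // keep any earlier error; otherwise flag a non-numeric date part

  error = Case ( not IsEmpty ( error ) ; error ; Filter ( controlPart ; digits ) ≠ controlPart ; "Ikke-numeriske tjekcifre" ) ;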

  // more error checking...

  dd = Middle ( datePart ; 1 ; 2 ) ;

  mm = Middle ( datePart ; 3 ; 2 ) ;

  yy = Middle ( datePart ; 5 ; 2 ) ;

  digit7 = Left ( controlPart ; 1 ) ;

  digit10 = Right ( controlPart ; 1 ) ;

  yyyy = Case (

  digit7 ≥ 0 and digit7 ≤ 3 ;

  19 & yy ;

  digit7 = 4 ;

  Case ( yy ≥ 0 and yy ≤ 36 ; 20 & yy ; 19 & yy ) ;

  digit7 ≥ 5 and digit7 ≤ 8 ;

  Case ( yy ≥ 0 and yy ≤ 57 ; 20 & yy ; 18 & yy ) ;

  digit7 = 9 ;

  Case ( yy ≥ 0 and yy ≤ 36 ; 20 & yy ; 19 & yy ) ;

  // same as 4, but kept here to conform with section 1.1 in the source

  ) ;

  ddmmyyyy = Date ( mm ; dd ; yyyy ) ;

  error = Case (

  not IsEmpty ( error ) ; error ;

  // keep any earlier error instead of overwriting it

  Day ( ddmmyyyy ) ≠ GetAsNumber ( dd )

  or

  Month ( ddmmyyyy ) ≠ GetAsNumber ( mm )

  or

  Right ( Year ( ddmmyyyy ) ; 2 ) ≠ GetAsNumber ( yy )

  ; "Ugyldig dato" ) ;

  // more error checking...

  productSum =

  Middle ( datePart ; 1 ; 1 ) * 4

  +

  Middle ( datePart ; 2 ; 1 ) * 3

  +

  Middle ( datePart ; 3 ; 1 ) * 2

  +

  Middle ( datePart ; 4 ; 1 ) * 7

  +

  Middle ( datePart ; 5 ; 1 ) * 6

  +

  Middle ( datePart ; 6 ; 1 ) * 5

  +

  Middle ( controlPart ; 1 ; 1 ) * 4

  +

  Middle ( controlPart ; 2 ; 1 ) * 3

  +

  Middle ( controlPart ; 3 ; 1 ) * 2

  +

  Middle ( controlPart ; 4 ; 1 ) * 1 ;

  valid = not Mod ( productSum ; 11 ) ;

  gender = Case ( Mod ( digit10 ; 2 ) ; "Mand" ; "Kvinde" ) ;

  result = Case ( not IsEmpty ( error ) ; error ; IsValid ( ddmmyyyy ) & "¶" & valid & "¶" & ddmmyyyy & "¶" & gender )

] ;

// calculation

result

)

 

I was hoping to reformulate the problem:

Input:

A CPR-number, possibly valid

Output:

Validity, Gender, DateOfBirth.

As you can see, I return the result as four lines:

1 - boolean, True if format is valid AND date is valid

2 - boolean, True if modulus test is passed

3 - date, the full date of birth, in FMP format

4 - text, male/female or similar string.

 

The processing could be recursive:

A:

Is the string correct? We have it stored in a number field, even though it contains a dash at position 7. I must take care not to treat the 'string' as a calculation (a subtraction), so GetAsNumber would be dangerous. GetAsText will result in:

- 11 chars, most often, like ddmmyy-cccc

- 10 chars, if the leading zero is omitted, like '070777-1119' becoming '70777-1119'

- 9 chars, if both the leading zero and the dash are omitted: '707771119'.
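To make step A concrete outside FileMaker, here is a minimal Python sketch (not FileMaker code) that brings the three variants listed above back to the ddmmyy-cccc form. The name `normalize_cpr` is hypothetical, and the sketch assumes that at most one leading zero has been lost.

```python
import re

def normalize_cpr(raw):
    """Return the CPR value as 'ddmmyy-cccc', or None if it cannot be repaired."""
    text = str(raw).strip()
    digits = text.replace("-", "")
    # After removing an optional dash, only 9 or 10 plain digits are acceptable.
    if not re.fullmatch(r"[0-9]{9,10}", digits):
        return None
    # If a dash is present, it must sit right before the last four digits.
    if "-" in text and not re.fullmatch(r"[0-9]{5,6}-[0-9]{4}", text):
        return None
    digits = digits.zfill(10)  # restore a dropped leading zero (assumption: at most one)
    return digits[:6] + "-" + digits[6:]

print(normalize_cpr("70777-1119"))  # 070777-1119
print(normalize_cpr(707771119))     # 070777-1119
```

Anything that does not fit one of the three shapes returns `None`, so the later steps only ever see the 11-character form.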

B:

If the string given is (or can be converted to) ddmmyy-cccc format, we are in business...

So we want to go on - could this be a second function call?

If we actually have 6 digits, then a dash, then 4 digits, and no other leading/trailing spaces or other trash:

Determine the full 4-digit year from the algorithm given in the wiki-article,

then check if the day part, month part and year part combine to make a valid date.
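Step B can be sketched in Python as follows (an illustration only, not FileMaker code). The century rules mirror the Case statement in the function above; the names `full_year` and `birth_date` are hypothetical. Unlike FileMaker's Date function, Python's `date` raises an error for an impossible date instead of rolling over into the next month, so no round-trip comparison is needed here.

```python
from datetime import date

def full_year(yy, digit7):
    """Map the two-digit year and the 7th digit to a four-digit year."""
    if digit7 <= 3:
        return 1900 + yy
    if digit7 in (4, 9):
        return 2000 + yy if yy <= 36 else 1900 + yy
    # digit7 is 5, 6, 7 or 8
    return 2000 + yy if yy <= 57 else 1800 + yy

def birth_date(cpr):
    """Return the date of birth for a 'ddmmyy-cccc' string, or None if the date is invalid."""
    dd, mm, yy = int(cpr[0:2]), int(cpr[2:4]), int(cpr[4:6])
    digit7 = int(cpr[7])  # index 6 is the dash
    try:
        return date(full_year(yy, digit7), mm, dd)
    except ValueError:
        return None

print(birth_date("070777-1119"))  # 1977-07-07
print(birth_date("130600-5738"))  # 2000-06-13
```

Because the parts are converted to integers before anything is compared, a two-digit year such as "00" is handled as the number 0.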

C:

Perform the modulus check with the weights given.
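As an illustration of step C, here is a minimal Python sketch of the modulus-11 check, using the same weights as the productSum in the function above (4, 3, 2, 7, 6, 5, 4, 3, 2, 1). The name `modulus_ok` is hypothetical. For 070777-1119 the weighted sum is 165 = 15 × 11, and for 130600-5738 it is 110 = 10 × 11, so both pass.

```python
WEIGHTS = (4, 3, 2, 7, 6, 5, 4, 3, 2, 1)

def modulus_ok(cpr):
    """Return True if the weighted digit sum is divisible by 11."""
    digits = [int(ch) for ch in cpr if ch in "0123456789"]
    if len(digits) != 10:
        return False
    return sum(d * w for d, w in zip(digits, WEIGHTS)) % 11 == 0

print(modulus_ok("070777-1119"))  # True
print(modulus_ok("130600-5738"))  # True
print(modulus_ok("070777-1118"))  # False
```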

D:

If all preceding checks are passed, tell if it is really a man or woman.

 

Now, if my non-recursive function worked all of the time, I'd be fine. But for 130600-5738, it fails! That is a perfectly valid number, which can be confirmed on the regis.ter.dk site mentioned above.

 

So, in essence:

ValidateCPR ( "130600-5738" )

- fails (but should validate properly)

ValidateCPR ( "070777-1119" )

- validates properly.
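For reference, the expected results for both inputs can be reproduced with a compact Python sketch of steps B to D (an illustration under stated assumptions, not a FileMaker fix). The name `validate_cpr` and the tuple layout are hypothetical; the tuple mirrors the four result lines described earlier. The sketch converts every part to an integer before use, so it never compares the two-digit year as text. Whether such a text-versus-number comparison is what trips the FileMaker version for a year ending in "00" is a guess, not something confirmed here.

```python
from datetime import date

WEIGHTS = (4, 3, 2, 7, 6, 5, 4, 3, 2, 1)

def validate_cpr(cpr):
    """Return (date_ok, modulus_ok, birth_date, gender) for a 'ddmmyy-cccc' string."""
    d = cpr.replace("-", "")
    if len(d) != 10 or not (d.isascii() and d.isdigit()):
        return (False, False, None, None)
    dd, mm, yy, digit7 = int(d[0:2]), int(d[2:4]), int(d[4:6]), int(d[6])
    # Century rules as in the Case statement of the custom function.
    if digit7 <= 3:
        century = 1900
    elif digit7 in (4, 9):
        century = 2000 if yy <= 36 else 1900
    else:
        century = 2000 if yy <= 57 else 1800
    try:
        born = date(century + yy, mm, dd)
    except ValueError:
        born = None  # impossible calendar date
    checksum_ok = sum(int(c) * w for c, w in zip(d, WEIGHTS)) % 11 == 0
    gender = "Mand" if int(d[9]) % 2 else "Kvinde"  # odd last digit means male
    return (born is not None, checksum_ok, born, gender)

print(validate_cpr("130600-5738"))  # (True, True, datetime.date(2000, 6, 13), 'Kvinde')
print(validate_cpr("070777-1119"))  # (True, True, datetime.date(1977, 7, 7), 'Mand')
```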

 

If this function could be revised to evaluate properly, recursively or not, I would be a happy developer.

 

Thanks,

Jens
