Summary

Transformation rules apply to data elements and fall into two broad categories:

  • Presentation - refers to how data is presented to, or obtained from, a user. Presentation may occur through a user interface or on a static report.
  • Storage - refers to how data is physically stored and retrieved.

Applicability

An element of data may need to be transformed in specific circumstances:

  • A user may enter data in one manner, while that data must be stored or presented in another specific manner.
  • Examples include:
    • A phone number may be entered in a specific fashion, such as area code and phone number for a North American phone number, or with the country code for an international phone number. Transformation rules would be required to format the number both for display to the user and for storage in a physical database. If that data element is later retrieved from storage, transformation rules may be applied again before the data is presented to the user.
    • A Canadian Social Insurance Number, which may be stored as a sequence of digits but displayed with a dash between the third and fourth digits and a dash between the sixth and seventh digits.
    • A US Social Security Number, which may be stored as a sequence of digits but displayed with a dash between the third and fourth digits and a dash between the fifth and sixth digits.
    • In a text field that accepts alphabetic characters, those characters may need to be transformed to uppercase for both presentation and storage.
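
The entry, storage, and display transformations described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (to_storage, to_display, to_uppercase) and the group-size constants are hypothetical, and the sketch accepts only spaces and dashes as separators. It does not validate check digits or any other business rule.

```python
import re

# Hypothetical display groupings: the size of each digit group, joined by dashes.
SIN_GROUPS = (3, 3, 3)  # Canadian Social Insurance Number, e.g. 123-456-789
SSN_GROUPS = (3, 2, 4)  # US Social Security Number, e.g. 123-45-6789


def to_storage(entered: str, length: int = 9) -> str:
    """Storage rule: strip spaces and dashes, keep only the bare digits."""
    digits = re.sub(r"[\s-]", "", entered)
    if not re.fullmatch(rf"[0-9]{{{length}}}", digits):
        raise ValueError(f"expected {length} digits, got {entered!r}")
    return digits


def to_display(stored: str, groups: tuple) -> str:
    """Presentation rule: insert a dash between each group of digits."""
    if len(stored) != sum(groups):
        raise ValueError(f"expected {sum(groups)} digits, got {stored!r}")
    parts, start = [], 0
    for size in groups:
        parts.append(stored[start:start + size])
        start += size
    return "-".join(parts)


def to_uppercase(entered: str) -> str:
    """Rule applied for both presentation and storage of alphabetic text."""
    return entered.upper()
```

For example, to_storage("123 456 789") yields "123456789", which to_display(..., SIN_GROUPS) renders as "123-456-789" and to_display(..., SSN_GROUPS) renders as "123-45-6789". Keeping the storage rule and the presentation rule in separate functions mirrors the two categories in the Summary.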

Common Examples

The following are common examples that I have encountered in the course of my career:

  • Canadian Social Insurance Numbers
    • When there is provision for data entry, the Canadian Social Insurance Number may be entered as 9 digits, or 9 digits with a dash or a space between the third and fourth digits and the sixth and seventh digits.
    • The data may be stored as 9 digits with no spaces and no dashes.
    • When retrieved from storage, it is often displayed as 9 digits, with a dash between the third and fourth digits and between the sixth and seventh digits.
  • US Social Security Numbers
    • When there is provision for data entry, the US Social Security Number may be entered as 9 digits, or 9 digits with a dash or a space between the third and fourth digits and the fifth and sixth digits.
    • The data may be stored as 9 digits with no spaces and no dashes.
    • When retrieved from storage, it is often displayed as 9 digits, with a dash between the third and fourth digits and between the fifth and sixth digits.
  • Addresses
    • One of the best guides that I have found to address formats is FRANK'S COMPULSIVE GUIDE TO POSTAL ADDRESSES.
    • Address formats differ significantly from country to country.
      • In Canada and the United States, the general format is:
        • Person or Company name on line 1,
        • The Unit Number, Street Number, Street Name on line 2. Note that the unit number is separated from the street number by a dash.
        • The City, Province or State (Abbreviated), and the Postal or Zip Code on line 3.
        • The Country on line 4.
      • In Germany, as an example, the general format is:
        • Person or Company name on line 1,
        • The Unit Number (if required) on line 2, and the Street Name and Street Number on line 3.
        • The Postal Code and City name on the line following the Street Name and Street Number
        • The State, if required, on the line following the Postal Code and City Name. For larger cities in Germany, State Names are not required.
        • The Country following the previous line.
    • Efficiently handling addresses
      • Within your own country, you may wish to format addresses as required for your country.
      • Where feasible, for other countries, you may wish to allow for separate formatting of structured and label addresses.
        • Whether this is feasible will depend largely on what the system will support.
          • If the system will support separate entries for a structured and label address, then this should be documented.
          • If the system will not support separate entries for a structured and label address, then a judgement call is required: decide which of the two is more important, given what the system supports.
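
The country-specific layouts and the structured-versus-label distinction above can be illustrated with a short Python sketch. Everything named here is an assumption made for illustration: the StructuredAddress fields, the country codes used as keys, and the choice of single spaces between city, province or state, and postal code. A real system would follow the postal authority's published rules.

```python
from dataclasses import dataclass


@dataclass
class StructuredAddress:
    """Hypothetical structured address; field names are illustrative only."""
    name: str
    street_number: str
    street_name: str
    city: str
    postal_code: str
    country: str
    unit: str = ""
    region: str = ""  # province or state, abbreviated for Canada and the US


def north_american_label(a: StructuredAddress) -> list:
    """Canada/US layout: name, unit-street line, city line, country."""
    street = f"{a.street_number} {a.street_name}"
    if a.unit:
        street = f"{a.unit}-{street}"  # unit separated from street number by a dash
    # Assumption: single spaces between city, region, and postal code.
    return [a.name, street, f"{a.city} {a.region} {a.postal_code}", a.country]


def german_label(a: StructuredAddress) -> list:
    """German layout: unit on its own line, postal code before the city."""
    lines = [a.name]
    if a.unit:
        lines.append(a.unit)
    # German convention places the house number after the street name.
    lines.append(f"{a.street_name} {a.street_number}")
    lines.append(f"{a.postal_code} {a.city}")
    if a.region:
        lines.append(a.region)
    lines.append(a.country)
    return lines


# Hypothetical registry of per-country formatters.
FORMATTERS = {"CA": north_american_label, "US": north_american_label, "DE": german_label}


def label_lines(a: StructuredAddress, country_code: str, label_text: str = "") -> list:
    """Prefer a separately entered label address; otherwise format the structured one."""
    if label_text:
        return label_text.splitlines()
    formatter = FORMATTERS.get(country_code)
    if formatter is None:
        raise ValueError(f"no formatter for {country_code!r}; a label address is required")
    return formatter(a)
```

The label_text parameter models a system that supports a separate label entry alongside the structured one. Where the system supports only one of the two, the fallback branch is where the judgement call described above would be made.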