Data Quality and Enhancement
As well as providing high quality business data, we also help businesses looking for advanced data intelligence and analytical insight.
The steps below give an overview of a typical process for cleansing, matching, and enhancing data.
Data cleansing can be carried out as a stand-alone exercise, or as part of a larger project to achieve a higher match rate.
Depending on the project, we’d ensure that:
- Postcodes are standardised to UK format
- Telephone fields are corrected, for example 0181 changed to 0208, missing leading “0”s added and non-digit text removed
- Common abbreviations are all expanded consistently
- Trading styles are separated from other business names
- Addresses are forward filled, for example where address line 1 is empty but line 2 is populated
- Street descriptors can be corrected to the full word, e.g. RD to Road, ST to Street
- Non-name and address information is removed and stored separately where possible
- Official address elements such as Town or County are used (where applicable), as well as elements not required for postal purposes, such as localities or regions
- Appropriate capitalisation of town and postcode fields
- Non-UK records are consistently identified
- Poor quality records that might not justify matching and/or inclusion in campaigns are flagged
- Where supplied, contact names can be cleansed, returning a correct salutation, envelope, title and gender (where possible)
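A few of the cleansing rules above can be sketched in code. The following Python is an illustrative sketch only, not Experian's actual implementation: the function names and the short `STREET_DESCRIPTORS` lookup are hypothetical, and it assumes UK national-format telephone numbers and postcodes that are already roughly well formed.

```python
import re

# Hypothetical, deliberately short lookup; a real project would use a fuller list.
STREET_DESCRIPTORS = {"RD": "Road", "ST": "Street", "AVE": "Avenue", "LN": "Lane"}


def standardise_postcode(raw):
    """Upper-case, drop punctuation/spaces, and put one space before the 3-character inward code."""
    compact = re.sub(r"[^A-Za-z0-9]", "", raw).upper()
    if not 5 <= len(compact) <= 7:
        return None  # implausible length for a UK postcode: leave for manual review
    return compact[:-3] + " " + compact[-3:]


def clean_phone(raw):
    """Remove non-digit text, add a missing leading 0, and map the old 0181 code to 0208."""
    digits = re.sub(r"\D", "", raw)
    if digits and not digits.startswith("0"):
        digits = "0" + digits  # assumes a UK national number, not an international one
    if digits.startswith("0181"):
        digits = "0208" + digits[4:]
    return digits


def expand_street_descriptor(line):
    """Expand an abbreviated descriptor only when it is the last word of the line."""
    words = line.split()
    if words:
        key = words[-1].rstrip(".").upper()
        if key in STREET_DESCRIPTORS:
            words[-1] = STREET_DESCRIPTORS[key]
    return " ".join(words)


def forward_fill(lines):
    """Move populated address lines up so that any empty lines come last."""
    filled = [line.strip() for line in lines if line and line.strip()]
    return filled + [""] * (len(lines) - len(filled))
```

Only the final word of an address line is expanded, so that a leading "St" (as in "St Johns Rd") is not mistaken for "Street".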
Manual checks on the data are carried out to ensure the best quality data cleanse.
Experian’s bespoke matching process ensures the best match rate and accuracy, giving you a stronger foundation for all additional data projects.
During data matching, as many records as possible from your initial customer database are matched against the pH Megafile (the Experian business database). To maximise match rates, the team employ a number of different types of match, independently or in combination:
- Company name & address match
- Contact match
- Match to self (where we can’t match records to the Megafile, we match them to each other to identify further links and to establish unique site, company and contact level identifiers across the entire base)
- Email match
- Consumer match (linking the business contact to Experian’s Consumer database)
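As an illustration of the "match to self" idea in the list above, the sketch below groups records that share a normalised company name and postcode and gives each group a common site identifier. It is a simplified assumption of how such a step might look, using exact matching on a derived key; the source does not describe the actual matching logic, and `match_key`, `match_to_self` and `LEGAL_SUFFIXES` are hypothetical names.

```python
import re

# Hypothetical list of legal suffixes to ignore when comparing company names.
LEGAL_SUFFIXES = {"LTD", "LIMITED", "PLC", "LLP"}


def match_key(name, postcode):
    """Build a comparison key from a company name and postcode."""
    words = re.sub(r"[^A-Za-z0-9 ]", "", name).upper().split()
    core = [w for w in words if w not in LEGAL_SUFFIXES]
    return (" ".join(core), re.sub(r"\s+", "", postcode).upper())


def match_to_self(records):
    """Return one site identifier per record; records with the same key share an identifier."""
    ids = {}
    result = []
    for rec in records:
        key = match_key(rec["name"], rec["postcode"])
        if key not in ids:
            ids[key] = len(ids) + 1  # next unused identifier
        result.append(ids[key])
    return result


records = [
    {"name": "Acme Ltd.", "postcode": "ng1 1aa"},
    {"name": "ACME LIMITED", "postcode": "NG1 1AA"},
    {"name": "Bolt plc", "postcode": "LS1 4AP"},
]
site_ids = match_to_self(records)  # the two Acme records share an identifier
```

A production process would normally also allow for near matches such as misspellings and abbreviations, which exact-key grouping does not catch.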
Once the client bases have been matched and de-duplicated, a link to the Megafile and its data sources will have been established.
A number of data attributes can be appended, including:
Suppressions – identifying the following:
- Companies that are no longer trading
- Sites that are no longer trading
- Emails that are no longer valid
Company level attributes:
- SIC code – either 2003 or 2007
- Total number of employees
- Number of employees based in a certain region
- Company Registered Number
- Director linkages
Other attributes include further company size measures such as turnover, asset size, age of business, number of locations, growth profile, directory classifications, corporate group identification and linkages, site level employment, Mosaic type, financial strength indicators, and a range of performance measures (productivity etc.).
Experian’s Global Information Security Policy is based on the ISO27001 standard.