Operation Clean Data

Buy-In from Owners of the Data

Like Carlson, Cendant - owner of car rental company Avis and realtor Century 21 as well as hotel chains Days Inn, Howard Johnson, Ramada and Travelodge - would love a single, enterprisewide view of all its customers. But five years of work on building a data warehouse delivered virtually nothing, because no one was using it. By now you can guess the culprit: dirty data.

"Basically, the data warehouse was being used for list generation by two people in marketing," says Vincent Kellett, senior director of data services, who was hired in 2002 to see if the project could be revived. "Because of data quality issues, the project was dying on the vine."

To make the system viable, Kellett realized the company would have to throw out a bunch of hard-to-maintain custom code, spend money on cleaning up some truly horrible data and institute formal processes for data maintenance. Even basic procedures such as subscribing to the national change-of-address database maintained by the US Postal Service had been overlooked by the project team. "They'd been so mired down in day-to-day problems that they just hadn't got round to it," he says.

Data-cleaning software from Trillium Software was pressed into service. The database originally contained 132 million records, a number that was eventually boiled down to 90 million "that at least had a name and a street address," Kellett says. At each cycle of the data-cleaning process, his team formulated new rules for detecting and correcting duplicates, then tested them in a trial run. Further winnowing, by matching against the latest change-of-address information, eventually reduced the total to closer to 80 million cleaned records, which were loaded into the data warehouse.
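
To make the winnowing concrete, here is a minimal Python sketch of one cleaning pass: drop records that lack a name or a street address, then collapse records that become identical once case and spacing are normalized. It is an illustration only, not Trillium's or Cendant's actual logic; the `RawRecord` type, the `normalize` helper and the keep-the-first-record rule are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RawRecord:
    # Hypothetical record layout; the real schema is not described in the article.
    name: str
    street: str
    phone: str = ""


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivially different spellings compare equal."""
    return " ".join(text.lower().split())


def clean(records):
    """Drop records with no name or no street address, then collapse exact duplicates."""
    seen = {}
    for rec in records:
        name, street = normalize(rec.name), normalize(rec.street)
        if not name or not street:
            continue  # unusable: missing a name or a street address
        # Assumption: the first record seen for a (name, street) key is kept.
        seen.setdefault((name, street), rec)
    return list(seen.values())


raw = [
    RawRecord("Bob Smith", "12 Oak St"),
    RawRecord("bob  smith", "12 OAK ST", phone="555-0100"),  # duplicate after normalizing
    RawRecord("", "99 Elm Rd"),                               # no name
    RawRecord("Ann Lee", ""),                                 # no street address
]
cleaned = clean(raw)
print(len(raw), "->", len(cleaned))
```

A real pass would go well beyond exact matching on normalized text, which is why the rules had to be refined and retested cycle after cycle.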

When a customer checks into a hotel or picks up a rental car and a new record is created, the system asks: Do we know this person? If so, it loads any new information - such as a change of address or phone number - and then updates the customer's transaction data with the additional stay or car rental. That information is automatically integrated with the rest of the customer database.
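
The "Do we know this person?" step amounts to a lookup-then-update routine. The sketch below shows that flow in Python under loud assumptions: the `Customer` and `CustomerStore` names are hypothetical, the store is an in-memory dictionary, and matching on a normalized name alone is a placeholder for whatever matching rule the business actually agrees on.

```python
from dataclasses import dataclass, field


@dataclass
class Customer:
    # Hypothetical fields chosen for the example.
    name: str
    address: str
    phone: str = ""
    transactions: list = field(default_factory=list)


class CustomerStore:
    """In-memory stand-in for the customer database."""

    def __init__(self):
        self._by_key = {}

    @staticmethod
    def _key(name: str) -> str:
        # Placeholder matching rule: normalized full name only.
        return " ".join(name.lower().split())

    def record_event(self, name, address, phone, transaction):
        """Find or create the customer, refresh contact details, log the transaction."""
        key = self._key(name)
        customer = self._by_key.get(key)
        if customer is None:
            # Unknown person: create a new customer record.
            customer = Customer(name=name, address=address, phone=phone)
            self._by_key[key] = customer
        else:
            # Known person: load any new information, keeping old values
            # where the new event supplies nothing.
            if address:
                customer.address = address
            if phone:
                customer.phone = phone
        customer.transactions.append(transaction)
        return customer


store = CustomerStore()
store.record_event("Robert Smith", "12 Oak St", "555-0100", "hotel stay")
known = store.record_event("robert smith", "48 Lake Rd", "", "car rental")
print(known.address, known.transactions)
```

Here the second event finds the existing customer, replaces the address, keeps the phone number already on file, and appends the car rental to the transaction history.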

Key to the project's completion was a decision to closely involve the business owners of the data (in this case, the marketing department) in developing the data-cleaning standards. "I'd advise [anyone working on data quality] to work closely with the business users to define the matching rule," Kellett says. "What constitutes a match? Last name, or last name and first name? Or these, plus a matching credit card? And when a duplicate is detected, what rules determine which record will be the survivor record? For instance, is Bob Smith the same as Robert Smith? And is the new address revealed by a car rental due to a house move, or the acquisition of a summer cottage? Or just the wrong address completely?"
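
Kellett's questions translate into two separate decisions: a matching rule and a survivorship rule. The Python sketch below shows one possible form of each. Everything in it is an assumption for illustration: the `Record` fields, the tiny `NICKNAMES` table, the option of requiring a matching credit card, and the "most recently updated record wins" survivor rule. None of it is presented as the rules Cendant settled on.

```python
from dataclasses import dataclass
from datetime import date

# Illustrative nickname table; a production list would be far larger.
NICKNAMES = {"bob": "robert", "rob": "robert", "bill": "william"}


@dataclass
class Record:
    # Hypothetical fields chosen for the example.
    first: str
    last: str
    card_last4: str
    address: str
    updated: date


def canonical_first(name: str) -> str:
    """Map a nickname to its formal form so that Bob and Robert compare equal."""
    cleaned = name.strip().lower()
    return NICKNAMES.get(cleaned, cleaned)


def is_match(a: Record, b: Record, require_card: bool = True) -> bool:
    """Match on last name plus canonical first name, optionally plus credit card."""
    same_name = (
        a.last.strip().lower() == b.last.strip().lower()
        and canonical_first(a.first) == canonical_first(b.first)
    )
    if not same_name:
        return False
    return a.card_last4 == b.card_last4 if require_card else True


def survivor(a: Record, b: Record) -> Record:
    """One possible survivorship rule: the most recently updated record wins."""
    return a if a.updated >= b.updated else b


old = Record("Bob", "Smith", "4242", "12 Oak St", date(2001, 3, 1))
new = Record("Robert", "Smith", "4242", "48 Lake Rd", date(2003, 6, 15))
other = Record("Robert", "Smith", "9999", "7 Pine Ave", date(2002, 1, 10))

print(is_match(old, new))                        # same name, same card
print(is_match(old, other))                      # same name, different card
print(is_match(old, other, require_card=False))  # looser rule: name only
print(survivor(old, new).address)
```

The `require_card` switch shows why the business owners have to be involved: the looser rule merges two Robert Smiths who may be different people, while the stricter one misses a customer who changed cards. Likewise, a most-recent-wins survivor rule would replace a home address with a summer cottage, which is exactly the kind of case Kellett raises.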

The turnaround in the project's fortunes has been so complete, Kellett says, that Cendant has been able to launch a loyalty program across nine of its chains - including Days Inn, Howard Johnson, Ramada, Super 8 Motel and Travelodge. Customers can now collect points (much like frequent flier miles) every time they stay in a Cendant hotel. Such a program would have been impossible without the single customer view that the cleaned-up data warehouse provides.
