Operation Clean Data

Define the Rules in Advance

While the British defence ministry is still cleaning up its data warehouses to generate a more consistent view of military supply items, commercial companies are employing much the same technology to develop an enterprisewide view of their customers. In some cases, the demand for consistently clean data comes from the customers themselves, who want to see how their business is performing in real time. For example, in these economically constrained times, the corporate customers of Carlson Wagonlit Travel, one of the largest travel agencies in the world, are eager for good-quality data on exactly how their travel and expenses budgets are being spent. Indeed, building a data warehouse that can deliver such information has become a competitive differentiator in the industry, says Jay Vetsch, senior director of information delivery at Carlson.

The task for Vetsch and his team was daunting. With annual sales of $US10.5 billion and operations spread over 140 countries, the agency has high data volumes: 14 million airline tickets and 12 million hotel nights booked every year, and so on. While the raw number of transactions per day (around 60,000) is manageable, each record often equates to a trip with several flights, hotel stays and rental car reservations. Thus, the record size is massive, around 400 fields.

Worse, the data must be extracted from a number of different back-office systems spread across the business. What's more, the data is subject to the inputting vagaries of the front-office operators in those 140 countries - not just human vagaries, but also differences in legal, tax and accounting regulations. And from the point of view of the people generating the data, Vetsch's task is not mission-critical.

"You have to remember that the information is being generated for the purpose of getting a traveller a ticket - not for an MIS system to provide reports to clients," he says.

As a result, the data can contain errors - an invalid supplier code, client code or a fare discrepancy - not major enough to prevent a ticket from being issued, but flawed enough to foul up an analysis. Vetsch relies on software that acts as a gate guardian to the data warehouse. If a record meets defined data quality criteria, it's allowed to proceed. If it doesn't, it's kicked back to the originating office for correction.
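The gate described above can be sketched in a few lines of Python. This is only an illustration of the general idea, not Carlson's software: the field names (supplier_code, client_code, quoted_fare, ticketed_fare), the reference code sets and the one-cent fare tolerance are all assumptions made for the example, since the article does not describe the actual rules.

```python
from dataclasses import dataclass, field

# Hypothetical reference data; the real supplier and client codes are not given.
VALID_SUPPLIERS = {"BA", "QF", "HILTON"}
VALID_CLIENTS = {"ACME", "GLOBEX"}


@dataclass
class GateResult:
    accepted: list = field(default_factory=list)  # records allowed into the warehouse
    rejected: list = field(default_factory=list)  # (record, problems) pairs to send back


def check_record(record):
    """Return the list of rule violations; an empty list means the record passes."""
    problems = []
    if record.get("supplier_code") not in VALID_SUPPLIERS:
        problems.append("invalid supplier code")
    if record.get("client_code") not in VALID_CLIENTS:
        problems.append("invalid client code")
    # Assumed rule: quoted and ticketed fares must agree to within one cent.
    quoted = record.get("quoted_fare")
    ticketed = record.get("ticketed_fare")
    if quoted is None or ticketed is None or abs(quoted - ticketed) > 0.01:
        problems.append("fare discrepancy")
    return problems


def gate(records):
    """Split records into those that pass every rule and those that fail any."""
    result = GateResult()
    for record in records:
        problems = check_record(record)
        if problems:
            result.rejected.append((record, problems))
        else:
            result.accepted.append(record)
    return result
```

Keeping the reasons alongside each rejected record matters in a set-up like this one, because the originating office needs to know what to correct before it resubmits.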

Data from Europe, where the company has offices in most countries, is already being used on a limited basis to generate client reports. Company agents in North America and the remainder of the countries in which Carlson has offices should be able to generate such reports by early 2005. Vetsch declined to disclose the projected ROI for the cleansing effort. However, if good-quality client reports have become the price of getting corporate business, then it's a bold manager who'd argue that the investment was anything other than the price of survival.
