Active returns from the warehouse
An active data warehouse is an evolutionary step forward
from a traditional data warehouse. Stephen Brobst, the Chief Technology
Officer of Teradata, talks about the benefits of this technology. by Soutiman
How should an organisation go about building a data warehouse?
Successful data warehouse implementations deliver business value on an iterative
and continuous basis. Each iteration builds upon its predecessor to increase
the business value of information delivery. This explains the combined role
of traditional and active data warehousing. Rather than two different poles,
traditional and active data warehousing are integral parts of the data warehousing
In the early days, data warehousing focused almost entirely on providing strategic
decision-making capability to knowledge workers in the corporate ivory tower.
End users for the data warehouse were traditionally from areas such as marketing,
strategic planning and finance.
There is a lot of buzz around active data warehousing.
How does this differ from traditional data warehousing?
As data warehouse deployments evolve to improve the execution of a business
strategy, they impose an ever-increasing set of service levels upon the data
warehouse architect, and data warehousing becomes an active process.
The underlying philosophy for implementing an active data warehouse is to increase
the speed and accuracy of business decisions. The goal is to achieve decision-making
as near real-time as necessary to deliver maximum value. The key point is to
let the business application drive the data freshness and performance service
levels.
Is an active data warehouse more efficient?
An active data warehouse provides a new breed of decision support. The active
data warehouse differs from a traditional data warehouse on many levels.
First, the active data warehouse is a business-critical system for an organisation.
Since it supports operational decision-making, downtime cannot be tolerated.
In the traditional world of data warehousing, downtime is certainly undesirable
and decreases productivity for knowledge workers, but it is not usually considered
When a traditional data warehouse is unavailable, decisions are deferred until
it comes back up. Yet because those decisions are long-term in nature, the bottom-line
cost of deferring a decision is not overly significant. On the other hand, the
opportunity cost for downtime in an active data warehouse is very high because
operational decisions cannot be deferred.
Does downtime have a substantial impact on an active data warehouse?
When an active data warehouse is unavailable, operational decisions are made
without the benefit of quantitative decision support. The impact of downtime
on an active data warehouse is the difference between optimised decisions and
In an enterprise deployment, the cost of downtime is very high and cannot
be easily recovered. Like a traditional data warehouse, a best-of-breed active
data warehouse spans functional and departmental boundaries within an organisation.
It provides a single source of truth for both tactical and strategic decision
What makes the design of an active data warehouse special?
An active data warehouse is designed to support enterprise-level business objectives,
and typically reaches further into the organisation than traditional data warehousing.
This often means integration with multiple channels across the organisation
such as the Web, call centre, and other customer touch points.
A key feature of an active data warehouse is to reduce the time between critical
business events and resultant actions. It is essential that the data analysis
that takes place in an active data warehouse be translated to actionable decisions
to maximise the value proposition from its deployment.
An active data warehouse enables whatever service levels are appropriate, and
should be able to scale to an enterprise level in doing so. Despite this, when
designing an application, it is important to be prudent in matching desired
service levels with the application requirements.
An active data warehouse, when architected properly, allows
each workload to be assigned its own service levels in order to optimise the
economic equation between business value delivery and capacity requirements
for delivering to the desired service levels.
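
As a rough illustration of per-workload service levels, the sketch below records a data-freshness target and a response-time target for each workload and checks an observation against them. The workload names and the numbers are hypothetical, chosen only to show the idea; they do not come from any particular product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceLevel:
    max_data_age_s: float   # how stale the data may be, in seconds
    max_response_s: float   # how long a query may take, in seconds


# Hypothetical workloads and targets, for illustration only.
SERVICE_LEVELS = {
    "call_centre_lookup": ServiceLevel(max_data_age_s=60, max_response_s=1),
    "monthly_finance_report": ServiceLevel(max_data_age_s=86_400, max_response_s=600),
}


def meets_service_level(workload: str, data_age_s: float, response_s: float) -> bool:
    """Return True when an observed query satisfies its workload's targets."""
    level = SERVICE_LEVELS[workload]
    return data_age_s <= level.max_data_age_s and response_s <= level.max_response_s
```

Data that is an hour old would satisfy the reporting workload here but not the call-centre one, which is the sense in which the application, rather than the platform, drives the service level.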
What role do ETL and EAI tools play in a data warehouse?
From a business perspective, Enterprise Application Integration (EAI) means
providing unrestricted sharing of data and business processes among connected
applications and data sources in the enterprise.
To realise this goal, architects must provide a technical infrastructure capable
of combining business processes, software and hardware platforms, and standards
to allow seamless integration of two or more enterprise systems so that they
operate as one or at least provide the illusion of doing so.
A variety of industry buzzwords all focus on this goal. Web services, message
brokers, application servers, and middleware tools all provide aspects of the
infrastructure necessary to realise the EAI vision.
Extract, Transform, Load (ETL) is generally used to move large sets of data,
transforming them mid-stream and loading them into the target system. ETL is
usually a pull system; however, some vendors are heading toward push/pull ETL.
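
A minimal sketch of the pull-style ETL flow described above, assuming an in-memory list as the source and another list as the target table. The field names (`store`, `amount`) and the clean-up rules are invented for illustration.

```python
def extract(source_rows):
    """Pull rows from the source system (here, an in-memory list)."""
    yield from source_rows


def transform(rows):
    """Reshape each row mid-stream, before it reaches the target."""
    for row in rows:
        yield {
            "store_id": row["store"].strip().upper(),
            "amount_cents": round(float(row["amount"]) * 100),
        }


def load(rows, target):
    """Append the transformed rows to the target table (here, a list)."""
    target.extend(rows)
    return len(target)


# Hypothetical source data.
source = [{"store": " s01 ", "amount": "19.99"}, {"store": "s02", "amount": "5.00"}]
warehouse_table = []
load(transform(extract(source)), warehouse_table)
```

Because the steps are chained generators, each row is transformed while in transit and only the cleaned form ever reaches the target.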
ETL has become a commodity in the marketplace. Nearly every data-warehousing
vendor offers it, and most databases understand what it takes to prepare data
for loading into a data warehouse.
Is the difference between ETL and EAI tools blurring?
Thanks to active data warehousing, data is bypassing ETL completely and being
deposited by EAI and other mechanisms directly into the enterprise data warehouse,
requiring that transformation be embedded in the DBMS. This may not, however,
be true for data warehouses that still operate in batch mode.
Volume, latency and functionality have hit a convergence point. Businesses no
longer have the time to perform substantial transformation on massive data volumes
before making tactical decisions. Traditional ETL methods create bottlenecks
because in some cases they offer a single-point solution for either batch or
near real-time, but do not offer a complete view of the data.
In order to survive going forward, ETL engines are shifting their focus to either
Extract Transform Load Transform (ETLT) or Extract Load in Real-time with Dynamic
restructuring capabilities (ELRD).
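
To make the idea of transformation inside the DBMS concrete, the sketch below loads raw rows first and then reshapes them with SQL in the database itself. It uses Python's built-in `sqlite3` module as a stand-in for the warehouse DBMS; the table and column names are assumptions made for illustration, and it is not a description of any vendor's ETLT or ELRD product.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stands in for the warehouse DBMS
conn.execute("CREATE TABLE raw_sales (store TEXT, amount TEXT)")
conn.execute("CREATE TABLE sales (store_id TEXT, amount_cents INTEGER)")


def load_raw(rows):
    """Load step: deposit untransformed rows straight into a staging table."""
    conn.executemany("INSERT INTO raw_sales VALUES (?, ?)", rows)


def transform_in_database():
    """Transform step: run inside the DBMS, after the data has landed."""
    conn.execute(
        "INSERT INTO sales "
        "SELECT UPPER(TRIM(store)), "
        "CAST(ROUND(CAST(amount AS REAL) * 100) AS INTEGER) "
        "FROM raw_sales"
    )
    conn.execute("DELETE FROM raw_sales")  # staging rows are now processed


# Hypothetical incoming data.
load_raw([(" s01 ", "19.99"), ("s02", "5.00")])
transform_in_database()
rows = conn.execute(
    "SELECT store_id, amount_cents FROM sales ORDER BY store_id"
).fetchall()
```

The load no longer waits on the transformation, so data can arrive continuously and be reshaped afterwards by the database engine.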
Will these technologies transform a company's information
An active data warehouse deployment relies upon EAI infrastructure for both
data acquisition and decision delivery. It requires extremely up-to-date data
from the transactional processing systems within an organisation.
Advanced EAI infrastructure can facilitate real-time or near real-time data
acquisition. The EAI infrastructure provides a bridge between the world of bookkeeping
and decision-making. When a business event is recorded in the bookkeeping systems,
the EAI infrastructure updates the decision-making environment (active data
warehouse) on a real-time basis.
The bridge works in reverse as well. When analytic applications in a tactical
decision-support implementation detect the need for an action, the EAI infrastructure
is used to deliver decisions to the OLTP systems that will be responsible for
the associated bookkeeping activities to make each proposed action a reality.
EAI with process integration allows for closed-loop decision-making.
Data fed from the bookkeeping environment into the active data warehouse will
cause event-based triggers to fire based on business rules, and initiate decisions
that are fed back into the operational bookkeeping systems for execution.
Give us an example of a closed-loop decision-making
For example, consider a retail environment where purchase transactions are captured
using an electronic Point-of-Sale (ePOS) network in thousands of stores distributed
across an extensive geography.
Transactions from the ePOS systems are published to an EAI message bus as they
occur in the stores. These business events are then delivered to all appropriate
subscribers, including the active data warehouse.
Under certain conditions, such as when sales trends indicate a rapid depletion
of inventory, the business rules embedded in the analytic capabilities of the
active data warehouse will arrive at a decision to order additional items for
delivery to those stores that would otherwise end up with empty shelves.
This inventory ordering decision is published on the EAI message bus, and
systems such as ERP and general ledger subscribe to it in order to take part
in the realisation of the inventory re-order decision. EAI provides the glue
to facilitate the closed loop cooperation between the bookkeeping and decision-making
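
The retail loop above can be sketched in a few lines, assuming a toy in-memory publish/subscribe bus in place of real EAI middleware. The topic names, the `ActiveWarehouse` class and the fixed reorder threshold are all hypothetical; a real deployment would base the rule on sales trends rather than a simple stock count.

```python
from collections import defaultdict


class MessageBus:
    """Toy in-memory publish/subscribe bus standing in for EAI middleware."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, message):
        for handler in list(self._subscribers[topic]):
            handler(message)


class ActiveWarehouse:
    """Tracks stock from sale events and publishes reorder decisions."""

    def __init__(self, bus, stock, reorder_point, reorder_qty):
        self.bus = bus
        self.stock = dict(stock)          # (store, item) -> units on hand
        self.reorder_point = reorder_point
        self.reorder_qty = reorder_qty
        self.on_order = set()             # keys already reordered
        bus.subscribe("pos.sale", self.on_sale)

    def on_sale(self, sale):
        key = (sale["store"], sale["item"])
        self.stock[key] -= sale["qty"]
        # Simplified business rule: reorder once stock hits the threshold.
        if self.stock[key] <= self.reorder_point and key not in self.on_order:
            self.on_order.add(key)
            self.bus.publish(
                "inventory.reorder",
                {"store": key[0], "item": key[1], "qty": self.reorder_qty},
            )


bus = MessageBus()
erp_orders = []                           # stands in for the ERP system
bus.subscribe("inventory.reorder", erp_orders.append)
warehouse = ActiveWarehouse(bus, {("S01", "milk"): 5}, reorder_point=2, reorder_qty=20)
for _ in range(4):                        # four ePOS sales of one unit each
    bus.publish("pos.sale", {"store": "S01", "item": "milk", "qty": 1})
```

The same bus carries events in both directions: sales flow into the decision-making side, and the resulting reorder flows back to the bookkeeping side, which is what closes the loop.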
Soutiman Das Gupta can be reached