EXCERPTS FROM A WHITE PAPER (formal, Fortune 500 style)
In the 19th century, human ingenuity created the Industrial Revolution, where machines replaced physical labor. In the 20th century, it created the Information Revolution, where computers replaced manual calculations. In the 21st century, it is creating the Knowledge Revolution, where autonomic systems will replace tedious decisions based on rules and real-time input from the environment.
In its own way, each revolution increased the intellectual potential of every human being. The first freed us from backbreaking labor, the second from mind-numbing mental tasks. The third is freeing us from rote decision-making and environmental monitoring, enabling entrepreneurs and Fortune 500 firms alike to flourish as never before.
…
Data comes in two fundamental types. Structured data is stored in conventional flat-file or relational databases and is accessible through standard database functions. Unstructured data comprises most data that exists; it is produced by standard, non-database applications such as Adobe Acrobat and Microsoft Office (Word, Excel, PowerPoint), and it includes email, web logs, multimedia files, voice mail, non-digitized data on forms or off-line (90% of unstructured data falls into this category) and more. The existence of such disparate types of data drives the requirement for information integration. For example, a database application can handle structured data and a content management application can handle unstructured data. An information integration application then puts it all together.
….
DATA MANAGEMENT SYSTEMS
Conventional databases do a good job of managing structured data, but a notoriously poor job of understanding it, translating it into information and leveraging it into actionable knowledge. They also completely ignore unstructured, content-rich information. [Company X] believes that the corporate imperative is no longer one of managing homogeneous data, but of fusing structured and unstructured data together into actionable information. A data management system built for this purpose would dissolve the now artificial boundaries between business and technology strategy. The system would provide:
1. The ability to treat all databases as one, regardless of vendor. Thus, you need an open, resilient database management system that supports flat files, relational data and structured queries across multiple database platforms.
2. An application that accesses data beyond the databases. In other words, you need access to the data in digital multimedia files, rich content, email, documents, spreadsheets and more. This means a content management system that searches, captures, categorizes, integrates, stores, retrieves and uses heterogeneous, unstructured data.
3. A unified, single view of all relevant data. This requires an information integration architecture that supports traditional databases and content managers and that enables specialized applications for business intelligence, data mining and knowledge management.
Since the real world contains heterogeneous systems and applications, the system must vigorously support open standards, platforms and applications. With open computing, DBAs can spend more time creating new solutions and less time working in “emergency mode.” Therefore, each element of a proper information architecture must work with whatever hardware and software you have, from mainframes to workstations to desktops, from MVS to Linux to Windows, from DB2 to Oracle to Sybase to SQL Server.
4. Federated capabilities that give business intelligence applications all the relevant data they need on demand. Business intelligence requires a full view of the data; without that, your decisions are more like guesswork. The most important federated capability is simultaneously querying multiple, heterogeneous, distributed data sources and viewing your results through a single filter or view. (Other federated capabilities include searching, storing and securing data.) The alternative to a unified query is a series of sub-queries, but their results have to be manually merged and purged. This is particularly problematic when business insight is needed in the context of mergers, acquisitions and divestitures.
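The federated query described in item 4 can be illustrated with a small sketch. It is not a description of any vendor's product: the in-memory SQLite database stands in for a relational platform, the CSV text stands in for a flat file, and every name in it (`inventory`, `FLAT_FILE`, `federated_query`, the sample rows) is a hypothetical chosen for illustration. The point is only that one logical query is sent to two unlike sources and the results come back as a single view, rather than as sub-query results merged by hand.

```python
import csv
import io
import sqlite3

# Hypothetical source 1: a relational database. In-memory SQLite stands in
# for a platform such as DB2, Oracle, Sybase or SQL Server.
def make_relational_source():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE inventory (sku TEXT, region TEXT, units INTEGER)")
    conn.executemany(
        "INSERT INTO inventory VALUES (?, ?, ?)",
        [("A-100", "east", 40), ("B-200", "east", 15)],
    )
    return conn

# Hypothetical source 2: a flat file, represented here as CSV text.
FLAT_FILE = "sku,region,units\nA-100,west,25\nC-300,west,60\n"

def query_relational(conn, min_units):
    rows = conn.execute(
        "SELECT sku, region, units FROM inventory WHERE units >= ?", (min_units,)
    )
    return [
        {"sku": s, "region": r, "units": u, "source": "relational"}
        for s, r, u in rows
    ]

def query_flat_file(text, min_units):
    reader = csv.DictReader(io.StringIO(text))
    return [
        {"sku": row["sku"], "region": row["region"],
         "units": int(row["units"]), "source": "flat_file"}
        for row in reader
        if int(row["units"]) >= min_units
    ]

def federated_query(conn, flat_text, min_units):
    """Apply the same filter to both sources and return one merged, sorted view."""
    merged = query_relational(conn, min_units) + query_flat_file(flat_text, min_units)
    return sorted(merged, key=lambda r: (r["sku"], r["region"]))

conn = make_relational_source()
view = federated_query(conn, FLAT_FILE, min_units=20)
for row in view:
    print(row)
```

Each row carries a `source` field so the caller can still see where a value came from. A production system would also need to handle schema mapping, security and distributed failures, which this sketch leaves out.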
What is needed is a way to ensure high data quality for both structured and unstructured data. Today, data that purport to describe the same attribute (e.g., inventory levels, customer returns, and geographic sales forecasts) often directly conflict with each other. The data management system must eliminate these islands of conflicting data and produce consistent answers to queries.
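As a minimal sketch of the first step toward that consistency, the following groups values reported for the same attribute by different systems and flags the attributes on which they disagree. The source names (`warehouse_db`, `erp_export`), the attribute keys and the numbers are all hypothetical, and the sketch only detects conflicts; how a conflict should be resolved is a business decision the excerpt does not specify.

```python
from collections import defaultdict

# Hypothetical records: (source system, attribute key, reported value).
reports = [
    ("warehouse_db", "inventory:A-100", 40),
    ("erp_export", "inventory:A-100", 37),
    ("warehouse_db", "inventory:B-200", 15),
    ("erp_export", "inventory:B-200", 15),
]

def find_conflicts(records):
    """Group values by attribute and return the attributes whose sources disagree."""
    by_attribute = defaultdict(dict)
    for source, attribute, value in records:
        by_attribute[attribute][source] = value
    return {
        attribute: values
        for attribute, values in by_attribute.items()
        if len(set(values.values())) > 1
    }

conflicts = find_conflicts(reports)
print(conflicts)
```

Here only `inventory:A-100` is flagged, because the two systems report different unit counts for it, while `inventory:B-200` agrees across both.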