Where an enterprise application landscape, viewed from a Data Integration perspective, consists of a classic point-to-point integration model, it “does not represent an effective application integration solution.” (Linthicum, 2004). An alternative, referred to as a Centralised Architecture and known as Data Virtualisation, aims to supplant the point-to-point integration architecture and remedy its woes by providing applications with a single access point to data.
Business agility within an Information Technology enabled enterprise is entwined with its enabler, the Information Technology department. Where Information Technology cannot perform, business suffers. Where Information Technology is non-optimal, business loses revenue. The goal is the total quality management concept of W. Edwards Deming, i.e. to reduce waste through the optimisation of production.
The reduction of waste is achieved by implementing and running lean systems. These systems should use enterprise and industry standard practices and technologies to optimise the enterprise architecture. For Data Integration, such a concept is known as Data Virtualisation. The enterprise solution for Data Integration expressed here is the Data Delivery Platform, as described by Rick L. van der Lans in his book “Data Virtualization for Business Intelligence Systems”. The main goal of the Data Delivery Platform is to offer a single point of data access for all applications. By being constrained to this single access concept, all services, technologies, modelling and profiling of data are controlled and made available through one structure. This consolidates the business logic and yields many beneficial features, such as Data Quality, Master Data Management, Metadata Management, and logical enterprise data consolidation, over and above the benefits of the single access point. Implementing Data Virtualisation does not imply that all systems must be moved away from the current integration architecture at once; the organisation can do this over a period that allows for comfortable and sustainable change. The organisation may take this opportunity to define its data landscape more thoroughly, learning along with the change.
Through Data Virtualisation, Application Integration is facilitated by its Data Integration as a logical subset. The Enterprise Architecture is optimised by reducing integration points and potentially consolidating infrastructure, cutting the footprint and the associated spend. As the enterprise identifies business optimisation opportunities from the insights that Data Virtualisation provides, physical data consolidation may follow. Further optimisations may therefore yield extra savings, not only from an analysis perspective but also in infrastructure, because specific data characteristics can be more adequately matched to hardware and disk formats, providing additional performance gains for reporting requirements.
The core performance benefit of Data Virtualisation lies in the technology. The implementation can be augmented with an in-memory data store that serves as a data cache. Along with the intrinsic ability of Data Virtualisation software, i.e. its own query engine with which to optimise queries over heterogeneous data sources, the in-memory structure may serve the query engine with reference data, cached transactional data, or query-enhancing cached results to aid a specific report or query.
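To make the caching idea concrete, the following minimal sketch (in Python, with hypothetical names, and an SQLite table standing in for a real source system) shows a query engine that answers repeat queries from an in-memory cache and only reaches down to the underlying source on a miss. It illustrates the principle only and does not depict any particular Data Virtualisation product.

```python
# Minimal cache-aside sketch (hypothetical names): the query engine checks an
# in-memory store before falling back to the slower underlying source.
import sqlite3


class CachedQueryEngine:
    def __init__(self, source_conn):
        self.source = source_conn          # underlying data source
        self.cache = {}                    # stand-in for an in-memory data store

    def query(self, sql, params=()):
        key = (sql, params)
        if key in self.cache:              # cache hit: serve from memory
            return self.cache[key]
        rows = self.source.execute(sql, params).fetchall()  # miss: hit the source
        self.cache[key] = rows             # keep the result for later queries
        return rows


# Throw-away SQLite source standing in for a real system of record.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE currency (code TEXT, name TEXT)")
conn.executemany("INSERT INTO currency VALUES (?, ?)",
                 [("ZAR", "Rand"), ("USD", "US Dollar")])

engine = CachedQueryEngine(conn)
print(engine.query("SELECT * FROM currency WHERE code = ?", ("ZAR",)))  # from source
print(engine.query("SELECT * FROM currency WHERE code = ?", ("ZAR",)))  # from cache
```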
In summary, by not implementing a Data Virtualisation technology, a company’s systems are left to deal with ever-increasing point-to-point integration complexity as they grow, making change difficult to assess and even harder to sustain. Information Technology is left with protracted assessments when change is required and complex implementations of that change, with the ever-present risk of interfering with the functionality of other, unrelated systems. Large point-to-point integrated systems also need a large human resources contingent and infrastructure footprint, increasing expenditure and dependency on non-essential technologies.
This article looks at Data Virtualisation, more specifically the Data Delivery Platform, as an Enterprise Architecture.
An Overview of the Data Delivery Platform
An Enterprise Data Delivery Platform (DDP) is to be understood as defined by Rick L. van der Lans in his book “Data Virtualization for Business Intelligence Systems”, i.e. to facilitate a single access point to data for real-time, ad-hoc, and Service Orientated (message based) architectures.
He defines the Data Delivery Platform as follows: “The Data Delivery Platform is a business intelligence architecture that delivers data and metadata to data consumers in support of decision-making, reporting, and data retrieval, whereby data and metadata stores are decoupled from the data consumers through a metadata-driven layer to increase flexibility, and whereby data and metadata are presented in a subject-orientated, integrated, time-variant, and reproducible style.” (van der Lans R. L., 2012).
“A Data Delivery Platform must give data to a broad range of business applications, business intelligence tools, middleware such as an Enterprise Service Bus, or Business Process Management (BPM), using the access methods, formats and protocols required by these diverse consumers.” (Eve, Robert, 2010).
We should therefore embrace Data Virtualisation and extend the context to include real-time transactions and message based data entities, as inter-system message-formatted data, or SOA Enterprise Service Bus functions. This architecture, call it the Data Delivery Platform, attempts to be a total solution for data delivery in the enterprise, for decision support (data marts or enterprise data warehouse) and historical transactional records.
Data Virtualisation is the process of abstracting the data sources of disparate systems through a unified data access layer. Access to data is thus achieved by connecting to a single source, facilitating data abstraction, i.e. presenting only relevant data (“Rule 9: Logical Data Independence”), and encapsulation, i.e. hiding the access implementation. (Harrington, 2009). A great advantage that Data Virtualisation provides over hub-and-spoke message bus systems is that it copes far better with disaster, i.e. point-failure situations. Data Integration, viewed in this context, is a subset of Application Integration, i.e. allowing applications, contemporary and legacy, to communicate with each other. Data Integration is, however, larger in scope than Application Integration, since it also addresses the integration of data into structures such as data warehouses for the purpose of business intelligence.
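The following minimal sketch (Python, hypothetical names) illustrates the abstraction and encapsulation principle: a consumer asks the unified access layer for a logical “customer” entity and never learns that one attribute lives in a relational table and another in a JSON document. It is an illustration of the concept, not a reference implementation.

```python
# Minimal unified data access layer sketch (hypothetical names): two deliberately
# different "systems of record" are hidden behind one logical entity.
import json
import sqlite3

crm = sqlite3.connect(":memory:")
crm.execute("CREATE TABLE customer (id INTEGER, name TEXT)")
crm.execute("INSERT INTO customer VALUES (1, 'Acme Ltd')")

billing_doc = json.loads('[{"cust_id": 1, "balance": 250.0}]')


class VirtualDataLayer:
    """Single access point: abstraction (only relevant fields are exposed) and
    encapsulation (no caller knows about SQLite or JSON)."""

    def customer(self, customer_id):
        row = crm.execute(
            "SELECT id, name FROM customer WHERE id = ?", (customer_id,)
        ).fetchone()
        balance = next(
            (d["balance"] for d in billing_doc if d["cust_id"] == customer_id), None
        )
        return {"id": row[0], "name": row[1], "balance": balance}


print(VirtualDataLayer().customer(1))  # {'id': 1, 'name': 'Acme Ltd', 'balance': 250.0}
```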
In a current point-to-point Information Technology topology, the DDP (as a single source) replaces the “spaghetti” integration routes between applications and data stores. To a degree, the business logic of the modelling of data via queries (for extraction into applications), as well as the construction of message-based datagrams for legacy, e.g. mainframe, systems, can be facilitated by the DDP. In a point-to-point architecture, this logic is spread over application and database domains, with business logic vested in application code, database objects, and integration servers, e.g. BizTalk. When using BizTalk for SOA message exchanges, messages are sent between systems through the BizTalk integration server, which acts as an Enterprise Service Bus (ESB) service. The ESB idea, therefore, taken up together with Data Virtualisation, collectively becomes a Data Delivery Platform for the Enterprise.
Data Virtualisation, as a concept, is commonly used with Data Integration, Business Intelligence, Service-orientated Architecture Data Services, Cloud Computing, Enterprise Search, and Master Data Management. “It is both a Data Integration approach and a technology. It is used as middleware that leverages high-performance software and advanced computing architecture to integrate and deliver data from multiple, disparate sources in a loosely coupled, logically federated way.” (Eve, 2011).
Data Virtualization has its roots in Business Intelligence. The idea comes from the mid-1990s, when it was known as “virtual data warehousing and marketed as an alternative to traditional data warehouses.” (Brunelli, 2013). Data virtualisation is foundational to Data Integration; it enables fast and direct access to the critical data and reports required and trusted by the business. It is not to be confused with simple and traditional Data Federation. Instead, it is a superset, which must complement existing data architectures to support Business Intelligence Agility (BIA), Master Data Management (MDM) and a Service Orientated Architecture (SOA). It handles the complexity involved in integrating diverse and distributed data. As Forrester Research states in its report: “Data provided through the data services layer is updated, transformed, and/or cleansed when (or before) applications access it. . . Data virtualization solutions provide a virtualized data services layer that integrates data from heterogeneous data sources and content in real-time, near real-time, or batch as needed to support a wide range of applications and processes.” (Yuhanna & Gilpen, 2012).
Unfortunately, any reference to Business Intelligence conjures up images of costly and mostly failed data warehouse implementations. To avoid any undue associations, it is important to state clearly that this article refers to Business Intelligence as any form of reporting that is created from raw business or transactional data. This article aims to explain the concepts and benefits of Data Virtualisation by describing it within a current point-to-point architecture and relating it to the concept of a Data Delivery Platform.
A fundamental principle of computer science lies at the root of data virtualization: “Applications should be independent of the complexity of accessing data. They should not have to bother about where data is located, how it is accessed, or what its native format is. . . The broader context of data virtualization emphasises the abstraction and decoupling principle of E.F. Codd between all data sources, all consumers, all data delivery styles, and all data management functions.” (van der Lans R. L., 2012).
The main goal of Data Virtualisation is to achieve Business Agility. (Davis & Eve, 2011). According to The Data Warehouse Institute (TDWI), 34% of business users are dissatisfied with IT-delivered Business Intelligence capabilities. Van der Lans lists the TDWI’s five factors driving self-service BI as follows:
- Constantly changing business needs: 65%
- IT’s inability to satisfy new requests on time: 57%
- The need to be a more analytics-driven organization: 54%
- Slow and untimely access to information: 47%
- Business user dissatisfaction with IT-driven BI capabilities: 34%
According to the Aberdeen Group’s report “Three Steps to Analytic Heaven (2011)”, 43% of respondents indicate that their “time window for decisions [is] increasingly compressed.” (Aberdeen Group, 2011). Agility in business, i.e. enabling the business through its information technology, is a major hurdle to overcome, and having a common, enterprise data integration layer through Data Virtualisation “can become the heart of most systems.” (White, 2011). Achieving business agility, against the backdrop of a growing volume of data sources and source data, is the foremost pressure demanding a more efficient approach to data utilisation for business intelligence purposes. The Aberdeen Group cites this as a significant pressure on organisations, with 49% of respondents naming it. (Aberdeen Group, 2011).
Data Virtualisation provides the ability to standardise, through layers of abstraction, into a unified logical representation for all consumers throughout the enterprise, as a semantically structured and consistent view. (Loshin, 2010). For this reason, it makes real-time or near-real-time analytics possible, because the Data Virtualisation layer provides access to data without creating new copies of the information, eliminating the need to replicate it and introduce data flaws. One of the differences between data federation and data virtualisation is the latter’s ability to write data back to source. As opposed to some Enterprise Data Integration technologies, which rely on read-only optimisations, a data virtualisation layer provides write-backs.
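A minimal write-back sketch (Python, hypothetical names, SQLite standing in for the system of record) illustrates the difference: an update submitted to the virtual layer is routed to the source that owns the data rather than applied to a copy.

```python
# Minimal write-back sketch (hypothetical names): the virtual layer routes an
# update to the owning source, so there is no divergent copy to reconcile.
import sqlite3

crm = sqlite3.connect(":memory:")
crm.execute("CREATE TABLE customer (id INTEGER, name TEXT)")
crm.execute("INSERT INTO customer VALUES (1, 'Acme Ltd')")


class WritableVirtualLayer:
    def __init__(self, owning_source):
        self.owner = owning_source

    def rename_customer(self, customer_id, new_name):
        # The change is applied to the source of record, not to a replica.
        self.owner.execute(
            "UPDATE customer SET name = ? WHERE id = ?", (new_name, customer_id)
        )
        self.owner.commit()


layer = WritableVirtualLayer(crm)
layer.rename_customer(1, "Acme Limited")
print(crm.execute("SELECT name FROM customer WHERE id = 1").fetchone())  # ('Acme Limited',)
```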
“One of the less obvious advantages of database virtualization is a reduction in the read I/O issued against the underlying physical storage (Storage Area Networks) that ultimately stores the data for virtual databases.” (Hayward, 2013). Also, because the original data is not moved, any data quality corrections can be made in the Data Virtualisation layer without disturbing the source. Such a layer also relieves the architecture of the legacy mainframe concept of overnight batch processing, by mapping through this layer to the original data artefacts. The idea of Data Virtualisation, taking its cue from the definition of Big Data, is to bring the querying of data more into step with the characteristics of Internet search engines. By optimising the search algorithms, the format of the source data becomes less relevant, and so does the structure of a relational database management system. That structure is primarily required so that Structured Query Language (SQL) can extract specific data records, but as authority passes to intelligent search engines, the databases (data sources) are relieved of having to impose any structure at all. The Data Virtualisation technology becomes the single point of structure: through abstraction it hides whatever the underlying data structures are, and through encapsulation, the implementation details of the objects. The query-ability of the data repositories is consequently no longer vested in their structure and in SQL conformance, but rests in the single point of query, i.e. the Data Virtualisation layer, which uses the data model and the search algorithms to retrieve data from relational and non-relational stores seamlessly.
Optimisation does not have to end with the Data Virtualisation layer, but can also spill over to the physical data stores. Because Data Virtualisation also provides the ability to write back, that is, to change the original data, optimisation need not stop within the logical data model of the virtual database. Data may also be consolidated physically. Consolidation is a boon arising from Data Virtualisation’s ability to give insight into enterprise data, teaching the business and IT where optimisations may be effected.
Data consolidation is a major undertaking at the physical data store level. Data Virtualisation provides an advantage for analysing data, because of its consolidation ability, and consequently supports strategic moves and decisions. Data cleansing and enrichment are achieved first through the Data Virtualisation layer, in that one consistent version is made available to all consumers. The underlying data could be semantically inconsistent, as a result of different business units’ differing interpretations of the same business data concepts, such as a customer. Resolving such inconsistencies is a by-product of the Data Virtualisation layer. The canonical result is a conformed representation derived from the many attributes of the concept, across all the data sources, grouped into semantically similar entities.
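The following sketch (Python, with hypothetical attribute names) illustrates such conformance: two business units describe the same customer with different vocabularies, and the layer maps both into one canonical representation served to every consumer.

```python
# Minimal conformance sketch (hypothetical field names): source-specific attributes
# are renamed into the canonical business vocabulary before being served.
sales_record = {"cust_no": "C-001", "cust_name": "ACME LTD", "tel": "0115550100"}
support_record = {"customer_id": "C-001", "name": "Acme Ltd.", "phone": "+27 11 555 0100"}


def to_canonical(record, mapping):
    """Rename source-specific attributes to the canonical names."""
    return {canonical: record[source] for source, canonical in mapping.items()}


# The canonical customer takes its id and name from sales, its phone from support.
canonical_customer = {
    **to_canonical(sales_record, {"cust_no": "id", "cust_name": "name"}),
    **to_canonical(support_record, {"phone": "phone"}),
}
print(canonical_customer)  # {'id': 'C-001', 'name': 'ACME LTD', 'phone': '+27 11 555 0100'}
```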
“Enterprise data quality can be viewed as a Gordian knot and data virtualization provides a fundamental approach to slice through this barrier to consistency and finesse the organizational challenges associated with data quality improvement. Any data quality team faced with the challenges discussed in this paper would benefit from considering data virtualization as an approach to implementing data quality control and consistency.” (Loshin, 2010). But, “To get data virtualization right, it requires a deep and thorough understanding of very complex problems that exist in the data integration domain.” (van der Lans R. L., 2012). Consider the architectural diagram: Enterprise Application Integration with Data Delivery Platform, depicting a physical data consolidation for transactional, historical, and summarised (market and enterprise) data, with data marts and enterprise data warehouse.
In summary, and in accordance with a broad view of Data Virtualisation, the aim is to combine heterogeneous sources of data, through a logical model, into a single point of access. This layer has a universal view of the data, which gives it the ability to consolidate objective business views of data entities. Mechanisms such as meta- and master data management allow enterprise data consolidation practices to profile enterprise data to the extent that feasible changes to the enterprise business taxonomy become possible. Where data is frequently exported to other systems, and source data quality problems occur that require a retro-fix, the re-introduction of the repaired data creates new issues for downstream systems, such as changes made independently of meta- and master data definitions. Such problems are manifold in silo architectures and increase operational costs, because each time data is extracted and copied, a new opportunity for data flaws is introduced.
Business Drivers and Success Factors
The Enterprise Data Delivery Platform is a move towards agile enterprise business analytics. To achieve this, all enterprise data must be provisioned as information to the business, from which to derive decisions, in the quickest and most cost-efficient way possible.
“The long [Enterprise Data Warehouse] EDW timescale is a concern for many organizations. Businesses want to improve their decision-making today, not in two years’ time. Often, too, individual decision-makers want to see their reports and dashboards immediately, without waiting for the EDW process to complete, and so interim solutions are created. However, the underlying business intelligence requirements change frequently—66% of them at least monthly, according to Forrester. The likelihood is therefore that by the time the EDW is complete, the business’s requirements will have moved on, and any analytics that have already been developed in early iterations will need to be reworked. The traditional way to report rapidly on data within transaction systems is either to download it to Microsoft Excel (with all the inherent problems of uncontrolled data and calculation errors), or else to build the reports directly on the source applications.” (Capgemini, 2013).
The challenges in Information Management, in provisioning data for decision analytics, are twofold: first, the frequently changing nature of requirements, and secondly, the sluggish response by information technology in offering computerised solutions for these requirements within reasonable time. This is mainly because of the impact on data and application integration. The problem, therefore, is vested in the interrelated integration points between applications and their data. Data Virtualisation, in support of real-time and message based data exchanges, supplants the traditional point-to-point integration with a single data access façade, called the Data Delivery Platform. This platform is the concept and technology to vet and prove for implementation, and to resolve those aspects that impede the business in achieving timeous technology solutions to changing business requirements.
On top of business decision-making agility, there is also the cost-based nature of an Information Technology department. The only way to optimise it is to make sure that it runs lean. This can only be achieved through optimisation of performance and reduction of infrastructure.
Achieving economies of scale through a Data Virtualisation optimisation requires that it be justified, not only on cost, but on the business need and on its ability to deliver what it is claimed to be capable of.
Business Drivers for Data Virtualisation in the Organisation
The business must derive benefit from Data Virtualisation to justify the cost and effort of the alternative. A few drivers for data integration through Data Virtualisation are:
- Pressure from business for quick, cost-effective reporting solutions
- Need to build real-time reporting and analytics and self-service Business Intelligence
- Overcome departmental politics to allow applications to share crucial information
- Require an Enterprise Service Bus (ESB) to exchange data between applications
- Integration and combination of multiple data types, e.g. external social and/or unstructured data with internal BI
- Multiple BI tools in use, each with their own presentation layer
Success Factors for a Virtual Database
Once the concept is implemented and the money spent, how is the efficiency of the Data Virtualisation measured? A few factors that could signify success:
- Data Quality: very poor data quality can hamper the performance of the virtual database by forcing it to apply complex cleansing rules
- Complexity of source systems: requires the modelling of complex queries or making data more granular for extraction, both of which impact performance
- Stability of underlying systems: the virtual database is only as good as the underlying databases
- Metadata: shared metadata is essential for ensuring consistency
- Real-time updates: the virtual database should not be employed to make real-time updates to source systems
- Integration Scope: integration must be scoped realistically; for example, mashing real-time and historically granular data for reporting solutions, using extensive business rules and transformations, can cause performance degradation
- Implementation time: give yourself plenty of time to implement the technology (Pella)
- Reduce facility costs: Where physical data stores are consolidated, the hardware infrastructure costs are reduced, hence the total facility cost
- Error Reduction: An integrated data interface minimises complexity and improves operational efficiency. This directly contributes to reducing the chance of error
Benefits of Data Virtualisation
Money spent must procure benefit, and consequently an organisation must attain benefits from the Data Virtualisation implementation, such as:
- Early and iterative business (analyst) involvement
- Makes data available in a unified format, a single environment for data integration and data federation
- The ability to create virtual views with no data movement, but also to easily reuse them for batch processing
- Hides complexities from applications (abstraction) and provides seamless access
- Reduces the data duplication phenomenon by retaining data at source
- Provides a single version of the truth
- Integrates real-time data into the decision (historic) data paradigm
- Provides stepping stone to agile BI (self-service)
- Provides a viable alternative to EDB (enterprise data bus) technologies such as TIBCO and JBossMQ where integration is primarily for reporting or read access
- Can also provide SOA solution with applications accessing the virtual database through web style services
- Can be deployed in a phased manner to gradually build the enterprise model
- A pre-built library of rich ETL-like advanced data transformations
- Carry out functions on the fly, such as data quality
Specific Benefits for Changing a Point-to-Point Architecture
The question that begs answering is what, specifically, a point-to-point organisation would gain by implementing Data Virtualisation as a mechanism for Data Integration. The most important goal to be achieved, for any cost-centred business service, is Operational Efficiency. An Information Technology department is a business enabler that costs the business the amount of money used to provision business services. These costs form part of the spend required to bring products and services to market. Managing this spend, to make it as efficient as possible, requires that its operational processes be as lean as possible by reducing waste as much as possible. (Deming, 1982).
If the current application integration landscape of the organisation’s enterprise architecture, with respect to data integration, is a heterogeneous, disparate, point-to-point architecture, Data Virtualisation can offer significant improvements. Enterprise application integration (EAI) is a term denoting the process of combining separate applications into a cooperating federation of applications. There are two logical architectures by which to achieve this, viz. point-to-point connections (the current architecture) and middleware-based integration, e.g. a Data Delivery Platform.
Data Virtualisation, matured to its final state as an Enterprise Data Delivery Platform, is a type of Centralised Application Integration Architecture. Compared with the traditional point-to-point approach, this design offers the opportunity to optimise the information technology within the business architecture, not only in performance but more specifically in cost, by reducing complexity and infrastructure demands.
Point-to-Point Application Integration
Point-to-Point integration on the middle tier allows an application to link to another by a connecting pipe through which the communication is facilitated, and generally relies on message queues to transport messages. It is limited in that it becomes tedious and ineffective to bind more than two applications, especially because it does not facilitate middle-tier processing, i.e. application logic applied to information flowing through the pipe. It certainly is possible to link more than two applications with traditional point-to-point middleware, but “doing so is generally not a good idea. . . and does not represent an effective application integration solution.” (Linthicum, 2004).
A point-to-point integration’s infrastructure proves brittle because applications are tightly coupled. Changes to this architecture may break any function of the applications involved. Each integration point is an entity to support, and the number of integration points grows much faster than the number of applications being integrated: n applications can require up to n(n-1)/2 connections, so ten applications may need as many as forty-five integration points, where a centralised architecture would need only ten.
Middleware-based Application Architecture
This middleware aims to reduce the interdependence of integrated applications. The integration points exist in a one-to-one ratio with applications, which reduces the risk of errors and the integration maintenance burden. Application logic may be added to the middleware to facilitate complex operations on the travelling data, such as transformations, aggregations, routing, splitting, and converting messages. The only impediment is setting up the middleware and transplanting the point-to-point applications onto it. Forrester Research, however, suggests that this may be a boon rather than an impediment, since organisations can govern the transition, using the time to transform the organisation and thereby reducing change management costs.
By virtue of the word centralised, a centralised architecture is located centrally within the enterprise architecture, allowing applications to interact with data sources via this logical concept. Heterogeneous data sources, spanning relational, NoSQL, file, and service sources, are accessed without any knowledge of the implementation or underlying data structures. Security, from an application user’s perspective, is implemented in this centralised architecture, providing overall security. This becomes the single location through which applications interact with each other and with the enterprise’s data.
Such an architecture improves maintenance and management, and carries the business rules for routing and transformations, confining the debugging of applications to one place. A Data Delivery Platform aims to achieve all the stated advantages of a centralised architecture, but specifically to:
- Improve Data Quality: Meta- and Master- Data Services
- Loosen inter-application dependencies
- Split out transactional, historical, and reference data. This to allow the physical hardware to be consolidated and specified for specific data characteristics, viz. transactional and decision support data
- Provision volatile data cache for real-time processing, allowing for reference and historic transactional data to be cached for performance
- Eliminate the intra-day to history batch processing to do transactional data roll-overs
- Provision data for development and testing, through data masking
- Consolidate Security
- Optimise reporting queries comprising intra-day and historic transaction combinations, by utilising a common data cache and query engine
- Provide an enterprise view of data, through the centralised architecture’s ability to model data by making use of abstraction, including various data sources
- Provide aggregated (summarised) data store(s) for historical record keeping requirements
- Establish a horizontally infinitely scalable architecture, to meet enterprise performance demands for reporting and data services
- Provide a single-point enterprise application architecture for the organisation’s business systems
Data Integration Architecture
Achieving an optimal and efficient enterprise application architecture means integrating applications with respect to their data. The Enterprise Application Integration with Data Delivery Platform figure illustrates such an architecture. Data Integration is achieved through a Centralised Architecture by creating a logical grouping of integration technologies, called the Enterprise Data Delivery Platform, that combines practices such as Data Virtualisation, an Enterprise Service Bus, and Object Relational Mapping (ORM). The latter is a programming technique in which a metadata descriptor is used to connect object code to a relational database, and it is also a type of Data Virtualisation implementation. (van der Lans R. L., 2012).
The drive of this architecture is to merge data access by providing a single point of access to all applications. It also has consolidation in mind, especially for the physical data stores. Transactional and transaction history stores are consolidated into a single store, the historic repository, by injecting an intermediary in-memory intra-day transaction cache as part of the Data Delivery Platform. The historic transaction stores are trickle-fed (asynchronously) from the in-memory cache, which is the primary receptacle of the real-time transactional feeds. The Data Virtualisation technology facilitates queries from applications, for data spanning intraday and historical records, by serving each appropriately.
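The sketch below (Python, with hypothetical names and SQLite standing in for the historic store) illustrates this intraday/history split: real-time transactions land in an in-memory cache, are trickle-fed asynchronously into the historic store, and a query for all trades is served across both from one access point. It is a minimal sketch of the pattern, not of any particular product.

```python
# Minimal intraday/history sketch (hypothetical names): the cache receives real-time
# feeds, the historic store is trickle-fed, and queries span both transparently.
import sqlite3

history = sqlite3.connect(":memory:")          # stands in for the consolidated historic store
history.execute("CREATE TABLE trades (trade_id INTEGER, amount REAL)")
history.executemany("INSERT INTO trades VALUES (?, ?)", [(1, 100.0), (2, 250.0)])

intraday_cache = [(3, 75.0)]                   # today's transactions, held in memory


def trickle_feed():
    """Asynchronously persist cached intraday trades into the historic store."""
    history.executemany("INSERT INTO trades VALUES (?, ?)", intraday_cache)
    intraday_cache.clear()


def all_trades():
    """Serve a query spanning intraday and historical records from one access point."""
    return history.execute("SELECT * FROM trades").fetchall() + list(intraday_cache)


print(all_trades())   # historical rows plus today's cached trade
trickle_feed()
print(all_trades())   # same logical result after the trickle feed
```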
In addition to transactional (intra-day) and transaction history data, a second class of data exists, which may be termed Market Data, or data to sell. This is summarised data, i.e. data in an aggregated form, information, as it is classified for Business Intelligence. Physical data separation, between transactional and analytical data, achieves the goal of separating data characteristics so that hardware and storage types may be differentiated in accordance with the requirements of such data. For example, transactional data stores require different hardware from decision-support data, in that the input-output characteristics dictate that one is written in short bursts while the other is read in long contiguous extents; one cannot use processor parallelism while the other can. Having both on the same hardware is counter-productive, and the same applies to disk formats.
Reference data synchronisation is important in terms of the master-to-slave relationship of the data between the master copy and other consuming systems, such as Complex Event Processing. The Data Delivery Platform, being the single point of data integration, uses the cached (in-memory) data store from which to despatch reference data updates that synchronise other systems. The enterprise synchronisation ensures that the master copy of persisted reference data is up to date at the end and start of each business day. The in-memory cache is refreshed at start of day, and the synchronisation is propagated throughout the enterprise architecture.
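A minimal synchronisation sketch (Python, hypothetical system names) illustrates the start-of-day refresh: the in-memory cache is reloaded from the master copy and the same values are dispatched to every consuming system, so the enterprise works from one version of the reference data.

```python
# Minimal reference data synchronisation sketch (hypothetical names).
master_reference = {"ZAR": "South African Rand", "USD": "US Dollar"}   # master copy

consumers = {"complex_event_processing": {}, "risk_engine": {}}        # consuming systems


def start_of_day_sync(cache):
    cache.clear()
    cache.update(master_reference)            # refresh the in-memory cache from the master
    for system in consumers.values():         # dispatch the update to each consumer
        system.clear()
        system.update(cache)


in_memory_cache = {}
start_of_day_sync(in_memory_cache)
print(consumers["risk_engine"]["ZAR"])        # 'South African Rand'
```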
The Data Virtualisation technology is extended by the in-memory database, which it uses as a cache for reference data, intraday transactional data, and query-beneficial data. In addition to caching and data access, the Data Delivery Platform must also facilitate the functions of an Enterprise Service Bus, by catering for Service Orientated (SOA) message based information exchanges, such as those performed by a Message Bus. In this architecture, Object Relational Mapping (ORM) is deemed a type of Data Virtualisation implementation. Its function is to map a single application to a single data source. This is also achievable via the Data Virtualisation layer, and it remains a moot point whether the Data Virtualisation technology could provide the Object Relational Mapping functions more economically.
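As an illustration of the ORM style of mapping, the sketch below assumes the third-party SQLAlchemy library (version 1.4 or later) is available: a metadata descriptor, here the mapped class, connects object code to a relational table, which is the sense in which ORM can be viewed as a narrow form of Data Virtualisation for a single application and a single data source.

```python
# ORM sketch assuming SQLAlchemy 1.4+: the Customer class is the metadata descriptor
# that connects object code to the relational "customer" table.
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()


class Customer(Base):
    __tablename__ = "customer"          # relational table behind the object
    id = Column(Integer, primary_key=True)
    name = Column(String)


engine = create_engine("sqlite:///:memory:")    # single application, single data source
Base.metadata.create_all(engine)

Session = sessionmaker(bind=engine)
session = Session()
session.add(Customer(id=1, name="Acme Ltd"))    # the object is persisted as a row
session.commit()

print(session.query(Customer).filter_by(id=1).one().name)   # 'Acme Ltd'
```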
Enterprise Application Integration with Data Delivery Platform
Restrictions & Limitations
The term Business Intelligence (see Definitions below) is restricted to mean the practices and technologies (applications, tools, infrastructure) by which raw transactional data is analysed to produce meaningful information that is delivered to the business (for business purposes), for strategic and tactical insights, by means of a report or query, and other concepts, such as data mining and data warehousing analytics.
The scope of this article, in discussing the concept of a Data Delivery Platform, restricts itself to the meaning of Business Intelligence as inclusive of all concepts other than data mining and data warehousing. In summary, Business Intelligence is taken to be the facilitation of raw (transactional) data, via tools and technologies, using best practices, as information accessible to the business on which to do analytics, by way of reports and queries, in order to make decisions in support of strategy, tactics, operations and business insights.
Various definitions exist, but it suffices to quote a few authoritative sources:
- Gartner: Business intelligence (BI) is an umbrella term that includes the applications, infrastructure and tools, and best practices that enable access to and analysis of information to improve and optimize decisions and performance. (Gartner, 2013).
- Forrester (broad definition): Business intelligence is a set of methodologies, processes, architectures, and technologies that transform raw data into meaningful and useful information used to enable more effective strategic, tactical, and operational insights and decision-making. (Forrester, 2010).
- Forrester (narrow definition): A set of methodologies, processes, architectures, and technologies that leverage the output of information management processes for analysis, reporting, performance management, and information delivery. (Forrester, 2010).
- Olivia Parr Rud: Business intelligence (BI) is a set of theories, methodologies, processes, architectures, and technologies that transform raw data into meaningful and useful information for business purposes. BI can handle large amounts of information to help show and develop new opportunities. Making use of new opportunities and implementing an effective strategy can give a competitive market advantage and long-term stability. (Rud, 2009).
- CIO: Business intelligence, or BI, is an umbrella term that refers to a variety of software applications used to analyze an organization’s raw data. BI as a discipline is made up of several related activities, including data mining, online analytical processing, querying and reporting. (Mulcahy, 2010).
Forrester adds: “When using this definition, BI also has to include technologies such as data integration, data quality, data warehousing, master data management, text, and content analytics, and many others that the market sometimes lumps into the information management segment. . . We also refer to data preparation and data usage as two separate, but closely linked, segments of the BI architectural stack.” (Forrester, 2010).
Encapsulation is a form of information hiding, and consists of separating the external aspects of an object, which are accessible to other objects, from the internal implementation details of the object, which are hidden from other objects.
In abstraction, non-relevant tables, columns, and rows are hidden; this is synonymous with Codd’s concept of logical data independence. In SQL databases it is typically achieved via views.
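A minimal sketch of this (Python with the built-in SQLite module, hypothetical table and column names) shows a view exposing only the relevant columns, so consumers never see, and never depend on, the hidden attributes.

```python
# Minimal abstraction-via-view sketch (hypothetical names): the view hides sensitive
# and irrelevant columns, giving consumers logical data independence.
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE employee (id INTEGER, name TEXT, salary REAL, tax_no TEXT)")
db.execute("INSERT INTO employee VALUES (1, 'N. Naidoo', 50000, 'SECRET')")

# Consumers query the view; the salary and tax_no columns are never exposed.
db.execute("CREATE VIEW employee_directory AS SELECT id, name FROM employee")

print(db.execute("SELECT * FROM employee_directory").fetchall())   # [(1, 'N. Naidoo')]
```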
Webopedia: Data virtualization is a methodology that takes a layered approach to dealing with huge volumes of data from disparate sources. The phrase virtualization is used because data virtualization is the aggregation of the data from disparate sources, including databases, systems and storage, to create a single virtual view from within a front-end system, such as applications or dashboards. Data virtualization is commonly associated with business or enterprise applications including sales force automation, customer relationship management, enterprise resource planning, and business intelligence. (Webopedia, 2013).
Rick L. van der Lans: Data virtualization is the technology that offers data consumers a unified, abstracted, and encapsulated view for querying and manipulating data stored in a heterogeneous set of data stores. (van der Lans R. L., 2012).
Techopedia: Data virtualization is the process of aggregating data from different sources of information to develop a single, logical and virtual view of information so that it can be accessed by front-end solutions such as applications, dashboards and portals without having to know the data’s exact storage location. (Janssen, 2013).
A virtual database is also called a federated database: a mechanism to query several databases as if they were a single entity. It should not be confused with sharding, which partitions a single logical database across multiple stores. Federated databases are a subset of Data Virtualisation.
Java World (Kang, 2002): Enterprise Application Integration combines separate applications into a cooperating federation of applications. Two logical integration architectures for integrating applications exist: direct point-to-point and middleware-based integration.
PC Mag Encyclopaedia (PC Mag, 2013):
- Translating data and commands from the format of one application into the format of another. It is essentially data and command conversion on an ongoing basis between two or more incompatible systems. Implementing application integration has traditionally been done by tedious programming, or occasionally one package might support the interfaces of one or two other packages. However, the trend today is to use message brokers, application servers and other specialized integration products that offer a common connecting point. Since the advent of the Web, these pre-packaged “middleware” solutions have become widely used to Web enable the enterprise.
- Redesigning disparate information systems into one system that uses a common set of data structures and rules.
Investopedia (Definition of ‘Enterprise Application Integration’, 2013): The translation of data and other commands from one application format into another. Enterprise application integration is an ongoing process between two incompatible systems. This can allow for differing financial applications to interface effectively and process data or transactions.
SearchSOA (EAI (enterprise application integration), 2013): EAI (enterprise application integration) is a business computing term for the plans, methods, and tools for modernizing, consolidating, and coordinating the computer applications in an enterprise.
Typically, an enterprise has existing legacy applications and databases and wants to continue to use them while adding or migrating to a new set of applications that exploit the Internet, e-commerce, extranet, and other new technologies.
EAI may involve developing a new total view of an enterprise’s business and its applications, seeing how existing applications fit into the new view, and then devising ways to efficiently reuse what already exists while adding new applications and data.
Enterprise Service Bus
SearchSOA (enterprise service bus (ESB), 2013): An enterprise service bus (ESB) is a software architecture for middleware that provides fundamental services for more complex architectures. For example, an ESB incorporates the features required to add a service-oriented architecture (SOA). In a general sense, an ESB can be thought of as a mechanism that manages access to applications and services (especially legacy versions) to present a single, simple, and consistent interface to end-users via Web- or forms-based client-side front ends.
Techopedia (Enterprise Service Bus (ESB), 2013): An enterprise service bus (ESB) is an integrated platform that provides fundamental interaction and communication services for complex software applications via an event-driven and standards-based messaging engine, or bus, built with middleware infrastructure product technologies. The ESB platform is geared toward isolating the link between a service and transport channel and is used to fulfil service-oriented architecture (SOA) requirements.
Gartner (Gartner, 2013): The discipline of data integration comprises the practices, architectural techniques and tools for achieving the consistent access and delivery of data across the spectrum of data subject areas and data structure types in the enterprise to meet the data consumption requirements of all applications and business processes.
Techopedia (Data Integration, 2013); Data integration is a process in which heterogeneous data is retrieved and combined as an incorporated form and structure. Data integration allows different data types (such as data sets, documents and tables) to be merged by users, organizations and applications, for use as personal or business processes and/or functions.
Object Relational Mapping (ORM)
Digplanet (Object-relational mapping, 2013): Object-relational mapping (ORM, O/RM, and O/R mapping) in computer software is a programming technique for converting data between incompatible type systems in object-oriented programming languages. This creates, in effect, a “virtual object database” that can be used from within the programming language. There are both free and commercial packages available that do object-relational mapping, although some programmers opt to create their own ORM tools.
Techopedia (Object-Relational Mapping (ORM), 2013): Object-relational mapping (ORM) is a programming technique in which a metadata descriptor is used to connect object code to a relational database. Object code is written in object-oriented programming (OOP) languages such as Java or C#. ORM converts data between type systems that are unable to coexist within relational databases and OOP languages.
Douglas K Barry (Barry, Douglas K.;, 2013): Object-relational mapping (OR mapping) products integrate object programming language capabilities with relational databases managed by Oracle, DB2, Sybase, and other RDBMSs. Database objects appear as programming language objects in one or more existing object programming languages. Often, the interface for object-relational mapping products is the same as the interface for object databases.
Data Integration. (2013). Retrieved April 22, 2013, from Techopedia: http://www.techopedia.com/definition/28290/data-integration
Definition of ‘Enterprise Application Integration’. (2013). Retrieved April 22, 2013, from Investopedia: http://www.investopedia.com/terms/e/enterprise-application-integration.asp
EAI (enterprise application integration). (2013). Retrieved April 22, 2013, from searchsoa.techtarget.com: http://searchsoa.techtarget.com/definition/EAI
enterprise service bus (ESB). (2013). Retrieved April 22, 2013, from Search SOA: http://searchsoa.techtarget.com/definition/enterprise-service-bus
Enterprise Service Bus (ESB). (2013). Retrieved April 22, 2013, from Techopedia: http://www.techopedia.com/definition/5229/enterprise-service-bus-esb
Object-relational mapping. (2013). Retrieved April 22, 2013, from Digplanet: http://www.digplanet.com/wiki/Object-relational_mapping
Object-Relational Mapping (ORM). (2013). Retrieved April 22, 2013, from Techopedia: http://www.techopedia.com/definition/24200/object-relational-mapping–orm
PC Mag. (2013). Retrieved April 22, 2013, from Definition of:application integration: http://www.pcmag.com/encyclopedia/term/37910/application-integration
W. Edwards Deming. (2013). Retrieved April 22, 2013, from WikiPedia: http://en.wikipedia.org/wiki/W._Edwards_Deming
Aberdeen Group. (2011, March). Three Steps to Analytic Heaven. Aberdeen Group, 28.
Analyst, M. (2010, 02 15). Modern Analyst. Retrieved 02 15, 2010, from Modern Analyst: http://www.modernanalyst.com/
Barry, Douglas K.;. (2013). Object-relational mapping (OR mapping) definition. Retrieved April 22, 2013, from Service Architecture: http://www.service-architecture.com/object-relational-mapping/articles/object-relational_mapping_or_mapping_definition.html
Brunelli, M. (2013, February). In an information downpour, data virtualization products bloom. Retrieved April 09, 2013, from Search Data Management: http://searchdatamanagement.techtarget.com/feature/In-an-information-downpour-data-virtualization-products-bloom
Capgemini. (2013). Data Virtualization. Retrieved April 04, 2013, from Capgemini.com: http://www.capgemini.com/sites/default/files/resource/pdf/data_virtualization._how_to_get_your_business_intelligence_answers_today.pdf
Davis, J. R., & Eve, R. (2011). Data Virtualization: Going Beyond Traditional Data Integration to Achieve Business Agility (1st ed.). Ashland, OR, USA: Composite Software, 2011.
Deming, E. W. (1982). Out of Crisis. Cambridge, MA, U.S.A: Massachusetts Institute of Technology.
Eve, B. (2011). Achieving Business Agility with Data Virtualization. Composite Software, 2.
Eve, Robert. (2010, April 13). Data Services Platforms – Bringing Order to Chaos. (Composite Software) Retrieved April 15, 2013, from Ebiz.net: http://www.ebizq.net/topics/bi/features/12478.html?page=3
Forrester. (2010). Enterprise Business Intelligence Platforms. Forrester Wave.
Gartner. (2013, April 15). Business Intelligence (BI). Retrieved April 15, 2013, from Gartner: http://www.gartner.com/it-glossary/business-intelligence-bi/
Gartner. (2013). Data Integration. Retrieved April 22, 2013, from Gartner IT Glossary: http://www.gartner.com/it-glossary/data-integration-tools/
Harrington, J. L. (2009). Relational Database Design and Implementation: Clearly Explained (3rd ed.). Burlington, MA, USA: Morgan Kaufmann.
Hayward, M. (2013, February 15). Alleviate load on SANs with Data Virtualization. Retrieved April 15, 2013, from http://dboptimizer.com/2013/02/15/alleviate-load-on-sans-with-data-virtualization/
Janssen, C. (2013). Data Virtualization. Retrieved April 19, 2013, from Techopedia: http://www.techopedia.com/definition/1007/data-virtualization
Kang, A. (2002, August 09). Enterprise application integration using J2EE. Retrieved April 22, 2013, from Java World: http://www.javaworld.com/javaworld/jw-08-2002/jw-0809-eai.html?page=1
Linthicum, D. (2004). Next generation application integration: from simple information to Web services (2nd ed.). Boston, MA, U.S.A: Addison-Wesley Professional.
Loshin, D. (2010, June). Effecting Data Quality Improvement. Knowledge Integrity Incorporated, 11.
Mulcahy, R. (2010). Business Intelligence Definition and Solutions. Retrieved April 15, 2013, from CIO: http://www.cio.com/article/40296/Business_Intelligence_Definition_and_Solutions
Rud, O. P. (2009). Business Intelligence Success Factors. Hoboken, NJ, USA: John Wiley & Sons, Inc.
van der Lans, R. L. (2012). Data Virtualization for Business Intelligence Systems (1st ed.). (A. Dierna, & R. Day, Eds.) Waltham, MA, USA: Morgan Kaufmann Publishers as an import of Elsevier.
Webopedia. (2013). Data virtualization. Retrieved 2013, from Webopedia: http://www.webopedia.com/TERM/D/data_virtualization.html
White, D. (2011, April 01). Agile BI: Three Steps to Analytical Heaven. Retrieved April 11, 2013, from Mxisoft.com: http://www.mxisoft.com/Portals/53068/docs/3%20Steps%20to%20Analytic%20Heaven.pdf
Yuhanna, N., & Gilpen, M. (2012, January 05). The Forrester Wave: Data Virtualization Q1 2012. Retrieved April 11, 2013, from Informatica.com: 1888_forrester-wave-data-virtualization_ar.pdf