Data Warehousing

A data warehouse is open to an almost limitless range of definitions. Simply put, a data warehouse stores and aggregates a company's data.
Data warehouses are an important asset for organisations seeking to maintain efficiency, profitability and competitive advantage. Organisations collect data through many sources: online, call centres, sales leads, inventory management. The data collected have varying degrees of value and business relevance.

The figure below shows the architecture of a typical data warehouse and illustrates the gathering of data, the storage of data, and the querying and data analysis support.


The steps involved in getting data into a warehouse are called extract, transform and load (ETL) tasks. Extraction refers to getting data from the sources, transformation to converting that data into the form the warehouse expects, and loading to writing the data into the data warehouse.
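The three steps can be sketched in a few lines of Python. This is only a minimal illustration, not a production pipeline: the source rows, the field names (customer, amount, channel) and the in-memory SQLite table called sales are all assumptions made up for the example.

```python
import sqlite3

# Hypothetical raw records from two sources (online and call centre).
RAW_SOURCES = {
    "online": [{"customer": " alice ", "amount": "120.50"}],
    "call_centre": [{"customer": "BOB", "amount": "80"}],
}

def extract(sources):
    """Extract: pull rows from every source, tagging each with its origin."""
    for channel, rows in sources.items():
        for row in rows:
            yield {**row, "channel": channel}

def transform(rows):
    """Transform: tidy up names and convert amounts from text to numbers."""
    for row in rows:
        yield (row["customer"].strip().title(), float(row["amount"]), row["channel"])

def load(conn, rows):
    """Load: write the transformed rows into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales (customer TEXT, amount REAL, channel TEXT)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")  # in-memory stand-in for the warehouse
load(conn, transform(extract(RAW_SOURCES)))
loaded = conn.execute(
    "SELECT customer, amount, channel FROM sales ORDER BY customer"
).fetchall()
print(loaded)
```

Each step is a separate function, so a new source or a new transformation rule can be added without touching the other two steps.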

Characteristics of data warehouse:

  • Multidimensional conceptual view
  • Generic dimensionality
  • Unlimited dimensions and aggregation levels
  • Unrestricted cross dimensional operations
  • Dynamic sparse matrix handling
  • Client/server architecture
  • Multi user support
  • Accessibility
  • Transparency
  • Intuitive data manipulation
  • Consistent reporting performance
  • Flexible reporting

Pre-Data Warehouse:
The pre-data warehouse zone provides the data for data warehousing. Data warehouse designers determine which data contain business value for insertion.
OLTP (online transaction processing) databases are where operational data are stored. OLTP databases can reside in transactional software applications such as enterprise resource management, supply chain, point of sale and customer service software. OLTP databases are designed for transaction speed and accuracy.
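To make "transaction accuracy" concrete, here is a small sketch using Python's built-in sqlite3 module as a stand-in for an OLTP database. The stock and orders tables and the place_order helper are hypothetical names invented for this example. The sketch assumes that an order and its stock update must succeed or fail together.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # in-memory stand-in for an OLTP database
conn.execute(
    "CREATE TABLE stock (item TEXT PRIMARY KEY, qty INTEGER CHECK (qty >= 0))"
)
conn.execute("CREATE TABLE orders (item TEXT, qty INTEGER)")
conn.execute("INSERT INTO stock VALUES ('widget', 5)")
conn.commit()

def place_order(conn, item, qty):
    """Record an order and reduce stock as one all-or-nothing transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute("INSERT INTO orders VALUES (?, ?)", (item, qty))
            conn.execute(
                "UPDATE stock SET qty = qty - ? WHERE item = ?", (qty, item)
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint refused negative stock; the order row is undone too.
        return False

ok = place_order(conn, "widget", 3)         # enough stock, so this commits
rejected = place_order(conn, "widget", 10)  # not enough stock, so this rolls back
stock = conn.execute("SELECT qty FROM stock WHERE item = 'widget'").fetchone()[0]
order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
print(ok, rejected, stock, order_count)
```

The rejected order leaves no trace in either table, which is the kind of accuracy a transactional system is built to guarantee.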
Metadata ensures the sanctity and accuracy of data entering the data lifecycle process. It ensures that the data has the right format and relevancy.

Data Cleansing:
Before data enters the data warehouse, the extraction, transformation and cleaning processes ensure that the data passes the data quality threshold.
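One possible reading of a "data quality threshold" is sketched below. The post does not say how the threshold is measured, so the rule used here is an assumption: a batch is accepted only if a minimum share of its records is valid. The field names and the is_valid and cleanse helpers are also invented for illustration.

```python
def is_valid(record):
    """A record is valid if it has a non-empty customer and a non-negative amount."""
    try:
        return bool(record["customer"].strip()) and float(record["amount"]) >= 0
    except (KeyError, ValueError, AttributeError, TypeError):
        return False

def cleanse(records, threshold=0.8):
    """Keep the valid records; refuse the whole batch if too few are valid.

    The threshold is an assumed rule: the minimum share of valid records.
    """
    records = list(records)
    clean = [r for r in records if is_valid(r)]
    quality = len(clean) / len(records) if records else 0.0
    if quality < threshold:
        raise ValueError(
            f"batch quality {quality:.0%} is below threshold {threshold:.0%}"
        )
    return clean, quality

# Hypothetical incoming batch.
batch = [
    {"customer": "Alice", "amount": "120.50"},
    {"customer": "Bob", "amount": "80"},
    {"customer": "", "amount": "15"},        # missing name: rejected
    {"customer": "Carol", "amount": "n/a"},  # unparsable amount: rejected
    {"customer": "Dave", "amount": "42"},
]
clean, quality = cleanse(batch, threshold=0.5)
print(len(clean), quality)
```

With the stricter default threshold of 0.8 the same batch would be refused outright, since only three of its five records are valid.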

Data Repository:
The data warehouse repository is the database that stores active data of business value for an organisation. The data warehouse modelling design is optimised for data analysis.
There are variants of data warehouses: data marts and ODS. A data warehouse collects data and is the repository for historical data, so it is not always efficient for providing up-to-date analysis. This is where operational data stores (ODS) come in.
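The difference between the historical warehouse and an ODS can be shown with a toy example. The dates, amounts and the assumption that the warehouse is loaded in periodic batches while the ODS receives new transactions straight away are all hypothetical. They are chosen only to show why a warehouse-only query can lag behind.

```python
from datetime import date

# Hypothetical data: the warehouse was last loaded on 2 January,
# while the ODS holds the transactions that have arrived since then.
warehouse_sales = [
    {"day": date(2024, 1, 1), "amount": 100.0},
    {"day": date(2024, 1, 2), "amount": 150.0},
]
ods_sales = [
    {"day": date(2024, 1, 3), "amount": 70.0},
]

def total_sales(rows, since):
    """Sum the amounts of all rows dated on or after `since`."""
    return sum(r["amount"] for r in rows if r["day"] >= since)

start = date(2024, 1, 1)
historical_total = total_sales(warehouse_sales, start)            # stale after the last load
current_total = historical_total + total_sales(ods_sales, start)  # includes recent activity
print(historical_total, current_total)
```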

Front End Analysis:
The last and most critical portion of the data warehouse overview is the front-end applications that business users use to interact with data stored in the repositories.

Typical functionality of a Data Warehouse: 
Data warehouses exist to facilitate complex, data-intensive and frequent ad hoc queries. Accordingly, data warehouses must provide far greater and more efficient query support than is demanded of transactional databases. In particular, enhanced spreadsheet functionality includes support for state-of-the-art spreadsheet applications (for example, MS Excel) as well as for OLAP application programs. These offer pre-programmed functionalities such as the following:
  • Roll up: data is summarised with increasing generalisation (for example, weekly to quarterly to annually)
  • Drill down: increasing levels of detail are revealed (the complement of roll-up)
  • Pivot: cross tabulation (also referred to as rotation) is performed
  • Slice and dice: projection operations are performed on the dimensions
  • Sorting: data is sorted by ordinal value
  • Selection: data is available by value or range
  • Derived attributes: attributes are computed by operations on stored and derived values
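Several of the operations listed above can be sketched in plain Python on a tiny, made-up fact table. The rows, the dimension names and the helper functions (aggregate, slice_, pivot) are assumptions for illustration and do not reflect how any particular OLAP product works. Note that slice_ uses the common reading of slicing, which fixes the value of a dimension, while projecting onto a subset of dimensions is what aggregate does.

```python
from collections import defaultdict

# Hypothetical fact rows: (year, quarter, region, product, sales)
FACTS = [
    (2023, "Q1", "North", "Tea", 10),
    (2023, "Q1", "South", "Tea", 20),
    (2023, "Q2", "North", "Coffee", 30),
    (2023, "Q2", "South", "Coffee", 40),
    (2024, "Q1", "North", "Tea", 50),
]
DIMS = {"year": 0, "quarter": 1, "region": 2, "product": 3}

def aggregate(facts, by):
    """Sum sales grouped by the named dimensions."""
    totals = defaultdict(int)
    for row in facts:
        key = tuple(row[DIMS[d]] for d in by)
        totals[key] += row[4]
    return dict(totals)

def slice_(facts, **fixed):
    """Keep only the rows that match the given dimension values."""
    return [r for r in facts if all(r[DIMS[d]] == v for d, v in fixed.items())]

def pivot(facts, rows, cols):
    """Cross-tabulate sales with one dimension as rows and another as columns."""
    table = defaultdict(dict)
    for (r, c), total in aggregate(facts, [rows, cols]).items():
        table[r][c] = total
    return dict(table)

by_quarter = aggregate(FACTS, ["year", "quarter"])  # detailed (drill-down) level
by_year = aggregate(FACTS, ["year"])                # roll-up: quarters to years
north_only = slice_(FACTS, region="North")          # slice on one dimension
cross_tab = pivot(FACTS, "region", "product")       # pivot (rotation)
ranked = sorted(by_year.items(), key=lambda kv: kv[1], reverse=True)  # sorting
print(by_year, cross_tab, ranked)
```

Rolling up and drilling down use the same aggregate function and differ only in how many dimensions they group by, which is why the two are described as complements.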

Data warehouse problems:

  • Data that is required is not collected or not accessible
  • Not enough time was spent prototyping or understanding the real business needs in depth
  • Initial database scope was too broad, trying to contain too much information too soon
  • Besides the initial project approval, senior management did not provide much direction in terms of priorities, resulting in a disconnection between the data needed and the data gathered 


