Data Warehousing

The term data warehouse is open to an almost limitless range of definitions. Simply put, a data warehouse stores and aggregates a company's data.
Data warehouses are an important asset for organisations seeking to maintain efficiency, profitability and competitive advantage. Organisations collect data through many sources: online, call centres, sales leads, inventory management. The data collected have varying degrees of value and business relevance.

The figure below shows the architecture of a typical data warehouse and illustrates the gathering of data, the storage of data, and the querying and data analysis support.


The steps involved in getting data into a warehouse are called extract, transform and load, or ETL, tasks. Extraction refers to getting data from the sources, while loading refers to loading the data into the data warehouse.
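The three steps can be sketched in a few lines of Python. This is only a minimal illustration, not a production pipeline: the source records, the `sales` table and the function names are all assumptions made up for the example, and an in-memory SQLite database stands in for the warehouse.

```python
import sqlite3

# Hypothetical source records, standing in for rows pulled from operational systems.
SOURCE_ROWS = [
    {"order_id": "1", "channel": " Online ", "amount": "19.99"},
    {"order_id": "2", "channel": "call centre", "amount": "5.00"},
]


def extract(rows):
    """Extract: get the raw records from the source."""
    return list(rows)


def transform(rows):
    """Transform: convert types and normalise text into the warehouse's format."""
    return [
        (int(r["order_id"]), r["channel"].strip().lower(), float(r["amount"]))
        for r in rows
    ]


def load(conn, rows):
    """Load: write the transformed rows into the (illustrative) warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales "
        "(order_id INTEGER, channel TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()


def run_etl(conn, source=SOURCE_ROWS):
    """Run the three steps in order: extract, then transform, then load."""
    load(conn, transform(extract(source)))


# An in-memory SQLite database plays the role of the warehouse here.
warehouse = sqlite3.connect(":memory:")
run_etl(warehouse)
```

Keeping each step in its own function makes it easy to swap the source or the target without touching the transformation rules.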

Characteristics of a data warehouse:

  • Multidimensional conceptual view
  • Generic dimensionality
  • Unlimited dimensions and aggregation levels
  • Unrestricted cross dimensional operations
  • Dynamic sparse matrix handling
  • Client/server architecture
  • Multi user support
  • Accessibility
  • Transparency
  • Intuitive data manipulation
  • Consistent reporting performance
  • Flexible reporting

Pre-Data Warehouse:
The pre-data warehouse zone provides the data for data warehousing. Data warehouse designers determine which data contains business value for insertion.
OLTP databases are where operational data are stored. OLTP databases can reside in transactional software applications such as enterprise resource management, supply chain, point of sale and customer service software. OLTP databases are designed for transaction speed and accuracy.
Metadata ensures the sanctity and accuracy of data entering the data lifecycle process. Metadata ensures that data has the right format and relevancy.

Data Cleansing:
Before data enters the data warehouse, the extraction, transformation and cleaning process ensures that the data passes the data quality threshold.
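One way to picture a data quality threshold is as a pass rate that a batch must reach before it is loaded. The sketch below is an assumption-laden illustration: the rules in `is_clean`, the field names and the 80% threshold are invented for the example and are not taken from any particular tool.

```python
def is_clean(record):
    """Return True when a record passes some illustrative quality rules."""
    return (
        record.get("customer_id") is not None
        and isinstance(record.get("amount"), (int, float))
        and record["amount"] >= 0
    )


def cleanse(records, threshold=0.8):
    """Keep the clean records; reject the whole batch if too few records pass."""
    clean = [r for r in records if is_clean(r)]
    quality = len(clean) / len(records) if records else 0.0
    if quality < threshold:
        raise ValueError(
            f"batch quality {quality:.0%} is below threshold {threshold:.0%}"
        )
    return clean, quality


# Hypothetical batch: one record is missing its customer_id.
batch = [
    {"customer_id": 1, "amount": 10.0},
    {"customer_id": 2, "amount": 4.5},
    {"customer_id": None, "amount": 3.0},
    {"customer_id": 4, "amount": 7.25},
    {"customer_id": 5, "amount": 1.0},
]
clean_rows, score = cleanse(batch)
```

Here four of the five records pass, so the batch just meets the threshold and the four clean rows go forward. A batch with a lower pass rate raises an error instead of being loaded.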

Data Repository:
The data warehouse repository is the database that stores active data of business value for an organisation. The data warehouse modelling design is optimised for data analysis.
There are variants of data warehouses: data marts and ODS. A data warehouse collects data and is the repository for historical data, so it is not always efficient for providing up-to-date analysis. This is where operational data stores (ODS) come in.

Front End Analysis:
The last and most critical portion of the data warehouse overview is the front-end applications that business users use to interact with data stored in the repositories.

Typical functionality of a Data Warehouse: 
Data warehouses exist to facilitate complex, data-intensive and frequent ad hoc queries. Accordingly, data warehouses must provide far greater and more efficient query support than is demanded of transactional databases. In particular, enhanced spreadsheet functionality includes support for state-of-the-art spreadsheet applications (for example, MS Excel) as well as for OLAP application programs. These offer pre-programmed functionalities such as the following:
  • Roll up: data is summarised with increasing generalisation (for example, weekly to quarterly to annually)
  • Drill down: increasing levels of detail are revealed (the complement of roll up)
  • Pivot: cross tabulation (also referred to as rotation) is performed
  • Slice and dice: projection operations are performed on the dimensions
  • Sorting: data is sorted by ordinal value
  • Selection: data is available by value or range
  • Derived attributes: attributes are computed by operations on stored and derived values
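Several of these operations can be illustrated with plain Python on a tiny set of sales facts. This is a rough sketch only: the fact rows, column names and helper functions are hypothetical, and a real OLAP tool would run such operations over far larger, pre-aggregated data.

```python
from collections import defaultdict

# Hypothetical fact rows: (year, quarter, region, product, sales)
FACTS = [
    (2023, "Q1", "North", "Widget", 100),
    (2023, "Q1", "South", "Widget", 80),
    (2023, "Q2", "North", "Gadget", 50),
    (2023, "Q2", "South", "Widget", 70),
    (2024, "Q1", "North", "Gadget", 120),
]
COLUMNS = ("year", "quarter", "region", "product", "sales")
ROWS = [dict(zip(COLUMNS, fact)) for fact in FACTS]


def aggregate(rows, dims):
    """Sum sales grouped by the given dimensions."""
    totals = defaultdict(int)
    for r in rows:
        totals[tuple(r[d] for d in dims)] += r["sales"]
    return dict(totals)


def slice_rows(rows, **fixed):
    """Slice: keep rows where each named dimension equals the given value."""
    return [r for r in rows if all(r[k] == v for k, v in fixed.items())]


def pivot(rows, row_dim, col_dim):
    """Pivot: cross-tabulate sales with row_dim down the side, col_dim across."""
    table = defaultdict(lambda: defaultdict(int))
    for r in rows:
        table[r[row_dim]][r[col_dim]] += r["sales"]
    return {key: dict(cols) for key, cols in table.items()}


by_quarter = aggregate(ROWS, ["year", "quarter"])  # detailed (drill-down) level
by_year = aggregate(ROWS, ["year"])                # roll up from quarters to years
north_only = slice_rows(ROWS, region="North")      # slice on one dimension
region_by_product = pivot(ROWS, "region", "product")
# Sorting: order the yearly totals from highest to lowest.
ranked = sorted(by_year.items(), key=lambda kv: kv[1], reverse=True)
```

Roll up and drill down are the same aggregation run with fewer or more dimensions, which is why a single `aggregate` helper covers both in this sketch.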

Data warehouse problems:

  • Data that is required is not collected or not accessible
  • Not enough time was spent prototyping or understanding the real business needs in depth
  • Initial database scope was too broad, trying to contain too much information too soon
  • Besides the initial project approval, senior management did not provide much direction in terms of priorities, resulting in a disconnection between the data needed and the data gathered 

