Data Mining

Data mining is the discovery of useful patterns in data. It is used for predictive analysis and classification, for example estimating the likelihood that a customer will migrate to a competitor.

OLAP (online analytical processing) is used to analyse historical data and to slice out the business information required. OLAP is often used by marketing managers. A slice of data that is useful to a marketing manager might answer a question such as: how many customers between the ages of 24 and 25 who live in New York state buy over $2,000 worth of groceries a month?
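The grocery question above amounts to filtering records along several dimensions (age, state, spend) and counting what is left. Here is a minimal sketch in plain Python; the customer records, field names, and the `count_slice` helper are made up for illustration and do not come from any particular OLAP product.

```python
# Hypothetical customer records; the field names and values are invented.
customers = [
    {"name": "A", "age": 24, "state": "NY", "monthly_groceries": 2500},
    {"name": "B", "age": 25, "state": "NY", "monthly_groceries": 1800},
    {"name": "C", "age": 24, "state": "CA", "monthly_groceries": 3000},
    {"name": "D", "age": 25, "state": "NY", "monthly_groceries": 2100},
]


def count_slice(rows, min_age, max_age, state, min_spend):
    """Count rows inside the age range, in the given state, spending over min_spend."""
    return sum(
        1
        for r in rows
        if min_age <= r["age"] <= max_age
        and r["state"] == state
        and r["monthly_groceries"] > min_spend
    )


matching = count_slice(customers, 24, 25, "NY", 2000)
print(matching)
```

A real OLAP tool would typically answer this from pre-aggregated data rather than scanning every record, but the shape of the question is the same.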

Reporting tools are used to provide reports on the data. These reports are displayed to show relevance to the business and to keep track of key performance indicators.

Data visualization tools are used to display data from the data repository. Data visualization is often combined with data mining and OLAP tools. Data visualization can allow the user to manipulate the data to show relevancy and patterns.

Clustering:

Intuitively, clustering is the problem of finding clusters of points in the given data. The problem of clustering can be formalized from distance metrics in several ways. One way is to phrase it as the problem of grouping points into k sets (for a given k) so that the average distance of points from the centroid of their assigned cluster is minimized. Another way is to group points so that the average distance between every pair of points in each cluster is minimized.
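As an illustration of the first formulation, here is a minimal sketch of the widely used k-means heuristic (Lloyd's algorithm). Strictly speaking, k-means targets the closely related sum of squared distances to the centroid, and it only finds a local optimum that depends on the starting centroids. The sample points, the initial centroids, and all function names are assumptions chosen for the example.

```python
import math


def centroid(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def assign(points, centroids):
    """Put each point into the cluster of its nearest centroid."""
    clusters = [[] for _ in centroids]
    for p in points:
        nearest = min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
        clusters[nearest].append(p)
    return clusters


def k_means(points, initial_centroids, max_iters=100):
    """Alternate assignment and centroid update until the centroids stop moving."""
    centroids = list(initial_centroids)
    for _ in range(max_iters):
        clusters = assign(points, centroids)
        # An empty cluster keeps its previous centroid.
        new_centroids = [centroid(c) if c else centroids[j] for j, c in enumerate(clusters)]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, assign(points, centroids)


def average_distance_to_centroid(clusters, centroids):
    """The quantity described in the text: mean distance of points to their centroid."""
    total = sum(math.dist(p, centroids[j]) for j, c in enumerate(clusters) for p in c)
    count = sum(len(c) for c in clusters)
    return total / count


# Made-up 2D points forming two visibly separate groups.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
centroids, clusters = k_means(points, initial_centroids=[(0.0, 0.0), (10.0, 10.0)])
print(clusters)
print(average_distance_to_centroid(clusters, centroids))
```

With these points, the two groups end up in separate clusters of three points each.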

Another type of clustering appears in classification systems in biology. For instance, leopards and humans are clustered under the class Mammalia, while crocodiles and snakes are clustered under Reptilia. Mammalia has further clusters, such as Carnivora and Primates. We thus have hierarchical clustering. Given the characteristics of different species, biologists have created a complex hierarchical clustering scheme that groups related species together at different levels of the hierarchy.
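One simple way to picture a hierarchical clustering is as a nested structure in which each cluster contains either sub-clusters or members. The sketch below encodes only the groupings named above; the dictionary layout and the `cluster_path` helper are assumptions made for illustration, not a standard representation.

```python
# Nested clusters: a dict holds named sub-clusters, a list holds members.
hierarchy = {
    "mammalia": {
        "carnivora": ["leopard"],
        "primates": ["human"],
    },
    "reptilia": ["crocodile", "snake"],
}


def cluster_path(tree, species, path=()):
    """Return the chain of clusters from the top level down to the species, or None."""
    if isinstance(tree, list):
        return list(path) if species in tree else None
    for name, subtree in tree.items():
        found = cluster_path(subtree, species, path + (name,))
        if found is not None:
            return found
    return None


print(cluster_path(hierarchy, "leopard"))
print(cluster_path(hierarchy, "snake"))
```

Two species are related at a given level when their paths share a prefix: the leopard and the human share only the top-level cluster, while the crocodile and the snake share their whole path.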

The statistics community has studied clustering extensively. Database research has provided scalable clustering algorithms that can cluster very large datasets. The BIRCH clustering algorithm is one such algorithm. Intuitively, data points are inserted into a multidimensional tree structure and guided to appropriate leaf nodes on the basis of nearness to representative points in the internal nodes of the tree. Nearby points are thus clustered together.
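The sketch below illustrates only the incremental idea behind this kind of algorithm, not BIRCH itself: it keeps a flat list of cluster summaries instead of a tree, and each point either joins the nearest summary or starts a new one. The `ClusterSummary` class, the `insert_point` helper, the distance threshold, and the sample points are all assumptions made for the example.

```python
import math


class ClusterSummary:
    """Running summary of one cluster: point count and per-dimension sum."""

    def __init__(self, point):
        self.count = 1
        self.linear_sum = list(point)

    def centroid(self):
        return tuple(s / self.count for s in self.linear_sum)

    def add(self, point):
        self.count += 1
        self.linear_sum = [s + x for s, x in zip(self.linear_sum, point)]


def insert_point(summaries, point, threshold):
    """Add the point to the nearest summary if it is close enough, else start a new one."""
    if summaries:
        nearest = min(summaries, key=lambda s: math.dist(point, s.centroid()))
        if math.dist(point, nearest.centroid()) <= threshold:
            nearest.add(point)
            return
    summaries.append(ClusterSummary(point))


# Made-up points, processed one at a time in a single pass.
summaries = []
for p in [(1.0, 1.0), (1.5, 1.0), (9.0, 9.0), (1.0, 1.5), (9.5, 9.0)]:
    insert_point(summaries, p, threshold=2.0)

print([(s.count, s.centroid()) for s in summaries])
```

Because each summary stores only a count and a sum, the individual points do not need to stay in memory, which is part of what makes this style of clustering suitable for very large datasets.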

An interesting application of clustering is to predict what new movies a person is likely to be interested in, on the basis of:

1. The person's past preferences in movies.

2. Other people with similar past preferences.

3. The preferences of such people for new movies.

To find people with similar past preferences, we create clusters of people based on their preferences for movies. The accuracy of clustering can be improved by first clustering movies by their similarity, so that even if people have not seen the same movies, they would be clustered together if they have seen similar movies. We can repeat the clustering, alternately clustering people, then movies, then people, and so on until we reach an equilibrium.
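Here is a rough sketch of a single pass of this idea, assuming the movies have already been clustered. People are grouped by the movie cluster they liked most, and new movies liked by others in the same group become suggestions. All names, data, and the "dominant cluster" grouping rule are hypothetical simplifications; the alternating re-clustering described above is not implemented.

```python
from collections import Counter, defaultdict

# Hypothetical result of clustering older movies by similarity.
movie_cluster = {"M1": "action", "M2": "action", "M3": "drama", "M4": "drama"}

# Hypothetical past preferences, and preferences for new movies (N1, N2).
past_likes = {
    "ann": {"M1"},
    "bob": {"M2"},
    "cat": {"M3", "M4"},
    "dan": {"M1", "M2"},
}
new_movie_likes = {"bob": {"N1"}, "cat": {"N2"}}


def dominant_cluster(liked):
    """Movie cluster that appears most often among a person's liked movies."""
    counts = Counter(movie_cluster[m] for m in liked)
    return counts.most_common(1)[0][0]


def group_people(likes):
    """Cluster people by their dominant movie cluster."""
    groups = defaultdict(set)
    for person, liked in likes.items():
        groups[dominant_cluster(liked)].add(person)
    return groups


def recommend(person):
    """Suggest new movies liked by other people in the same group."""
    groups = group_people(past_likes)
    peers = groups[dominant_cluster(past_likes[person])] - {person}
    already_liked = new_movie_likes.get(person, set())
    suggestions = set()
    for peer in peers:
        suggestions |= new_movie_likes.get(peer, set())
    return sorted(suggestions - already_liked)


print(recommend("ann"))
```

In this data, ann and bob have no movie in common, yet they land in the same group because M1 and M2 belong to the same movie cluster, so bob's liking for N1 becomes a suggestion for ann.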
