Friday, July 20, 2007

DATA MINING [Eng]

Data mining (DM), also called Knowledge Discovery in Databases (KDD) or Knowledge Discovery and Data Mining, is the process of automatically searching large volumes of data for patterns, using techniques such as classification, association rule mining, and clustering. It is a complex topic that is rooted in computer science and builds on established computational techniques from statistics, information retrieval, machine learning, and pattern recognition. Data mining has been defined as "the nontrivial extraction of implicit, previously unknown, and potentially useful information from data" [1] and as "the science of extracting useful information from large data sets or databases" [2].

In practice, it involves sorting through large amounts of data and picking out the relevant information. It is mostly used by businesses, intelligence organizations, and financial analysts, but it is increasingly used in the sciences to extract information from the enormous data sets generated by modern experimental and observational methods. Metadata, or data about a given data set, is often expressed in a condensed format that facilitates data mining; common examples include executive summaries and scientific abstracts.

Although "data mining" is a relatively new term, the technology is not. Companies have long used powerful computers to sift through large volumes of data, such as supermarket scanner data, and to produce market research reports. Continuous innovations in computer processing power, disk storage, and statistical software are dramatically increasing the accuracy and usefulness of that analysis. Data mining identifies trends in data that go beyond simple analysis: through the use of sophisticated algorithms, users can identify key attributes of business processes and target opportunities.

The term data mining is often applied to two separate processes: knowledge discovery and prediction.
Knowledge discovery provides explicit information in a readable form that a user can understand. Forecasting, or predictive modeling, provides predictions of future events; it may be transparent and readable in some approaches (e.g. rule-based systems) and opaque in others, such as neural networks. Moreover, some data mining systems, such as neural networks, are inherently geared towards prediction and pattern recognition rather than knowledge discovery.

The term "data mining" is also often misapplied to processes that are not data mining. In many cases, applications claim to perform "data mining" when they merely automate the creation of charts or graphs of historic trends. Although this information may be useful and timesaving, it does not fit the traditional definition of data mining, because the application performs no analysis itself and has no understanding of the underlying data. Instead, it relies on templates or pre-defined macros (created by programmers or users) to identify trends, patterns, and differences.

A key defining factor of true data mining is that the application itself performs some real analysis. In almost all cases this analysis is guided by some degree of user interaction, but it must give the user insights that are not readily apparent through simple slicing and dicing. Applications that are not to some degree self-guiding are performing data analysis, not data mining.
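
To make the idea of knowledge discovery a little more concrete, here is a minimal sketch of association rule mining on supermarket-style basket data, written in Python. Everything in it is an assumption made for illustration: the function name mine_rules, the toy baskets, and the support and confidence thresholds are not taken from any particular product or library. It only considers rules between pairs of items, which is far simpler than what real tools do, but it shows how a program can surface readable rules ("customers who buy X also tend to buy Y") without being told which items to compare.

```python
from collections import Counter
from itertools import combinations


def mine_rules(transactions, min_support=0.4, min_confidence=0.7):
    """Find simple 'lhs -> rhs' rules between pairs of items.

    support    = share of all baskets that contain both items
    confidence = share of baskets containing lhs that also contain rhs
    (Hypothetical helper for illustration; thresholds are arbitrary.)
    """
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()

    for basket in transactions:
        items = sorted(set(basket))          # ignore duplicates in a basket
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue                          # pair is too rare to matter
        for lhs, rhs in ((a, b), (b, a)):     # test the rule in both directions
            confidence = count / item_counts[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))

    # Strongest rules first; ties broken by support, then alphabetically.
    return sorted(rules, key=lambda r: (-r[3], -r[2], r[0], r[1]))


# Made-up scanner data: each list is one customer's basket.
baskets = [
    ["bread", "milk"],
    ["bread", "diapers", "beer", "eggs"],
    ["milk", "diapers", "beer", "cola"],
    ["bread", "milk", "diapers", "beer"],
    ["bread", "milk", "diapers", "cola"],
]

rules = mine_rules(baskets)
for lhs, rhs, support, confidence in rules:
    print(f"{lhs} -> {rhs}  (support {support:.2f}, confidence {confidence:.2f})")
```

On this toy data the strongest rule is beer -> diapers, because every basket that contains beer also contains diapers. The output is a list of explicit, human-readable rules, which is what distinguishes knowledge discovery from an opaque predictive model.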
