Sunday, July 15, 2007

Knowledge Discovery in Databases

Knowledge discovery in databases (KDD) is the nontrivial process of identifying valid, novel, potentially useful, and ultimately understandable patterns in data (Fayyad, Piatetsky-Shapiro, and Smyth 1996). Here, knowledge means relationships and patterns among data elements. It must be new, it must not be obvious before the analysis is performed, and it must be usable. KDD covers the whole process of extracting knowledge from data, whereas the term data mining (DM) is used exclusively for the knowledge discovery stage of that process (Adriaans and Zantinge 1996).

One of the key elements of DM is data warehousing. Destination data warehousing is a prerequisite for DM, since data have to be stored in a database before they can be mined. A data warehouse is a subject-oriented, integrated, nonvolatile, time-variant collection of raw data that can be used to support destination management decision making (Kasavana and Knutson 1999; Inmon 1996). Put more simply, a data warehouse is a single, complete, and consistent store of data obtained from a variety of sources and made available to end users in a way they can understand and use, here in a destination tourism context (Devlin 1997). Data warehouses are organized around customer-oriented subject areas such as spending patterns, motivations, bundled purchases, length of stay, and other visitor experiences, and the data in the warehouse can be readily mined to identify a destination's strengths, weaknesses, and other relevant information and knowledge (Kasavana and Knutson 1999).

Storage and retrieval systems with data analysis and knowledge distribution capabilities must function appropriately to provide knowledge as intended. It is becoming easier to obtain storage and retrieval systems with sufficient capabilities to meet the needs of destination data warehousing. However, the ability to effectively analyze and act on the resulting destination knowledge has not advanced as quickly as the technology itself. Research is needed to structure and prioritize the knowledge required to solve specific end-user problems, to define data collection processes and analysis methods, and to deliver the resulting knowledge to the end users who need it.

DM is the activity in the KDD process that applies a specific algorithm to extract trends, patterns, and correlations; it is also described as a process of discovering implicit knowledge from a data warehouse (Chou and Chou 1999). Broadly speaking, DM is a discovery-oriented data analysis technology that automatically detects important hidden information in the data warehouse. Machine-learning methods rooted in artificial intelligence, such as neural networks, association rules, decision trees, and genetic algorithms, require only limited human involvement and are used to extract patterns or knowledge from data (Peacock 1998a). Successful DM extracts useful relationships, patterns, and trends that help explain the current and historical behavior of tourists and the performance of destinations, and so enhances decision-making processes in destinations (Kasavana and Knutson 1999).
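To make one of the methods named above more concrete, here is a minimal sketch of association-rule mining over bundled purchases. It is an illustration only, not a method taken from the cited sources: the visitor baskets are made-up data, and the function name `pair_rules` and the `min_support` and `min_confidence` thresholds are assumptions chosen for this example. The sketch considers only rules between pairs of items, using the standard definitions of support (the share of baskets containing both items) and confidence (the share of baskets with the first item that also contain the second).

```python
from collections import Counter
from itertools import combinations

# Hypothetical visitor purchase records (illustrative data, not real).
baskets = [
    {"hotel", "museum", "tour"},
    {"hotel", "tour"},
    {"hotel", "museum"},
    {"museum", "tour"},
    {"hotel", "museum", "tour"},
]


def pair_rules(baskets, min_support=0.4, min_confidence=0.7):
    """Return rules (antecedent, consequent, support, confidence) for item pairs."""
    n = len(baskets)
    # How many baskets contain each single item.
    item_counts = Counter(item for basket in baskets for item in basket)
    # How many baskets contain each unordered pair of items.
    pair_counts = Counter(
        frozenset(pair)
        for basket in baskets
        for pair in combinations(sorted(basket), 2)
    )
    rules = []
    for pair, count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue  # the pair is too rare to be of interest
        for antecedent in pair:
            (consequent,) = pair - {antecedent}
            confidence = count / item_counts[antecedent]
            if confidence >= min_confidence:
                rules.append((antecedent, consequent, support, confidence))
    return sorted(rules)


for antecedent, consequent, support, confidence in pair_rules(baskets):
    print(f"{antecedent} -> {consequent}: "
          f"support={support:.2f}, confidence={confidence:.2f}")
```

A destination manager could read a rule such as "hotel -> tour" as a hint about which products to bundle. On a real warehouse the same idea would usually be run with an established algorithm such as Apriori rather than this brute-force pair count.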
