
AUTOMATED MARKET BASKET ANALYSIS SYSTEM

 

CHAPTER ONE

1.0    INTRODUCTION

Association rule mining (ARM) is used to identify associations among items in a large set of data. Because of the large quantity of data stored in databases, many industries are becoming interested in mining association rules from their databases. For example, the detection of interesting association relationships in large quantities of business transaction data can assist in catalog design, cross-marketing, and various business decision-making processes. A typical example of association rule mining is market basket analysis. This method examines customer buying patterns by identifying associations among the various items that customers place in their shopping baskets. The identification of such associations can help retailers develop marketing strategies by giving them insight into which items customers frequently purchase together. This work offers a broad area in which researchers can develop better data mining algorithms. This paper presents a survey of the existing data mining algorithms for market basket analysis. It is organized as follows: Section I contains a brief introduction to ARM; Section II describes market basket analysis, which is an application of ARM; Section III presents the literature survey, in which various data mining algorithms are discussed; Section IV discusses the Apriori algorithm; Section V describes the problems and directions of data mining algorithms; and Section VI summarizes the paper, including the conclusion and future scope.
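As a minimal sketch of what an association between basket items means in practice, the Python example below computes the two standard measures of a rule such as {bread} -> {butter}: support (the fraction of baskets containing all the items) and confidence (the fraction of baskets containing the antecedent that also contain the consequent). The baskets, the item names and the function names are assumptions made for illustration only; they do not come from the system described in this work.

```python
from typing import FrozenSet, List

Basket = FrozenSet[str]

# Hypothetical shopping baskets, invented purely for illustration.
BASKETS: List[Basket] = [
    frozenset({"bread", "butter", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "jam"}),
    frozenset({"milk", "eggs"}),
    frozenset({"bread", "butter", "eggs"}),
]


def count_containing(itemset: Basket, baskets: List[Basket]) -> int:
    """Number of baskets that contain every item in `itemset`."""
    return sum(1 for basket in baskets if itemset <= basket)


def support(itemset: Basket, baskets: List[Basket]) -> float:
    """Fraction of all baskets that contain `itemset`."""
    if not baskets:
        return 0.0
    return count_containing(itemset, baskets) / len(baskets)


def confidence(antecedent: Basket, consequent: Basket,
               baskets: List[Basket]) -> float:
    """Of the baskets containing `antecedent`, the fraction that also
    contain `consequent`. Returns 0.0 if the antecedent never occurs."""
    antecedent_count = count_containing(antecedent, baskets)
    if antecedent_count == 0:
        return 0.0
    both_count = count_containing(antecedent | consequent, baskets)
    return both_count / antecedent_count


if __name__ == "__main__":
    bread = frozenset({"bread"})
    butter = frozenset({"butter"})
    print("support({bread, butter}):", support(bread | butter, BASKETS))
    print("confidence(bread -> butter):", confidence(bread, butter, BASKETS))
```

In this toy data, three of the five baskets contain both bread and butter, and three of the four baskets containing bread also contain butter. A retailer would normally keep only rules whose support and confidence exceed chosen thresholds.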

1.1 BACKGROUND OF STUDY

Data mining is described as the extraction of hidden, useful information from large databases. It is also a technique that encompasses a wide range of applied mathematics and computational techniques such as link analysis, clustering, classification, summarization, regression analysis and so on. Data mining tools predict future trends and behaviors, allowing businesses to make knowledge-driven decisions. The automated, prospective analyses offered by data mining move beyond the analyses of past events. Data mining tools provide answers to business questions that were previously too time consuming to resolve. They search databases for hidden patterns, finding useful information that is beyond the reach of specialists.

Data mining techniques can be implemented rapidly on existing software and hardware platforms to enhance the value of existing information resources, and can be integrated with new products and systems as they are introduced. When implemented on high-performance client/server or multiprocessing computers, data mining tools can analyze huge databases to provide answers to questions such as, "Which goods do consumers tend to buy the most, and which goods go along with them?"

Coenen (2010), in his publication "Data Mining: Past, Present and Future", notes that the history of data mining can be dated as far back as the late 1980s, when the term began to be used, at least within the research community, and differentiates it from SQL.

Broadly, data mining can be defined as a set of mechanisms and techniques, realized in software, to extract hidden information from data. The word "hidden" in this definition is important. By the early 1990s data mining was commonly recognized as a sub-process within a larger process called Knowledge Discovery in Databases (KDD). The most commonly used definition of KDD is that of Fayyad et al.: "the nontrivial process of identifying valid, novel, potentially useful and ultimately understandable patterns in data" (Fayyad et al. 1996).

As such, data mining should be viewed as the sub-process, within the overall KDD process, concerned with the discovery of hidden information. Other sub-processes that form part of the KDD process are data preparation (warehousing, data cleaning, pre-processing, and so on) and the analysis/visualisation of results. For many practical purposes KDD and data mining are seen as synonymous, but technically one is a sub-process of the other. The data that data mining techniques were originally directed at was tabular data and, given the processing power available at the time, computational efficiency was of significant concern. As the amount of processing power generally available increased, processing became less of a concern and was replaced with a desire for accuracy and a desire to mine ever larger data collections. Today, in the context of tabular data, we have a well-established range of data mining techniques available. It is well within the capabilities of many commercial enterprises and researchers to mine tabular data, using software such as Weka, on standard desktop machines. However, the amount of electronic data collected by all kinds of institutions and commercial enterprises continues to grow year on year, and thus there is still a need for effective mechanisms to mine ever larger data sets.

The popularity of data mining increased significantly in the 1990s, notably with the establishment of a number of dedicated conferences: the ACM SIGKDD (Special Interest Group on Knowledge Discovery and Data Mining) annual conference in 1995, the European PKDD (Practice of Knowledge Discovery in Databases) conference and the Pacific/Asia PAKDD (Pacific-Asia Conference on Knowledge Discovery and Data Mining) conference. This increase in popularity can be attributed to advances in technology: the computer processing power and data storage capabilities available meant that the processing of large volumes of data using desktop machines was a realistic possibility. It became commonplace for commercial enterprises to maintain data in computer-readable form. In most cases this was primarily to support commercial activities, and the idea that this data could be mined often came second. The 1990s also saw the introduction of customer loyalty cards that allowed enterprises to record customer purchases; the resulting data could then be mined to identify customer purchasing patterns.

Data mining is the process of searching large volumes of data for patterns using methods such as classification, association rule mining, clustering, etc. It is related to topics such as machine learning and pattern recognition. Data mining techniques are the result of a long process of research and product development.

I am in my final year. I was considered bright, and my family had high hopes for me, but I had a weakness: I hated compiler construction. I have struggled with calculations all my life, though I have been lucky and did well all the same. However, I had to write my final exam. I gathered the compiler construction past questions for each year, compared them, and sorted them. Guess what I discovered: over 35% of the questions were repetitions. I had hit the jackpot. I carefully and thoroughly checked through the answers, and then kept revising only the repeated questions. I have a good grade to show for the data mining I performed.

There is a huge amount of data available in the information industry. This data is of no use until it is converted into useful information, so it is necessary to analyze it and extract useful information from it. Extraction of information is not the only process involved; there are also other processes such as data pre-processing (data cleaning, data integration, data transformation), data mining, pattern evaluation and data presentation. Once all these processes are complete, we are in a position to use this information in many applications such as fraud detection, market analysis, production control, science exploration, etc.
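To make the pre-processing steps named above more concrete, the following Python sketch shows one possible form of data cleaning and data transformation for sales data: raw rows are normalised, empty entries are dropped, and the rows are grouped into one basket per transaction. The row layout, the transaction identifiers and the function names are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

# Hypothetical raw sales rows: (transaction id, item name as typed at the till).
RAW_ROWS = [
    ("T1", " Bread "),
    ("T1", "butter"),
    ("T1", "BREAD"),   # becomes a duplicate of "bread" once normalised
    ("T2", "Milk"),
    ("T2", ""),        # empty item name, dropped during cleaning
    ("T3", "bread"),
]


def clean_item(name: str) -> str:
    """Data cleaning: trim surrounding whitespace and lower-case the name."""
    return name.strip().lower()


def build_baskets(rows: Iterable[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Data transformation: group cleaned items into one set per transaction."""
    baskets: Dict[str, Set[str]] = defaultdict(set)
    for transaction_id, raw_name in rows:
        item = clean_item(raw_name)
        if item:  # skip rows whose item name is empty after cleaning
            baskets[transaction_id].add(item)
    return dict(baskets)


if __name__ == "__main__":
    for transaction_id, items in build_baskets(RAW_ROWS).items():
        print(transaction_id, sorted(items))
```

Using a set per transaction means that an item scanned twice is recorded once, which is usually what basket analysis needs, since it asks whether an item was bought and not how many units.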

 

1.2 STATEMENT OF PROBLEM 

Through in-depth research and observation carried out in supermarkets, we have discovered that retailers want to know which products are purchased with which others, or whether particular products are purchased together as a group of items. This knowledge can help in their decision making with respect to product placement and the timing and extent of promotions, and can give them a better understanding of customer purchasing habits by grouping customers with their transactions.

This project is aimed at designing and implementing a well-structured market basket analysis software tool to solve the problem stated above, and at comparing its results with those of an existing software package called WEKA.
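The question raised in this section, namely which products are purchased together, can be illustrated with a short Python sketch that counts how often each pair of items appears in the same basket. This is a simple brute-force count over hypothetical baskets, not the tool developed in this project and not the algorithm used by WEKA; the data, the function names and the `min_count` threshold are assumptions for illustration.

```python
from collections import Counter
from itertools import combinations
from typing import Iterable, List, Set, Tuple

# Hypothetical shopping baskets, invented purely for illustration.
BASKETS: List[Set[str]] = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "eggs"},
    {"bread", "butter", "eggs"},
]


def count_pairs(baskets: Iterable[Set[str]]) -> Counter:
    """Count, for every unordered pair of items, the number of baskets
    in which both items appear."""
    counts: Counter = Counter()
    for basket in baskets:
        # Sorting gives each pair one canonical form, e.g. ("bread", "butter").
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return counts


def frequent_pairs(baskets: Iterable[Set[str]],
                   min_count: int) -> List[Tuple[Tuple[str, str], int]]:
    """Pairs bought together in at least `min_count` baskets, most common first."""
    return [(pair, n) for pair, n in count_pairs(baskets).most_common()
            if n >= min_count]


if __name__ == "__main__":
    for pair, n in frequent_pairs(BASKETS, min_count=2):
        print(pair, n)
```

Counting every pair in every basket becomes expensive as the number of distinct items grows, which is one reason algorithms such as Apriori prune candidate itemsets before counting them.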

 

1.3 AIM AND OBJECTIVE OF THE STUDY

The aim of the study is to maximize profit for retailers by providing better services to consumers.

The objectives of this study are:

    A:  Cross-Market Analysis - data mining finds associations/correlations between product sales.

    B:  Identifying Customer Requirements - helps in identifying the best products for different customers. It uses prediction to find the factors that may attract new customers.

    C:  Customer Profiling - helps to determine what kind of people buy what kind of products.

 

1.4 SIGNIFICANCE OF THE STUDY

The essence of a market basket analysis system is to deliver the right goods in the right quantity, at the right time and place, to the right customer. The benefits derived from a market basket analysis system are as follows:

    A:  Increased visibility of your customer’s buying behavior

    B:  Reduced order processing costs

    C:  Increased sales and market share

    D:  Quicker execution of pricing and promotion strategies for specific target market segments

 

1.5 SCOPE OF STUDY

This research project provides knowledge about the operation of the customer order and monitoring system in Top Hills shopping mall. It focuses on the following areas:

   A:  Effective processing of data.

   B:  Fast movement of products (delivery).

   C:  Effective flow of information.

   D:  Security of documents.

1.6 LIMITATIONS OF STUDY

During the course of the research work, I encountered some constraints, which restricted me to studying only the above-mentioned areas. Some of the limitations include:

A: Inadequacy of funds to finance the project as a result of economic instability

B: Time constraints, because this research work is being done together with other academic work

C: Inability to get the necessary information from those concerned with the project, and poor information facilities

1.7 DEFINITION OF TERMS

It is necessary to define some of the terms associated with the customer order and monitoring system. They include:

A: SYSTEM: A system is a group of things or parts working together in a particular relation.

B: ORDER: This is the term for a request to supply the goods asked for.

C: CUSTOMER: A customer is a person or organization who buys goods and services from a shop, business, etc.

D: DATA: Data are raw facts and figures about the activities of a business.

E: INVOICE: A list of goods sold with the prices charged.

F: BREWERY: A place where beer is manufactured.

G: ON-LINE PROCESSING: This is the transfer of information on-line over a cable.

H: BEER: A type of alcoholic drink made from malt and flavored with hops.

I: COMPUTERISATION: The process of converting a manually based system to a computer-based system.
