
1. INTRODUCTION
Given a set of evaluative text documents that contain opinions about an object, opinion mining aims to extract the attributes and components of the object that have been commented on in each document and to determine whether the comments are positive, negative or neutral. Before the Web, when an individual needed to make a decision, he/she typically asked friends and family for opinions. When an organization needed to find the opinions of the general public about its products and services, it conducted surveys and focus groups. With the Web, and especially with the explosive growth of user-generated content, the world has changed. One can post reviews of products at merchant sites and express views on almost anything in Internet forums, discussion groups, and blogs, which are collectively called user-generated content. Now if one wants to buy a product, it is no longer necessary to ask one's friends and family, because there are plenty of product reviews on the Web that give the opinions of existing users of the product. A company may no longer need to conduct surveys, organize focus groups, or employ external consultants in order to find consumer opinions or sentiments about its products and those of its competitors.

1.1 Objective
The objective is to support effective decisions on product and sales improvement in the manufacturing business sector, using feedback opinions collected from various customers. An opinion-based, multiple-choice feedback questionnaire helps the manufacturing business sector move in the right direction.

2. SYSTEM ANALYSIS
Systems analysis is the study of sets of interacting entities; this field is closely related to requirements analysis. It covers the analysis of the role of a proposed system and the identification of the requirements that it should meet, and it is the starting point for system design.

2.1 Existing System
Generally, product feedback on the Web comes in three formats:
1. Pros, cons and the detailed review
2. Pros and cons
3. Free format

2.1.1 Drawbacks of existing system
1) Finding opinion sources and monitoring them on the Web can still be a formidable task, because a large number of diverse sources exist and each source contains a huge volume of information.
2) Processing large numbers of reviews one by one requires a lot of time and cost for both businesses and customers.

2.2 Proposed System
A good summarization system can help users get the required and relevant information without going through all the reviews present on a site. Instead of collecting feedback in the formats listed above, the proposed system collects feedback from customers by offering multiple choices for each feature.

2.2.1 Merits of proposed system
Opinions are so important that whenever one needs to make a decision, one wants to hear others' opinions. This is true for both individuals and organizations.
1. Individual consumers: If an individual wants to purchase a product, it is useful to see a summary of the opinions of existing users so that he/she can make an informed decision. This is better than reading a large number of reviews to form a mental picture of the strengths and weaknesses of the product. He/she can also compare the summaries of opinions of competing products, which is even more useful.
2. Organizations and businesses: Opinion mining is equally, if not more, important to businesses and organizations. For example, it is critical for a product manufacturer to know how consumers perceive its products and those of its competitors. This information is useful not only for marketing and product benchmarking but also for product design and product development.

2.3 Feasibility Study


A feasibility study is a high-level capsule version of the entire system analysis and design process. The study begins by clarifying the problem definition. Its purpose is to determine whether the project is worth doing. With a detailed feasibility study, the management will have a clear view of the proposed system with its benefits and drawbacks.

2.3.1 Economic Feasibility
Economic feasibility attempts to weigh the costs of developing and implementing a new system against the benefits that would accrue from having the new system in place. This feasibility study gives the top management the economic justification for the new system. A simple economic analysis that gives an actual comparison of costs and benefits is much more meaningful in this case. In addition, it proves to be a useful point of reference against which to compare actual costs as the project progresses. There can be various types of intangible benefits on account of automation. These include increased customer satisfaction, improvement in product quality, better decision making, timeliness of information, expedited activities, improved accuracy of operations, better documentation and record keeping, faster retrieval of information, and better employee morale. The system will be developed and operated on the existing hardware and software infrastructure, so no additional hardware or software is needed, and the installed system is expected to benefit the organization.

2.3.2 Operational Feasibility
A proposed project is beneficial only if it can be turned into an information system that will meet the organization's operating requirements. Simply stated, this test of feasibility asks whether the system will work when it is developed and installed, and whether there are major barriers to implementation. The proposed system is operationally feasible, as it is developed in such a way that a user without specialist knowledge can use it very easily. It supports effective decisions on product and sales improvement in the manufacturing business sector with the help of feedback opinions collected from various customers. As the entire system is user friendly and is designed around the user's mental model, the user can understand it without much learning.

2.3.3 Technical Feasibility
Evaluating the technical feasibility is the trickiest part of a feasibility study. This is because, at this point in time, not much of the detailed design of the system is available, making it difficult to assess issues like performance and costs (on account of the kind of technology to be deployed). A number of issues have to be considered while doing a technical analysis. The project is developed on an Intel Core i3 processor with 2 GB RAM. The environment required for developing the system is any Windows platform. The language used in the development is VB.NET, which is well supported in the Windows environment, and the entire project can be developed using .NET, so the project is technically feasible.

2.4 Development Environment
2.4.1 Waterfall Model
The waterfall model, sometimes called the classic life cycle, suggests a systematic, sequential approach to software development that begins with customer specification of requirements and progresses through planning, modeling, construction, and deployment, culminating in ongoing support of the completed software. For this project the following requirements are needed.
a) Product features.
b) Questionnaires for all features of a particular product.
c) Feedback from various customers.

[Figure: Communication (project initiation, requirements gathering) -> Planning (estimating, scheduling, tracking) -> Modeling (analysis, design) -> Construction (code, test) -> Deployment (delivery, support, feedback)]

Fig 2.4.1 Waterfall model

2.5 Project Architectural Design


[Figure: Product features customization -> Fetching feedback from customer based on the selected model -> Categorizing the feedback option -> Applying cluster for understanding the opinion -> Decision making based on opinion]

Fig 2.5 Project Architectural Design

Description
a) The customer registers by giving personal details.
b) A registered customer selects a particular product from the available products and can view all features of that product.
c) Feedback is fetched from customers based on the selected model.
d) The feedback options are categorized after the feedback has been fetched from customers.
e) Finally, the customer makes the right decision based on the feedback given by various customers.
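
As a rough illustration of steps c) to e), the following Python sketch tallies multiple-choice answers per question and groups customers whose answer combinations are similar. It is not the project's VB.NET code. The function names, the sample responses, and the threshold of one differing answer are assumptions made for illustration. A simple leader-style grouping stands in for whatever clustering algorithm the project actually applies.

```python
from collections import Counter

def tally_choices(responses):
    """Count how often each choice was picked for each question."""
    n_questions = len(responses[0])
    return [Counter(r[q] for r in responses) for q in range(n_questions)]

def hamming(a, b):
    """Number of questions on which two responses differ."""
    return sum(x != y for x, y in zip(a, b))

def group_similar(responses, max_diff=1):
    """Leader-style grouping: a response joins the first group whose
    leader differs from it on at most max_diff questions; otherwise it
    starts a new group and becomes that group's leader."""
    groups = []  # list of (leader, members) pairs
    for r in responses:
        for leader, members in groups:
            if hamming(leader, r) <= max_diff:
                members.append(r)
                break
        else:
            groups.append((r, [r]))
    return groups

# Hypothetical answers of five customers to four questions (choices 1-4).
responses = [
    (1, 1, 2, 1),
    (1, 1, 2, 2),
    (4, 3, 4, 4),
    (1, 1, 2, 1),
    (4, 3, 3, 4),
]

tallies = tally_choices(responses)
groups = group_similar(responses)
# The largest group hints at the dominant opinion among respondents.
largest_leader, largest_members = max(groups, key=lambda g: len(g[1]))
```

The per-question tallies correspond to categorizing the feedback options, and the largest group gives a first indication of the prevailing opinion on which a decision could be based.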

3. SYSTEM SPECIFICATIONS
3.1 Software Requirements
Technology        : .Net 2008
Tool              : Weka 3-5
Operating System  : Windows XP
Backend           : MS Access 2007
Coding Language   : VB.NET

3.2 Hardware Requirements
CPU Type                  : Intel Core i3
RAM Memory                : 1 GB
Hard disk space required  : 40 GB

4. SOFTWARE DESCRIPTION
4.1 Front End


The Microsoft .NET Framework is a software framework that can be installed on computers running Microsoft Windows operating systems. It includes a large library of coded solutions to common programming problems and a virtual machine that manages the execution of programs written specifically for the framework. The .NET Framework is a Microsoft offering and is intended to be used by most new applications created for the Windows platform. The framework's base class library provides a large range of features including user interface, data access, database connectivity, web application development and network communications. The class library is used by programmers, who combine it with their own code to produce applications.

Using Solution Explorer

Solution Explorer is an area of the integrated development environment (IDE) that contains your solution and helps you manage your project files. The files are displayed in a hierarchical view, much like that of Windows Explorer. By default, Solution Explorer is located on the right side of the IDE. If Solution Explorer is not visible, you can click the View menu and then click Solution Explorer to open it.

When you create a new Windows Forms application by using Visual Basic Express Edition, a Windows Application solution appears in Solution Explorer. The solution contains two nodes: My Project and Form1.vb. The My Project node opens the Project Designer when you double-click it. The Project Designer gives you access to project properties, settings, and resources. The Form1.vb node is the Windows Form in your solution. You can view this file in Design view, which enables you to see the form and any controls that you have added to it. You can also view this file in the Code Editor, which enables you to see the code associated with the application you are creating.

Toolbox
The Toolbox is a container for all the controls that you can add to a Windows Forms application or a Windows Presentation Foundation (WPF) application. By default, the Toolbox is located on the left side of the integrated development environment (IDE). If the Toolbox is not visible, you can click the View menu, and then click Toolbox to display it.

[Figure: Common controls in the Toolbox]

You can set the Toolbox to automatically hide when you're not using it, or you can set the Toolbox to always be visible in the IDE. This makes it easier for you to see all the controls while you create your application. The controls are not visible on the Toolbox when you are in the Code Editor.

To add controls to your application, you can drag them directly from the Toolbox to the form.

Introduction to Windows Forms
The user interface is the part of your program that users see when they run the program. A user interface usually consists of a main window or form, and several controls, such as buttons, fields for entering text, and so forth. These types of Visual Basic programs are known as Windows Forms applications, and the user interface is created using Windows Forms controls.

Toolbox Component

Buttons
The easiest way for users to interact with your program is through buttons. For example, many programs have Exit buttons. The Button control in Visual Basic looks and behaves like a push button. The Button control also has predefined events that can be used to initiate actions such as ending a program. Buttons are generally rectangular controls with a raised appearance on the form. There are many properties, however, that can be set to change their appearance. The most obvious is the Text property, which determines the text displayed; this text is shown in the font or typeface determined by the Font property. The BackColor property determines the button's color, and the ForeColor property determines the text's color. When the user clicks a button at run time, the Button raises the Click event. When an event occurs, a control runs the code attached to that event. You can write code that runs when the user clicks the button by creating an event handler. An event handler is a method that executes when an event occurs; for a button, the handler is attached to the button's Click event.

Controls Used:

Textbox
It is used to receive input from the user by allowing the user to enter data into it. The basic property used is the Text property.
Syntax: TextBox1.Text
To copy the data in TextBox1 to a variable named address, the following code is used.
    address = TextBox1.Text
If the data to be entered is long, the Multiline property is set to True so that the data can be entered on multiple lines.

Label
It is used to provide a descriptive caption and possibly an associated hot key for other controls. The basic property used to provide the caption is the Text property.
Syntax: Label1.Text
If a label is to be displayed with the caption Username, the following code is used.
    Label1.Text = "Username"

Button
It is commonly used to perform some action when the user clicks on it. The basic property used is the Text property, together with the Click event.
Syntax: Button1.Text
If a button with the caption Send is to be displayed, the following code is used.
    Button1.Text = "Send"

Checkbox
It is commonly used to offer users a yes or no, true or false choice. Each click on this control toggles it between the yes state and the no state. The basic property used is Checked. The main event used with the Checkbox control is the Click event.
Syntax:
    CheckBox1.Checked = True
    CheckBox1.Checked = False
To test whether a Checkbox named CheckBox1 is checked, the following code is used.
    If CheckBox1.Checked = True Then
        ' Desired code
    End If

Panel
It is used as a container to group several controls under a single control. When a Panel is placed, the basic property used is BorderStyle, which sets the border of the panel.
Syntax: Panel1.BorderStyle = BorderStyle.FixedSingle
If a Panel with a single-line border is to be displayed, the following code is used.
    Panel1.BorderStyle = BorderStyle.FixedSingle
The other basic property used is Visible, which controls the visibility of the panel.
Syntax:
    Panel1.Visible = True
    Panel1.Visible = False
If a panel named Panel1 is to be made invisible, the following code is used.
    Panel1.Visible = False

Combo Box
It is used to select a particular item from a set of items. The item that the user selected is obtained through the Text property.
Syntax: ComboBox1.Text
If the item selected by the user from ComboBox1 is to be stored in a variable named month, the following code is used.
    month = ComboBox1.Text

Events used

Click
A Click event occurs when the user left-clicks on a control such as a Button, Checkbox, List Box or Combo Box.

Lost Focus
Lost Focus fires when the input focus leaves the control and passes to another control.

4.2 Features of VB.NET
XML comments that can be processed by tools like NDoc to produce "automatic" documentation.

Data Source binding, easing database client/server development.

Design-time expression evaluation.

The My pseudo-namespace, which provides:
a) Easy access to certain areas of the .NET Framework that otherwise require significant code to access.
b) Dynamically-generated classes.

Relational, object, and XML data. Visual Basic 9.0 unifies access to data independently of its source in relational databases, XML documents, or arbitrary object graphs, however persisted or stored in memory. The unification consists of styles, techniques, tools, and programming patterns. The especially flexible syntax of Visual Basic makes it easy to add extensions like XML literals and SQL-like query comprehensions deeply into the language. This greatly reduces the "surface area" of the new .NET Language Integrated Query APIs, increases the discoverability of data-access features through IntelliSense and Smart Tags, and vastly improves debugging by lifting foreign syntaxes out of string data into the host language.

Just My Code, which hides boilerplate code written by the Visual Studio .NET IDE.

Increased dynamism with all the benefits of static typing. The benefits of static typing are well known: identifying bugs at compile time rather than run time, high performance through early-bound access, clarity through explicitness in source code, and so on. However, sometimes dynamic typing makes code shorter, clearer, and more flexible. If a language does not directly support dynamic typing, programmers who need it must implement bits and pieces of dynamic structure through reflection, dictionaries, dispatch tables, and other techniques. This opens up opportunities for bugs and raises maintenance costs. By supporting static typing where possible, and dynamic typing where needed, Visual Basic delivers the best of both worlds to programmers.

Reduced cognitive load on programmers. Features such as type inference, object initializers, and relaxed delegates greatly reduce code redundancy and the number of exceptions to the rules that programmers need to learn and remember or look up, with no impact on performance. Features such as dynamic interfaces support IntelliSense even in the case of late binding, greatly improving the discoverability of advanced features.

Other features are Overloads, Constructors, New Property Syntax, Parameterized Properties, and Shared Members.


4.3 Back End


MS Access 2007 is used as the back end in this system, which maintains a relational database. In the relational model, data is stored in structures called relations. Here no constraints are used to create the database. Microsoft Access is available with the Microsoft Office Professional suite of business products, so no additional database software is required wherever that suite is already installed. Access is likely to remain available and supported for years to come, since it is a long-established Microsoft product. It is among the most widely used desktop database systems, and support and development consultants for it are easy to find. It is significantly cheaper to implement and maintain than larger database systems such as Oracle or SQL Server. Fairly complex databases can be set up and running in about half the time and cost of those larger systems (the simpler the database, the greater the cost advantage). Access integrates well with the other members of the Microsoft Office suite of products (Excel, Word, Outlook, etc.). Other software manufacturers commonly provide interfaces to MS Access. When designed correctly, Access databases can be ported to SQL Server or Oracle. This is important if you want to start small or develop a pilot database system and then migrate to a larger database management system.


5. PROJECT DESCRIPTION
5.1 Problem Definition
In opinion mining, feedback about a particular product is collected from various customers based on the selected model. The feedback options are then categorized based on the combination of choices, and the feedback is mined by applying a clustering technique. Finally, the manufacturer can make the right decision about the product.

5.2 Overview of the Project
In this system, the customer first registers by giving personal details. A registered customer selects a particular product from the available products and can view all features of that product. Feedback is collected from various customers for the selected model, and a customer can also fetch the feedback that other customers have given for that model. The system then categorizes the feedback options based on the combination of choices. After feedback has been collected from several customers, it is mined by applying a clustering technique and then analyzed based on the similarity of choices. Finally, the customer can make the right decision based on the feedback given by various customers, and the same analysis is useful to manufacturers for further improvement of sales.

5.3 Module Description
The project mainly consists of six modules. A module is used to distinguish one set of tasks from another.
1. Existing customers
2. New customers
3. Features
4. Questionnaires
5. Data conversion
6. Feedback analysis

Existing customers
a) An existing customer logs in by giving customer id and password.
b) After entering all details, click on submit.

New customers
a) A customer who is not registered may register by giving customer id, customer name, e-mail id, occupation and password.
b) After that, click on submit.
c) The customer can then log in by giving id and password.

Features
a) After logging in, the customer can view all features by selecting a product id and clicking on search.
b) All features of the selected product are then displayed.

Questionnaires
a) This module holds all the questions. After the customer details have been entered, all questionnaires for a particular product are displayed.
b) The customer then answers all the feedback questionnaires and clicks on submit.
c) Clicking on cancel discards all the details entered by the user.

Data conversion
a) After feedback has been collected from various customers, the data in the tables is converted into the ARFF file format.
b) The ARFF file is given as input for clustering in the Weka tool.

Feedback analysis
a) The feedback is analyzed based on the combination of choices.
b) After feedback has been collected from various customers, it is mined by applying clustering techniques.
c) After mining, the feedback is analyzed based on the similarity of choices.
d) After the analysis, the customer can make the correct decision.
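
The data conversion step can be sketched as follows, in Python rather than the project's VB.NET. The sketch renders rows of answers as ARFF text with one nominal attribute per question. The column names OpQ1 to OpQ3 follow the Customer_response table; the function name, the sample rows, and the choice to encode the four answer options as the nominal values 1 to 4 are assumptions for illustration.

```python
def to_arff(relation, attributes, rows):
    """Render rows as ARFF text.

    attributes: list of (name, allowed_values) pairs; each becomes a
    nominal attribute line such as '@attribute OpQ1 {1,2,3,4}'.
    rows: sequences with one value per attribute.
    """
    lines = ["@relation %s" % relation, ""]
    for name, values in attributes:
        nominal = ",".join(str(v) for v in values)
        lines.append("@attribute %s {%s}" % (name, nominal))
    lines += ["", "@data"]
    for row in rows:
        if len(row) != len(attributes):
            raise ValueError("row length does not match attribute count")
        lines.append(",".join(str(v) for v in row))
    return "\n".join(lines) + "\n"

# Hypothetical answers of two customers to the first three questions.
choices = [1, 2, 3, 4]
attributes = [("OpQ1", choices), ("OpQ2", choices), ("OpQ3", choices)]
rows = [(1, 3, 2), (4, 4, 1)]

arff_text = to_arff("customer_response", attributes, rows)
```

Writing arff_text to a file with an .arff extension would give a file of the kind that Weka accepts as clustering input.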

5.4 Data Flow Diagram
A data flow diagram (DFD) is a graphical tool used to describe and analyze the movement of data through a system. DFDs are the central tool and the basis from which the other components are developed. The transformation of data from input to output, through processes, may be described logically and independently of the physical components associated with the system. Such diagrams are known as logical data flow diagrams. Physical data flow diagrams show the actual implementation and movement of data between people, departments and workstations. A full description of a system actually consists of a set of data flow diagrams. A DFD, also known as a bubble chart, has the purpose of clarifying system requirements and identifying the major transformations that will become programs in system design. It is therefore the starting point of the design, down to the lowest level of detail. A DFD consists of a series of bubbles joined by data flows in the system.

DFD Symbols
In the DFD, there are four symbols:
1. A square defines a source (originator) or destination of system data.
2. An arrow identifies a data flow. It is the pipeline through which the information flows.
3. A circle or a bubble represents a process that transforms incoming data flows into outgoing data flows.
4. An open rectangle is a data store: data at rest or a temporary repository of data.

[Figure: symbols for a process that transforms data flow, a source or destination of data, a data flow, and a data store]

Fig 5.4.1 Symbolic notations of Data Flow Diagram

Constructing a DFD
Several rules of thumb are used in drawing DFDs:
1. Processes should be named and numbered for easy reference. Each name should be representative of the process.
2. The direction of flow is from top to bottom and from left to right. Data traditionally flow from the source to the destination, although they may flow back to the source. One way to indicate this is to draw a long flow line back to the source. An alternative way is to repeat the source symbol as a destination. Since it is used more than once in the DFD, it is marked with a short diagonal.
3. When a process is exploded into lower-level details, the lower-level processes are numbered.
4. The names of data stores and destinations are written in capital letters. Process and data flow names have the first letter of each word capitalized.

A DFD typically shows the minimum contents of a data store. Each data store should contain all the data elements that flow in and out. Questionnaires should contain all the data elements that flow in and out. Missing interfaces, redundancies and the like are then accounted for, often through interviews.

Salient features of DFDs
1. The DFD shows the flow of data, not of control; loops and decisions are control considerations and do not appear on a DFD.
2. The DFD does not indicate the time factor involved in any process, that is, whether the data flow takes place daily, weekly, monthly or yearly.
3. The sequence of events is not brought out on the DFD.

Rules Governing the DFDs

Process
1) No process can have only outputs.
2) No process can have only inputs. If an object has only inputs, then it must be a sink.
3) A process has a verb phrase label.

Data Store
1) Data cannot move directly from one data store to another data store; a process must move the data.
2) Data cannot move directly from an outside source to a data store; a process must receive the data from the source and place it into the data store.
3) A data store has a noun phrase label.

Source or Sink
The origin and/or destination of data.
1) Data cannot move directly from a source to a sink; it must be moved by a process.
2) A source and/or sink has a noun phrase label.

Data Flow
1) A data flow has only one direction of flow between symbols. It may flow in both directions between a process and a data store to show a read before an update. The latter is usually indicated, however, by two separate arrows, since the two happen at different times.
2) A join in a DFD means that exactly the same data comes from any of two or more different processes, data stores or sinks to a common location.
3) A data flow cannot go directly back to the same process it leaves. There must be at least one other process that handles the data flow, produces some other data flow, and returns the original data to the beginning process.
4) A data flow to a data store means update (delete or change).
5) A data flow from a data store means retrieve or use.
6) A data flow has a noun phrase label. More than one data flow noun phrase can appear on a single arrow as long as all of the flows on the same arrow move together as one package.

Context Level-0 DFD

[Figure: Customer -> (giving feedback) -> Adopting Clustering Based on Multiple Choices Feedback Opinion Mining -> (feedback analyzed based on opinion) -> Management]

Fig 5.4.2 Context Level Data Flow Diagram for opinion mining


Level-1 DFD

[Figure: external entities Customer and Management; processes Login, Validate login, Select product, View Features, Giving feedback, Collecting feedback, Apply clustering, Feedback Analysis and Making decision; data stores Register, Product_details, Questionnaires and customer_response; data flows include Questions and Customer feedback]

Fig 5.4.3 First-Level Data Flow Diagram for opinion mining
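
Several of the rules governing DFDs in this section can be checked mechanically. The Python sketch below covers two of them: every data flow must have a process at one end (which rules out store-to-store, source-to-store and source-to-sink flows), and every process needs at least one input and one output. The function name and the data layout are hypothetical. The node names are taken from the first-level diagram, but the flows between them are assumed, since the arrows of the figure are not recoverable here.

```python
PROCESS, STORE, EXTERNAL = "process", "store", "external"

def check_dfd(nodes, flows):
    """Return a list of rule violations (empty if none were found).

    nodes: dict mapping node name -> kind (PROCESS, STORE or EXTERNAL)
    flows: list of (source_name, target_name) pairs
    """
    problems = []
    # Rule: data may only move between stores/externals via a process.
    for src, dst in flows:
        if PROCESS not in (nodes[src], nodes[dst]):
            problems.append(f"flow {src} -> {dst} does not pass through a process")
    # Rule: a process must have at least one input and one output.
    for name, kind in nodes.items():
        if kind != PROCESS:
            continue
        if not any(dst == name for _, dst in flows):
            problems.append(f"process {name} has no input")
        if not any(src == name for src, _ in flows):
            problems.append(f"process {name} has no output")
    return problems

nodes = {
    "Customer": EXTERNAL,
    "Validate login": PROCESS,
    "Collecting feedback": PROCESS,
    "Register": STORE,
    "customer_response": STORE,
}
# Assumed flows, for illustration only.
good_flows = [
    ("Customer", "Validate login"),
    ("Register", "Validate login"),
    ("Validate login", "Collecting feedback"),
    ("Collecting feedback", "customer_response"),
]
# Adding a direct source-to-store flow breaks the first rule.
bad_flows = good_flows + [("Customer", "customer_response")]

good_report = check_dfd(nodes, good_flows)
bad_report = check_dfd(nodes, bad_flows)
```

Labeling rules (verb phrases for processes, noun phrases for stores and flows) are not covered, since they cannot be verified from names alone without further assumptions.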

5.5 Database Design
In order to design an effective database without redundancy, the normalization technique is applied so that data can be stored, retrieved and analyzed effectively, as follows.

5.5.1 Normalization
Normalization is a process of minimizing redundancy related to non-primary keys. Before designing any system, normalization of databases is done for the following reasons:
To reduce the redundancy of stored data.
To avoid loss of data.
To structure the data.
To permit simple retrieval of data.

Procedure for normalizing the database
Eliminate non-atomic values. The resulting database will be in First NF.
Take projections to eliminate partial (non-full) functional dependencies on the key. Now the database will be in Second NF.
Take projections to eliminate transitive dependencies. This gives Third NF relations.
Take projections of these Third NF relations to eliminate functional dependencies in which the determinant is not a candidate key. This gives a collection of BCNF relations.
Take projections of these BCNF relations to eliminate any multi-valued dependencies that are not also functional dependencies. This gives a collection of Fourth NF relations.
Take projections of these Fourth NF relations to eliminate any join dependencies that are not implied by the candidate keys. This produces a collection of Fifth NF relations.

This reduction consists of replacing the relations by suitable projections. The collection of these projections is equivalent to the original relations, so no information is lost in the process. In other words, the process is reversible. This is called non-loss decomposition. Since no information is lost in the reduction process, any information that can be derived from the original structure can also be derived from the new structure. Having arrived at the program specifications, programs, and hardware specifications, testing of the system is to be undertaken. This step of system analysis and design is known as implementation.

First Normal Form


A relation is in First NF if the intersection of any column and row contains only one value (no repeating groups).

Methods
Identify a suitable primary key from the pool of un-normalized data. Remove any items that repeat within a single value of this key to another relation, bringing with them the primary key to form part of a new composite key in the new relation.

Second Normal Form


A table is in Second NF if it is in First NF and the values in every non-key column are functionally dependent on the complete primary key.


Methods
For every relation with a single data item making up the primary key, this rule is always true. For relations with a compound key, examine every column and determine whether its value depends on the whole of the compound key or on just some part of it. Remove those columns that depend on only part of the key to a new relation with that part as the primary key.
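
The Second NF method above can be tried out on sample data. The following Python sketch looks for non-key columns whose values are determined by only part of a compound key. The relation, its column names and its rows are hypothetical, and a check on sample rows can only suggest a dependency; it cannot prove that the dependency holds for all possible data.

```python
from itertools import combinations

def determines(rows, lhs, rhs):
    """True if, in these rows, equal values on the lhs columns always
    go together with equal values in the rhs column."""
    seen = {}
    for row in rows:
        key_values = tuple(row[c] for c in lhs)
        if seen.setdefault(key_values, row[rhs]) != row[rhs]:
            return False
    return True

def partial_dependencies(rows, key, non_key):
    """List (key_part, column) pairs where a proper, non-empty subset
    of the compound key already determines a non-key column."""
    found = []
    for col in non_key:
        for size in range(1, len(key)):
            for part in combinations(key, size):
                if determines(rows, part, col):
                    found.append((part, col))
    return found

# Hypothetical relation with compound key (customer_id, product_id).
rows = [
    {"customer_id": "C1", "product_id": "P1", "product_name": "Laptop A", "rating": 3},
    {"customer_id": "C1", "product_id": "P2", "product_name": "Laptop B", "rating": 4},
    {"customer_id": "C2", "product_id": "P1", "product_name": "Laptop A", "rating": 2},
    {"customer_id": "C2", "product_id": "P2", "product_name": "Laptop B", "rating": 4},
]

violations = partial_dependencies(
    rows,
    key=("customer_id", "product_id"),
    non_key=("product_name", "rating"),
)
```

Here product_name depends on product_id alone, so under the method above it would move to a separate relation keyed by product_id, while rating stays with the full compound key.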

5.5.2 Table Structure


5.5.2.1 Name: Register
Description: This table consists of personal details for all the customers.

Column name     Data type   Size   Constraints
Customer id     Text        10     Primary key
Customer name   Text        30     Not null
E-mail id       Text        30     Not null
Occupation      Text        20     Not null
Password        Text        20     Not null

5.5.2.2 Name: Product_details
Description: It consists of specifications and their description details for all the products.

Column name     Data type   Size   Constraints
Productid       Text        10     Primary key
Product name    Text        20     Unique
Modelid         Text        10     Not null
Processor       Text        30     Not null
Memory          Text        30     Not null
Harddisk        Text        30     Not null
Opticaldrive    Text        30     Not null
Screensize      Text        30     Not null
Internet        Text        30     Not null
Multimedia      Text        30     Not null
Webcam          Text        30     Not null
Ports           Text        30     Not null
Color           Text        30     Not null
Battery         Text        30     Not null
OS              Text        30     Not null

5.5.2.3 Name: Questionnaires
Description: It contains all questionnaires and their choices for the specific products.

Column name     Data type    Size   Constraints
Qno             AutoNumber   10     Foreign key
Questions       Text         150    Unique
Choice1         Text         5      Not null
Choice2         Text         5      Not null
Choice3         Text         5      Not null
Choice4         Text         5      Not null


5.5.2.4 Name: Customer_response
Description: It contains the responses to all the questionnaires of a particular product for all the customers.

Column name     Data type   Size   Constraints
Customer id     Text        10     Not null
Product id      Text        10     Not null
Model id        Text        10     Not null
Date            Date/time   -      Not null
OpQ1            Number      5      Not null
OpQ2            Number      5      Not null
OpQ3            Number      5      Not null
OpQ4            Number      5      Not null
OpQ5            Number      5      Not null
OpQ6            Number      5      Not null
OpQ7            Number      5      Not null
OpQ8            Number      5      Not null
OpQ9            Number      5      Not null
OpQ10           Number      5      Not null


5.5.2.5 Name: Decision
Description: It contains values as per the choices from the user.

Column name      Data type   Size   Constraints
Option1          Number      20     Not null
Option2          Number      20     Not null
Option3          Number      20     Not null
Option4          Number      20     Not null
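As an illustration of how the stored responses could feed the Decision values, the sketch below counts how many customers picked each of the four choices for one question. It rests on several assumptions that are not in the report: SQLite stands in for the project database, only two of the ten OpQ columns are created, the sample rows and dates are invented, and choice_counts is a hypothetical helper name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Customer_response (
        customer_id   TEXT NOT NULL,
        product_id    TEXT NOT NULL,
        model_id      TEXT NOT NULL,
        response_date TEXT NOT NULL,
        OpQ1          INTEGER NOT NULL,
        OpQ2          INTEGER NOT NULL
    )
""")
# Invented sample responses; each OpQ value is the chosen option (1 to 4).
conn.executemany(
    "INSERT INTO Customer_response VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("C101", "P1", "M1", "2011-03-01", 1, 4),
        ("C102", "P1", "M1", "2011-03-02", 1, 2),
        ("C103", "P1", "M1", "2011-03-02", 3, 2),
    ],
)

def choice_counts(conn, question_column, product_id):
    """Count how many customers picked each of the four choices for one question."""
    if question_column not in {"OpQ1", "OpQ2"}:
        # Column names cannot be bound as parameters, so whitelist them.
        raise ValueError("unknown question column")
    counts = {choice: 0 for choice in (1, 2, 3, 4)}
    query = (
        f"SELECT {question_column}, COUNT(*) FROM Customer_response "
        f"WHERE product_id = ? GROUP BY {question_column}"
    )
    for choice, total in conn.execute(query, (product_id,)):
        counts[choice] = total
    return counts

q1_counts = choice_counts(conn, "OpQ1", "P1")
q2_counts = choice_counts(conn, "OpQ2", "P1")
```

The four counts per question map naturally onto the Option1 to Option4 columns, although the report does not state exactly how the Decision values are computed.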

5.5.3 E-R Diagram


In this system, flexible storage and retrieval of data is essential. Entity-Relationship (E-R) diagrams are associated with databases and database management systems, and they explore how relationships in a pool of data can be used when developing methods for data storage and retrieval. There are several effective ways of organizing and storing data (random, sequential and indexed) in transaction systems, where each transaction is processed to update a record in the master file. When information systems are developed primarily for use in transaction processing, the focus is often on a single entity. The overall logical structure of a database can be expressed graphically by an E-R diagram, which has the following components:

Entity: Any object that has meaning for a particular application and that is distinguishable from other objects by a specific set of attributes. A computer system may be an entity.
Attribute: An element of a data model, or a characteristic quality of a data type.
Relation: Equality, inequality or any other property that can be said to hold for two objects in a specific order.


Symbolic notations:

[Figure: legend of the diagram symbols that represent entities, attributes, relationships, primary keys and links.]

Fig 5.5.3.1 Symbolic notations of ER Diagram


[Figure: E-R diagram. Entities: Customer, Product, Questionnaires, Register. Relationships: Log in, Gets features, Answering. Attributes shown: Id, Password, Product id, Model id, Model name, Choice1, Choice2, Choice3, Choice4, Id, Name, E-mail id, Occupation.]

Fig 5.5.3.2 ER-Diagram for opinion mining


5.5.4 Architectural Design

[Figure: architecture diagram. Labels in order of appearance: Customer, Login, Main page, Products, Questionnaires; Admin, Login, Main page, Products, Questionnaires, Data Conversion, Report, Output, Decision.]


5.5.5 Data Dictionary
A data dictionary, as the name implies, is a repository of information about data. In some database systems, the stored definitions of data (called the schema) provide all the necessary data dictionary information; in others, the data dictionary is supplementary. The information in the data dictionary is about the types of data and the uses of data. The data dictionary provides lists of data items sequenced alphabetically, by classification, by keyword, and so on. It provides a consistent official description of data as well as consistent data names for programming and retrieval. The advantages of a data dictionary are not only consistency of data description and naming, but also ease of updating, since one data description serves many purposes. The database administrator may use the data dictionary to enforce standards for names and descriptions; those who create data must follow these standards. Creating a data dictionary requires significant effort to remove past inconsistencies and ambiguities.

A data dictionary is used:
To manage the details in a large system.
To communicate a common meaning for all system elements.
To document the features of the system.
To locate errors and omissions in the system.

FIELDNAME       CONSTRAINT    LOCATION            DATATYPE      SIZE
Customerid      Primary key   Register            Text          10
Customer name   Not null      Register            Text          30
e-mailid        Not null      Register            Text          30
occupation      Not null      Register            Text          20
Password        Not null      Register            Text          20
Productid       Primary key   Product_details     Text          30
Product name    Unique        Product_details     Text          30
Modelid         Not null      Product_details     Text          30
processor       Not null      Product_details     Text          30
Memory          Not null      Product_details     Text          30
Hard disk       Not null      Product_details     Text          30
opticaldrive    Not null      Product_details     Text          30
Screensize      Not null      Product_details     Text          30
Internet        Not null      Product_details     Text          30
Multimedia      Not null      Product_details     Text          30
Webcam          Not null      Product_details     Text          30
Ports           Not null      Product_details     Text          30
Color           Not null      Product_details     Text          30
Battery         Not null      Product_details     Text          30
Os              Not null      Product_details     Text          30
Qno             Primary key   Questionnaires      Auto number   10
Questions       Unique        Questionnaires      Text          150
Choice1         Not null      Questionnaires      Text          5
Choice2         Not null      Questionnaires      Text          5
Choice3         Not null      Questionnaires      Text          5
choice4         Not null      Questionnaires      Text          5
Customerid      Not null      Customer_response   Number        10
Productid       Not null      Customer_response   Number        10
Modelid         Not null      Customer_response   Number        10
Date            Not null      Customer_response   Date/time
Opq1            Not null      Customer_response   Number        5
Opq2            Not null      Customer_response   Number        5
Opq3            Not null      Customer_response   Number        5
Opq4            Not null      Customer_response   Number        5
Opq5            Not null      Customer_response   Number        5
Opq6            Not null      Customer_response   Number        5
Opq7            Not null      Customer_response   Number        5
Opq8            Not null      Customer_response   Number        5
Opq9            Not null      Customer_response   Number        5
Opq10           Not null      Customer_response   Number        5

Table 5.5.4.1 Data Dictionary for Opinion mining


5.5.6 Database Constraints

a) Primary Key: Username is declared as the primary key, a constraint that enforces uniqueness. The advantage is that redundant records are avoided, because no two rows can share the same key value. Having a primary key also helps to bring the table into 1NF.
b) Not Null: This constraint indicates that the concerned fields must not be left empty. The user has to enter a value in each such field, because the system needs these values.
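The effect of the two constraints can be demonstrated with a small sketch. It assumes SQLite as a stand-in for the project database and uses the Register columns from section 5.5.2.1 with customer_id as the key; try_insert is a hypothetical helper, and the exact wording of the error messages depends on the database engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Register (
        customer_id   TEXT PRIMARY KEY,
        customer_name TEXT NOT NULL,
        email_id      TEXT NOT NULL,
        occupation    TEXT NOT NULL,
        password      TEXT NOT NULL
    )
""")

def try_insert(conn, row):
    """Insert one row; return None on success or the constraint error text."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO Register VALUES (?, ?, ?, ?, ?)", row)
        return None
    except sqlite3.IntegrityError as exc:
        return str(exc)

# First insert succeeds.
first = try_insert(conn, ("C101", "vara", "pr_vara@gmail.com", "student", "giri"))
# Same customer_id again: rejected by the primary key constraint.
duplicate = try_insert(conn, ("C101", "other", "other@example.com", "clerk", "pw"))
# Missing e-mail: rejected by the NOT NULL constraint.
missing = try_insert(conn, ("C102", "vara", None, "student", "giri"))

row_count = conn.execute("SELECT COUNT(*) FROM Register").fetchone()[0]
```

Only the first row is stored; the other two inserts are refused by the database itself, independently of any checks in the forms.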


6. SYSTEM IMPLEMENTATION
6.1.1 Installation of .NET
Visual Studio .NET is installed in three phases:
Phase 1 involves installing Windows components.
Phase 2 involves installing Visual Studio .NET.
Phase 3 involves checking for service releases.

Installing Visual Studio is not difficult. The steps to install .NET are:
a) To start the installation, insert the Visual Studio .NET CD-ROM. If the installation does not start automatically, double-click setup.exe. Setup launches the initial screen.
b) Click Windows Component Update to bring up the End User License Agreement screen.
c) Click the "I accept the agreement" button to accept the user agreement. The next screen lists the Windows components required for running Visual Studio .NET.
d) Click Continue, and the next screen appears. Installing Windows components requires rebooting the machine several times. Setup offers the option of entering a password so that the install can run unattended; it uses the password to log the user in automatically after every reboot. Checking the "Automatically log on" check box enables the two text boxes. Type the password in the first text box and retype it for confirmation in the Confirm Password text box.
e) After specifying the password, click Install Now! to begin the installation of the Windows components. The setup program installs the components and automatically reboots the system when necessary. After all the necessary Windows components have been installed successfully, a screen is displayed. This marks the end of the first phase of installation.
f) The next step starts the second phase of the installation procedure. After clicking the Done hyperlink, the screen that appeared before is displayed again, but this time the second link is enabled and the first and third hyperlinks are disabled.
g) Click Visual Studio .NET. The setup program copies the files necessary for installation and displays the next screen.
h) After entering the product key and the desired name, click the "I accept the agreement" button. Click Continue to move to the next part of the current phase, which is selecting the features to install.
i) After selecting the features to install, click Install Now! to start the installation. The last phase of the installation, checking for service releases, starts after the Visual Studio .NET installation is complete.

6.1.2 Deployment of the System
a) Copy the folder "opinion mining" from the CD-ROM to the C: partition.
b) Open the file "opinion mining" of type Visual Basic Project File by double-clicking it.
c) The .NET application is launched and the Login form is displayed. A new user may then register, and a registered user may log in to the system.

6.2 Launching Weka tool
The Weka GUI Chooser (class weka.gui.GUIChooser) provides a starting point for launching Weka's main GUI applications and supporting tools. If one prefers an MDI (multiple document interface) appearance, this is provided by an alternative launcher called Main (class weka.gui.Main). The GUI Chooser consists of four buttons, one for each of the four major Weka applications, and four menus.

Fig 6.2 Weka GUI Chooser

The buttons can be used to start the following applications:
Explorer: An environment for exploring data with WEKA.
Experimenter: An environment for performing experiments and conducting statistical tests between learning schemes.
KnowledgeFlow: This environment supports essentially the same functions as the Explorer but with a drag-and-drop interface. One advantage is that it supports incremental learning.
SimpleCLI: Provides a simple command-line interface that allows direct execution of WEKA commands on operating systems that do not provide their own command-line interface.

The menu consists of four sections:
1. Program
   LogWindow: Opens a log window that captures everything printed to stdout or stderr. Useful for environments like MS Windows, where WEKA is normally not started from a terminal.
   Exit: Closes WEKA.
2. Tools: Other useful applications.
   ArffViewer: An MDI application for viewing ARFF files in spreadsheet format.
   SqlViewer: An SQL worksheet for querying databases via JDBC.
   Bayes net editor: An application for editing, visualizing and learning Bayes nets.
3. Visualization: Ways of visualizing data with WEKA.
4. Help: Online resources for WEKA.
   Weka homepage: Opens a browser window with WEKA's homepage.
   HOWTOs, code snippets, etc.: The general WekaWiki, containing many examples and HOWTOs around the development and use of WEKA.
   Weka on Sourceforge: WEKA's project homepage on Sourceforge.net.
   SystemInfo: Lists some internals about the Java/WEKA environment, e.g., the CLASSPATH.

To make it easy for the user to add new functionality to the menu without having to modify the code of WEKA itself, the GUI offers a plug-in mechanism for such add-ons. More details can be found in the Wiki article "Extensions for Weka's main GUI".

6.2.2 Technique Used
In this system I use only one of the clustering techniques available in the Weka 3.6.4 tool: the Simple K-means algorithm, which is used for clustering the feedback collected from the customers.


7. SCREEN SHOTS

Login Form

An existing customer logs in by entering their details in this form.

Registration Form

A new customer registers by giving their personal details in this form.

Welcome Form

After an existing customer logs in successfully, a welcome message is displayed. This form acts as the home page and includes information about all the products and their features.

Product Specifications Form

This form shows all the specification details of the product Vostro, selected by its product id.

Product Specifications Form

This form shows all the specification details of the product XMS, selected by its product id.

Product Specifications Form

This form shows all the specification details of the product Latitude, selected by its product id.

Product Specifications Form

This form shows all the specification details of the product Inspiron, selected by its product id.

Product Questionnaires Form

It consists of questionnaires with multiple choices for the selected product.


Conversion Form

The data is converted into the ARFF file format through this form.
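A conversion of this kind can be sketched as follows. This is not the code behind the Conversion Form: the function name to_arff, the relation name and the sample answers are hypothetical, and the answers are written as numeric attributes, which is only one possible encoding.

```python
def to_arff(relation, attribute_names, rows):
    """Render numeric feedback rows as ARFF text for Weka."""
    lines = [f"@relation {relation}", ""]
    lines += [f"@attribute {name} numeric" for name in attribute_names]
    lines += ["", "@data"]
    for row in rows:
        if len(row) != len(attribute_names):
            raise ValueError("row length does not match the attribute list")
        lines.append(",".join(str(value) for value in row))
    return "\n".join(lines) + "\n"

# Two invented responses to three questions (chosen option numbers).
arff_text = to_arff("feedback", ["OpQ1", "OpQ2", "OpQ3"], [(1, 4, 2), (3, 2, 2)])
```

The resulting text has a header section (@relation and one @attribute line per question) followed by an @data section with one comma-separated line per customer response.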


Report Text Form

This form shows the product review information in text format. This data can be given as input for clustering in the Weka tool.

Clustering Result

This form shows the result for the product review after the clustering method has been applied in the Weka tool.

Product Decision

This form shows the decision for the product review.


8. SYSTEM TESTING
8.1 Definition of System Testing
A common view of testing held by users is that it is performed to prove that there are no errors in a program. This is extremely difficult, since a designer cannot prove a program to be one hundred percent accurate. The most useful and practical approach is therefore to understand testing as the process of executing a program with the explicit intention of finding errors that make the program fail. Testing has its own cycle. The testing process begins with the product requirements phase and from there parallels the entire development process. In other words, for each phase of the development process there is an important testing activity. Successful testing requires a methodical approach. It requires a focus on the following basic critical factors:
Planning
Project and process control
Risk management
Organization and professionalism
Inspections
Measurement tools

8.2 Test Plan
Before testing, we first have to decide on the type of testing to be carried out. The following factors are taken into consideration:
To ensure that information flows properly into and out of the program.
To find out whether the local data structures maintain their integrity during all steps of an algorithm's execution.
To ensure that the module operates properly at the boundaries established to limit or restrict processing.
To find out whether the error-handling paths work correctly.
To find out whether the values are correctly updated, and to check for validation.
Testing is an important phase in the development life cycle of the product.

Objectives of Testing
Testing is done to ensure that:
No bugs occur in future use of the application.
The quality assurance standard is achieved.
Symptoms caused by bugs are discovered and clearly diagnosed, so that the bugs can be easily prevented.

8.3 Test Case Design Techniques


During testing, the program to be tested is executed with a set of test cases, and the output of the program for these test cases is evaluated to determine whether the program is performing as expected. To accomplish this objective, the following test case design techniques are used:
White box testing
Black box testing

White box Testing
White box testing of software is predicated on a close examination of procedural detail. The status of the project may be tested at various points to determine whether the expected or asserted status corresponds to the actual status. Using this technique, the following test cases can be derived:
Exercise all logical conditions on their true and false sides.
Execute all loops at their boundaries and within their operational bounds.
Exercise internal data structures to ensure their validity.

Black box Testing
Knowing the specified functions that a product has been designed to perform, tests can be conducted to show that each function is fully operational. Black box tests are carried out to check that the input to a function is properly accepted and the output is correctly produced. A black box test examines some aspects of a system with little regard for the internal structure of the software. Errors in the following categories were found through black box testing:
Incorrect or missing functions.
Interface errors.
Errors in database structure or external database access.
Performance errors.
Initialization and termination errors.


Test Cases
Case Generation Report

Test Case 1
Input: An existing customer enters their Customer Id and Password. New users click on New User.
Process: Checks whether the entered Customer Id and Password pair is correct by comparing it with the database.
Output: If the Customer Id and Password pair matches, the features form is displayed; otherwise an appropriate error message is displayed.

Test Case 2
Input: Customer Id, Customer Name, Customer E-mail Id, Occupation and Password are entered.
Process: Checks whether the Customer Id is automatically generated by comparing it with the database.
Output: If the Customer Id is generated, the customer registration details are created successfully; otherwise a label is displayed with a message.

Test Case 3
Input: One or more mandatory fields are not filled.
Process: Checks whether all the mandatory fields are filled with data.
Output: If all the mandatory fields are filled, the next step is carried out; otherwise an appropriate error message is displayed.

8.4 Testing Strategies


Testing is a set of activities that can be planned in advance and conducted systematically. A strategy for software testing must accommodate low-level tests, which are necessary to verify that a small source code segment has been correctly implemented, as well as high-level tests, which validate major system functions against customer requirements.

8.4.1 Validation Testing


Validation testing begins after integration testing has been completed successfully. Validation succeeds when the software functions in a manner that can be reasonably accepted by the client. In this system, the majority of the validation is done during data entry, where the possibility of entering wrong data is highest. Other validation is performed in every process where correct details and data must be entered to get the required results.

8.4.2 Test Case Verification

Test Case Verification 1.1

The user name vara and the password giri exist in the database.
Input: vara and giri are entered as username and password.
Process: Checks whether the vara and giri pair exists in the database.
Output: If the vara and giri pair matches, the features form is displayed; otherwise the error message "Username or password is wrong" is displayed.

Test Case Verification 1.2


Customer id is C101, Customer name is vara, E-mail id is pr_vara@gmail.com, Occupation is student and Password is giri.
Input: C101, vara, pr_vara@gmail.com, student and giri are entered.
Process: Checks whether C101 is automatically generated by comparing it with the database.
Output: If C101 is generated, the above details are created successfully; otherwise a label is displayed with a message.

Test Case Verification 1.3


Customer id is C101, Customer name is vara, Occupation is student and Password is giri, while the answer field is left null.
Input: C101, vara, student and giri are entered while the answer is left null.
Process: Checks whether all the mandatory fields are filled with data.
Output: As the mandatory field answer is null, the following error message is displayed: PLEASE FILL ALL THE STARRED FIELDS
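Verification cases of this kind can also be written as automated checks. The sketch below is an assumption-laden illustration, not the project's VB.NET code: check_login and check_mandatory are hypothetical functions, a dictionary stands in for the database, the field names are invented, and the e-mail field is used as the mandatory field that is left empty.

```python
REQUIRED_FIELDS = ("customer_id", "customer_name", "email_id", "occupation", "password")
MISSING_FIELDS_MESSAGE = "PLEASE FILL ALL THE STARRED FIELDS"
LOGIN_ERROR_MESSAGE = "Username or password is wrong"

def check_login(users, username, password):
    """Return None when the pair matches a stored user, else the error message."""
    if username in users and users[username] == password:
        return None
    return LOGIN_ERROR_MESSAGE

def check_mandatory(form):
    """Return None when every mandatory field has a value, else the error message."""
    missing = [name for name in REQUIRED_FIELDS if not form.get(name)]
    return MISSING_FIELDS_MESSAGE if missing else None

users = {"vara": "giri"}  # stand-in for the stored username/password pairs

# Verification 1.1: a matching pair is accepted, a wrong password is rejected.
login_ok = check_login(users, "vara", "giri")
login_bad = check_login(users, "vara", "wrong")

# Verification 1.3: one mandatory field is left empty.
incomplete = check_mandatory({
    "customer_id": "C101",
    "customer_name": "vara",
    "email_id": "",
    "occupation": "student",
    "password": "giri",
})
```

Keeping the checks in small functions like these makes it possible to rerun the same cases after every change to the forms.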


9. CONCLUSION and FUTURE ENHANCEMENT


9.1 Conclusion
In this project, I proposed a set of techniques for mining and summarizing product feedback based on opinion mining. The objective is to support effective decisions for product and sales improvement in the manufacturing business sector with the help of feedback opinions collected from various customers. The experimental results indicate that the proposed techniques are very promising in performing their tasks. It is believed that this problem will become increasingly important as more people buy products and give feedback on the Web. Summarizing the reviews is not only useful to common shoppers, but also crucial to product manufacturers.

9.2 Future Enhancements
The above algorithm works for a single product with different models; it does not yet work for different products. In future, the system may be enhanced to handle different products with different models.


CASE STUDY FOR CLUSTERING
Algorithm: K-Means. The K-means algorithm for partitioning, where each cluster's center is represented by the mean value of the objects in the cluster.
Input: K: the number of clusters; D: a data set containing n objects.
Output: A set of K clusters.
Method:
1) Arbitrarily choose K objects from D as the initial cluster centers;
2) Repeat
3) (Re)assign each object to the cluster to which the object is the most similar, based on the mean value of the objects in the cluster;
4) Update the cluster means, i.e., calculate the mean value of the objects for each cluster;
5) Until no change.
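The five steps can be sketched in plain Python. This is an illustrative implementation, not Weka's SimpleKMeans: it assumes squared Euclidean distance as the similarity measure, takes the initial centers as an argument so that a run is reproducible, and uses invented two-question feedback vectors as data.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def k_means(points, initial_centers, max_iterations=100):
    """Cluster points around K centers; returns (centers, assignments)."""
    centers = list(initial_centers)      # step 1: K objects chosen by the caller
    k = len(centers)
    assignments = None
    for _ in range(max_iterations):      # step 2: repeat
        # Step 3: (re)assign each object to its nearest cluster mean.
        new_assignments = [
            min(range(k), key=lambda c: squared_distance(p, centers[c]))
            for p in points
        ]
        if new_assignments == assignments:   # step 5: until no change
            break
        assignments = new_assignments
        # Step 4: update each cluster mean (an empty cluster keeps its center).
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centers[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return centers, assignments

# Invented answers to two questions (option numbers 1 to 4) from six customers.
points = [(1, 1), (1, 2), (2, 1), (4, 3), (3, 4), (4, 4)]
centers, assignments = k_means(points, initial_centers=[points[0], points[-1]])
```

With these initial centers the first three customers end up in one cluster and the last three in the other. The result of K-means depends on the initial choice in step 1, so a different pair of starting objects can lead to a different partition.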



