
INTRANET CHATTING

A PROJECT REPORT

Submitted to

BABU BANARASI DAS UNIVERSITY

By

JANMEJAY SINGH

NITIN RAI

BHAIYA MRITUNJAY SINGH

FALAK ALAM

NAJAF AHMED NEYAZI

In partial fulfillment for the award of the degree

Of

BACHELOR OF TECHNOLOGY
IN
COMPUTER SCIENCE AND ENGINEERING

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

SCHOOL OF ENGINEERING
LUCKNOW

November 2019
Introduction
Investment is a business activity in which most people are interested in this
era of globalization. Several kinds of assets are commonly used for investment,
for example gold, stocks and property. Property investment in particular has
increased significantly. Housing price trends are not only the concern of buyers
and sellers; they also indicate the current economic situation. Many factors
have an impact on house prices, such as the number of bedrooms and bathrooms.
The surrounding location also matters: good accessibility to highways,
expressways, schools, shopping malls and local employment opportunities
contributes to a rise in house prices. Manual house price prediction is
difficult, hence many systems have been developed for house price prediction.

We propose a house price prediction system based on linear regression.

The aim of this system is to build a model that gives a good house price
prediction based on other variables. We use Linear Regression for this dataset,
which we expect to give good accuracy. This house price prediction project has
two modules, namely Admin and User. The Admin can add locations and view them.
The Admin also has the authority to add the density on a per-unit-area basis.
The User can view the locations and see the predicted housing price for a
particular location.

In this project, we develop and evaluate the performance and the predictive
power of a model trained and tested on data collected from houses. Once we get a
good fit, we will use this model to predict the monetary value of a house located
at a different location. A model like this would be very valuable for a real
estate agent, who could make use of the information it provides on a daily basis.

Context:-
This project was carried out as part of my course for the Bachelor of Technology
at Babu Banarasi Das University, supervised by Ms. Nusrat Fatima. I have six
months to fulfill the requirements in order to pass the module.
Motivations:-
Being extremely interested in everything related to Machine Learning, I saw the
project as a great occasion to take the time to learn and to confirm my interest
in this field. The fact that we can make estimations and predictions and give
machines the ability to learn by themselves is both powerful and limitless in
terms of application possibilities. We can use Machine Learning in finance,
medicine, almost everywhere. That is why I decided to build my project around
Machine Learning.
Idea:-
As a first experience, I wanted to make my project as interesting as possible
by approaching every step of the machine learning process and trying to
understand each of them deeply. I chose Real Estate Prediction as the approach.
The goal was to predict the price of a given apartment according to market
prices, taking into account different “features” that are developed in the
following sections.

Data:-

The crucial element in any machine learning task, to which particular attention
should be paid, is the data. Indeed, the results will be highly influenced by
the data: where we found them, how they are formatted, whether they are
consistent, whether there are any outliers, and so on. At this step, many
questions should be answered in order to ensure that the learning algorithm will
be efficient and accurate. Several sub-steps are taken to get, clean and
transform the data.

Getting the data:-

The first problem was where to get the data to build a large enough dataset,
since I want to be able to predict the price of a given apartment according to
the real estate agency chosen. To address this problem, I decided to use “web
scraping”, a technique for extracting information from websites. The idea is to
simulate human behavior on different websites and parse the information in order
to save it. To do this, I used a Python framework called Scrapy, which is very
intuitive and fast for this kind of task.
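
The parsing step can be illustrated with a minimal sketch. The project itself
used Scrapy; the example below deliberately swaps in Python's standard-library
html.parser so that it runs without extra installs. The HTML snippet, the CSS
class names ("listing", "area", "price") and the helper names are hypothetical
and do not come from any real agency website.

```python
from html.parser import HTMLParser

# Hypothetical listing markup; a real agency page will be structured differently.
SAMPLE_HTML = """
<div class="listing"><span class="area">70</span><span class="price">210000</span></div>
<div class="listing"><span class="area">95</span><span class="price">300000</span></div>
"""


class ListingParser(HTMLParser):
    """Collects the numeric text of <span class="area"> and <span class="price">."""

    def __init__(self):
        super().__init__()
        self.rows = []       # one dict per listing
        self._field = None   # name of the span currently being read, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "div" and attrs.get("class") == "listing":
            self.rows.append({})            # a new listing starts here
        elif tag == "span" and attrs.get("class") in ("area", "price"):
            self._field = attrs["class"]    # remember which value comes next

    def handle_data(self, data):
        text = data.strip()
        if self._field and self.rows and text:
            self.rows[-1][self._field] = float(text)
            self._field = None

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None


def parse_listings(html):
    """Return a list of {'area': ..., 'price': ...} dicts found in the HTML."""
    parser = ListingParser()
    parser.feed(html)
    return parser.rows


listings = parse_listings(SAMPLE_HTML)
print(listings)
```

A real scraper would additionally download the pages, follow pagination and
respect each site's terms of use, which is the part a framework such as Scrapy
takes care of.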

Technology Used:-

Python:-

Python is an interpreted, high-level, general-purpose programming language
created by Guido van Rossum and first released in 1991. Python's design
philosophy emphasizes code readability with its notable use of significant
whitespace. Its language constructs and object-oriented approach aim to help
programmers write clear, logical code for small and large-scale projects. Python
is dynamically typed and garbage-collected. It supports multiple programming
paradigms, including procedural, object-oriented, and functional programming.
Python is often described as a "batteries included" language due to its
comprehensive standard library. Python interpreters are available for
many operating systems.

Machine Learning:-
Machine learning (ML) is the scientific study of algorithms and statistical
models that computer systems use to perform a specific task without using explicit
instructions, relying on patterns and inference. It is seen as a subset of artificial
intelligence. Machine learning algorithms build a mathematical model based on
sample data, known as "training data", in order to make predictions or decisions
without being explicitly programmed to perform the task. Machine learning
algorithms are used in a wide variety of applications, such as email
filtering and computer vision, where it is difficult or infeasible to develop a
conventional algorithm for effectively performing the task.
Machine learning is closely related to computational statistics, which focuses on
making predictions using computers. The study of mathematical
optimization delivers methods, theory and application domains to the field of
machine learning. Data mining is a field of study within machine learning, and
focuses on exploratory data analysis through unsupervised learning.

Linear Regression Algorithm:-

The idea of regression is simple: given enough data, you can observe the
relationship between your target parameter (the output) and other parameters
(the input), and then apply this relationship to real observed data.
To show how the regression algorithm works, we take into account only one
parameter, a home’s living area, to predict the price. It is reasonable to
suppose that there is a linear relationship between area and price. As we
remember from high school, a linear relationship is represented by a linear
equation:
y = k0 + k1*x
In our case, y is the price and x is the area. Predicting the price of a home is
as simple as evaluating the equation (where k0 and k1 are constant coefficients,
the intercept and the slope):
price = k0 + k1 * area

We can calculate these coefficients (k0 and k1) using regression.
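
As an illustration, k0 and k1 can be obtained with the ordinary least-squares
formulas for a single input variable. The sketch below is one possible way to
do it in plain Python, not necessarily the code used in the project; the
function names and the sample areas and prices are made up for the example.

```python
def fit_line(areas, prices):
    """Return (k0, k1) that minimise the squared error of price = k0 + k1 * area."""
    n = len(areas)
    mean_x = sum(areas) / n
    mean_y = sum(prices) / n
    # Slope: covariance of area and price divided by the variance of area.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(areas, prices))
    sxx = sum((x - mean_x) ** 2 for x in areas)
    k1 = sxy / sxx
    # Intercept: the fitted line passes through the point of means.
    k0 = mean_y - k1 * mean_x
    return k0, k1


def predict_price(k0, k1, area):
    """Evaluate price = k0 + k1 * area."""
    return k0 + k1 * area


# Made-up sample that lies exactly on price = 10 + 3 * area.
areas = [50, 70, 90, 110]
prices = [160, 220, 280, 340]

k0, k1 = fit_line(areas, prices)
print(k0, k1)
print(predict_price(k0, k1, 100))
```

Because the sample points lie exactly on a line, the fit recovers k0 = 10 and
k1 = 3; with real, noisy data the coefficients are only the best linear
approximation.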


Objective
Nowadays, e-education and e-learning are highly influential, and everything is
shifting from manual to automated systems. The objective of this project is to
predict house prices so as to minimize the problems faced by the customer. In
the present method, the customer approaches a real estate agent to manage
his/her investments and suggest suitable estates. This method is risky, as the
agent might recommend the wrong estates, leading to the loss of the customer’s
investment. The manual method currently used in the market is outdated and
carries high risk. To overcome this fault, there is a need for an updated and
automated system. Data mining algorithms can be used to help investors invest in
an appropriate estate according to their stated requirements. The new system
will also be cost- and time-efficient and will have simple operations. The
proposed system works on the Linear Regression Algorithm.

With a large amount of unstructured resources and documents, the real estate
industry has become a highly competitive business. The data mining process in
such an industry provides an advantage to developers by processing those data,
forecasting future trends and thus assisting them in making favorable
knowledge-driven decisions. In this project, the main focus is on the data
mining method and its approach to developing a model which not only predicts the
most suitable area for a customer according to his/her interests, but also
recognizes the most preferred real estate locations in any given area by ranking
them. The ranking method is used to predict a favorable location. It analyses a
set of locations selected by the customer and works broadly in two phases:-

The first phase ranks a group of customer-defined locations to find an ideal
area, and the second phase predicts the most suitable area according to the
customer’s requirements and interests.

It uses a classical technique called linear regression and tries to give an
analysis of the results obtained. Linear regression helps establish the strength
of the relationship between a dependent variable and one or more independent
variables, known as the label attribute and the regular attributes respectively.
Regression outputs a continuous value of the dependent variable, i.e. the label
attribute, which is used for prediction.
Project Requirements
Hardware Requirements:-
● Processor – Core i3
● Hard Disk – 160 GB
● Memory – 1 GB RAM
● Monitor

Software Requirements:-
● Windows 7 or higher
● MySQL database
● Anaconda
Feasibility Study
All projects are feasible given unlimited resources and infinite time.
Unfortunately, the development of a computer-based system is more likely to be
plagued by a scarcity of resources and difficult delivery dates. It is both
necessary and prudent to evaluate the feasibility of the project at the earliest
possible time. Months or years of effort, financial loss and untold professional
embarrassment can be avoided if the project is better understood at the study
stage.

This type of study determines whether an application should be developed. Once
it has been determined that the application is feasible, the analyst can go
ahead and prepare the project specification, which finalizes the project
requirements. Feasibility studies are undertaken within tight time constraints.
The following types of feasibility are considered:-

1. Technical Feasibility
2. Economic Feasibility

Technical Feasibility:-
Technical feasibility is concerned with specifying the equipment and software
that will successfully satisfy the user requirements. The technical needs of the
system may vary considerably, but might include how many workstations are
required and how these units are interconnected so that they can operate and
communicate smoothly.

Economic Feasibility:-
Economic analysis is the most frequently used technique for evaluating the
effectiveness of the proposed system. More commonly known as cost/benefit
analysis, the procedure is to determine the benefits and savings that are
expected from the proposed system and compare them with the costs.
Methodology
Getting the Data and Preprocessing:-
The dataset used in this project comes from MySQL database tables, where the
data was collected and stored. Each entry represents aggregate information about
14 features of homes from various locations.

Data Exploration:-
In the first section of the project, we make an exploratory analysis of the
dataset and provide some observations.

Feature Observation:-
Data science is the process of making assumptions and hypotheses about the data
and testing them. We start from two hypotheses. Houses with more rooms will be
worth more: houses with more rooms are usually bigger and can fit more people,
so it is reasonable that they cost more. Neighbourhoods with more lower-class
workers will be worth less: if the percentage of lower working class people is
higher, it is likely that they have low purchasing power and therefore their
houses will cost less.

Exploratory Data Analysis:-
We will start by creating a scatter plot matrix that will allow us to visualize
the pair-wise relationships and correlations between the different features.
It is also quite useful to have a quick overview of how the data is distributed.
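
The numeric counterpart of a scatter plot matrix is a matrix of pairwise
correlations. The following sketch computes Pearson correlations in plain
Python; plotting is left out to keep it dependency-free. The feature names
(rooms, lower_status_pct, price) and the values are hypothetical and only
mirror the two hypotheses stated under Feature Observation.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric lists.

    Assumes neither list is constant (otherwise the denominator is zero).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def correlation_matrix(columns):
    """Return {name_a: {name_b: correlation}} for every pair of columns."""
    return {
        a: {b: pearson(columns[a], columns[b]) for b in columns}
        for a in columns
    }


# Made-up sample: more rooms go with higher prices, a higher share of
# lower-status residents goes with lower prices.
data = {
    "rooms": [3, 4, 5, 6, 7, 8],
    "lower_status_pct": [30, 25, 22, 15, 12, 5],
    "price": [150, 190, 240, 300, 330, 410],
}

corr = correlation_matrix(data)
for name, row in corr.items():
    print(name, {other: round(value, 2) for other, value in row.items()})
```

A value close to +1 or -1 indicates a strong linear relationship and therefore
a feature that a linear model can exploit; values near 0 suggest the feature
adds little on its own.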

Developing a Model:-
We develop the tools and techniques necessary for a model to make a prediction.
Being able to evaluate each model’s performance accurately with these tools and
techniques greatly reinforces confidence in the predictions.
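
The report does not name the evaluation metric. Two common choices for
regression, shown here only as an assumption, are the root mean squared error
(RMSE) and the coefficient of determination (R²). The sample values are made
up.

```python
from math import sqrt


def rmse(y_true, y_pred):
    """Root mean squared error: typical size of a prediction error."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return sqrt(sum(squared) / len(squared))


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1.0 is a perfect fit, 0.0 is no better
    than always predicting the mean of y_true."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Made-up actual and predicted prices; every prediction is off by 10.
y_true = [200, 250, 300, 350]
y_pred = [210, 240, 310, 340]

print(rmse(y_true, y_pred))
print(r2_score(y_true, y_pred))
```

RMSE is expressed in the same unit as the price, which makes it easy to
interpret, while R² is unit-free and convenient for comparing models.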

Cross-Validation:-
K-fold cross-validation is a technique for checking that our model is well
trained without using the test set. It consists of splitting the data into k
partitions of equal size. For each partition i, we train the model on the
remaining k-1 partitions and evaluate it on partition i. The final score is the
average of the k scores obtained.
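
The procedure can be sketched in plain Python as follows. This is a minimal
illustration under stated assumptions: the model is a one-variable linear
regression, the score is the mean squared error, the folds are contiguous (no
shuffling), and all function names and sample values are made up.

```python
def k_fold_indices(n, k):
    """Yield (train_idx, test_idx) for k contiguous folds of near-equal size."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n))
        yield train_idx, test_idx
        start += size


def fit_line(xs, ys):
    """Least-squares fit of y = k0 + k1 * x; returns (k0, k1)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    k1 = sxy / sxx
    return mean_y - k1 * mean_x, k1


def cross_val_mse(xs, ys, k):
    """Average mean squared error over k folds."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(xs), k):
        # Train on the k-1 remaining partitions.
        k0, k1 = fit_line([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        # Evaluate on the held-out partition.
        errors = [(ys[i] - (k0 + k1 * xs[i])) ** 2 for i in test_idx]
        scores.append(sum(errors) / len(errors))
    return sum(scores) / len(scores)


# Made-up sample that lies exactly on price = 10 + 3 * area.
areas = [50, 60, 70, 80, 90, 100]
prices = [10 + 3 * a for a in areas]

print([test for _, test in k_fold_indices(len(areas), 3)])
print(cross_val_mse(areas, prices, 3))
```

In practice the rows are usually shuffled before splitting so that each fold is
representative of the whole dataset.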
Facilities Required for Proposed Work

People looking to buy a new home tend to be conservative with their budgets and
market strategies. The existing system calculates house prices without the
necessary prediction of future market trends and price increases. The goal of
the project is to predict house prices efficiently for real estate customers
with respect to their budgets and priorities. Future prices will be predicted by
analyzing previous market trends and price ranges, as well as upcoming
developments. The project involves a website which accepts the customer’s
specifications and then applies the multiple linear regression algorithm of data
mining. This application will help customers invest in an estate without
approaching an agent. It also decreases the risk involved in the transaction.

In today’s real estate world, it has become tough to store such huge volumes of
data and extract them for one’s own requirements. The extracted data should also
be useful. The system makes optimal use of the Linear Regression Algorithm and
uses such data in the most efficient way. The linear regression algorithm helps
to satisfy customers by increasing the accuracy of estate choice and reducing
the risk of investing in an estate. Many features could be added to make the
system more widely acceptable. One major item of future scope is adding estate
databases for more cities, which will allow the user to explore more estates and
reach a more accurate decision. More factors that affect house prices, such as
recession, will be added. In-depth details of every property will be added to
provide ample information about a desired estate. This will help the system to
run at a larger scale.
