
Stock Market Portfolio Optimization

CHAPTER 1

INTRODUCTION
1.1 Overview

The stock market plays a very important role in the economic growth of a developing
country such as India, and a developing nation's growth may depend on the performance
of its stock market. When the market rises, economic growth tends to be high; when it
falls, growth tends to slow. In other words, a country's growth is closely tied to the
performance of its stock market. In any country, only 10% of the people invest in the
stock market, because of its dynamic nature. There is also a misconception that buying
and selling shares is an act of gambling. This misconception can be corrected by raising
awareness among the public.

The attempt to solve the investor's problem of what to buy and when to buy led, in
the early period of stock market study, to the emergence of two distinct schools of
thought on security valuation and stock price behaviour. They are popularly referred to
as Fundamental Analysis and Technical Analysis. Recent advances in computer science and
the accessibility of the Internet have led to the use of newer methods for stock market
prediction. The ARIMA model is one of the standard models for predicting the future
direction of a time series. The Recurrent Neural Network (RNN) is a model that draws on
techniques from computer science.

Fundamental analysts maintain that at any instant an individual security has an
intrinsic value, which should be equal to the present value of the future stream of
income from that security discounted at an appropriate risk-related rate of interest. The
actual price of a security is considered to be a function of a set of anticipated returns
and anticipated capitalization rates. The real worth of a security is estimated by
considering key economic and financial variables such as earnings, dividends, growth in
earnings, capital structure, and the size of the company.
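
The definition of intrinsic value above can be written as V = I_1/(1+r) + I_2/(1+r)^2 + ... + I_n/(1+r)^n, where I_t is the income expected in period t and r is the risk-related rate of interest. The following is a minimal illustrative sketch of that sum and is not part of the project code; the function name intrinsic_value and the income figures are hypothetical.

```python
def intrinsic_value(expected_income, discount_rate):
    """Present value of a stream of future income.

    expected_income[t - 1] is the income expected at the end of period t
    (hypothetical analyst estimates); discount_rate is the risk-related
    rate of interest, e.g. 0.05 for 5%.
    """
    return sum(
        cash / (1.0 + discount_rate) ** period
        for period, cash in enumerate(expected_income, start=1)
    )


# Hypothetical security: pays 5.0 for two periods, then 105.0 in period 3.
value = intrinsic_value([5.0, 5.0, 105.0], discount_rate=0.05)
print(round(value, 2))  # 100.0
```

With a 5% income stream discounted at 5%, the present value equals the final repayment of 100, which is a quick sanity check on the formula.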

Dept. of I.S.E., S.C.E. 2018-19 1



Predicting the stock market has been the bane and goal of investors since its
inception. Every day, billions of dollars are traded on the exchange, and behind each dollar
is an investor hoping to profit in one way or another. Entire companies rise and fall daily
based on the behaviour of the market. Should an investor be able to accurately predict
market movements, it offers a tantalizing promise of wealth and influence. It is no wonder
then that the Stock Market and its associated challenges find their way into the public
imagination every time it misbehaves. The 2008 financial crisis was no different, as
evidenced by the flood of films and documentaries based on the crash. If there was a
common theme among those productions, it was that few people knew how the market
worked or reacted.

Forecasting techniques play an important role in the stock market: they can uncover
hidden patterns and improve prediction accuracy where traditional statistical methods
fall short. The huge amount of data generated by stock markets has pushed researchers
to apply forecasting techniques when making investment decisions.

The credit of originating the concept of investment value goes to John B. Williams
[1938], who has also presented an actual formula for determining the intrinsic value of
stocks. However, the concept of intrinsic value was popularized by B. Graham and D.
Dodd [1934] in their classic book 'Security Analysis'. Many researchers have suggested
further development in the theory of intrinsic value.

Fundamentalists forecast stock prices on the basis of the economic condition of the
industry and company statistics. In a major study covering the period 1927-1960, King
[1966] found that about 50% of the variance of the average stock returns is explained
by overall market factors. In another study over the period January 1966 to June 1970,
Livingston [1977] observed that approximately 23% of the variance in stock returns was
accounted for by the market effect. Elton and Gruber [1987] also reported that about 25%
to 50% of the variation in a company's annual earnings is due to the state of the overall
economy.


1.2 Problem Statement


Investors are familiar with the saying, “buy low, sell high”, but this does not
provide enough context to make proper investment decisions. Before investing in any
stock, an investor needs to be aware of how the stock market behaves. Investing in a good
stock at a bad time can have disastrous results, while investing in a mediocre stock
at the right time can yield profits. Today's financial investors face this trading
problem because they do not properly understand which stocks to buy or which stocks to
sell in order to get optimum profits. Predicting the long-term value of a stock is
relatively easier than predicting it on a day-to-day basis, as stocks fluctuate rapidly
every hour based on world events.

1.3 Objectives

• Over the past decades, there has been increasing interest in predicting markets among
economists, policymakers, academics and market makers. The objective of the
proposed work is to study and improve supervised learning algorithms to predict
the stock price.
• The system must be able to access a list of historical prices. It must calculate the
estimated price of a stock based on the historical data.
• To predict the approximate share price using a Recurrent Neural Network.
• To provide analysis for users through a user interface.

1.4 Limitations
• Handling of time series data in neural networks is very complicated.
• Predicting the stock market with time series analysis techniques requires large
volumes of data for training.
• The performance studies neglect some important features of the financial markets, such
as transaction costs and limited volume at given prices.
• Non-quantifiable factors such as natural disasters, changes in the company board,
and company mergers cannot be considered.


1.5 Organization of the Report

This report describes Stock Market Portfolio Optimization using a Recurrent Neural
Network. The report is organized into chapters covering the introduction, literature
survey, analysis, design, implementation, testing, and lastly conclusion and future
enhancements.

Chapter 1 gives a brief description of the stock market, how it works, and the
models that can be used to predict the stock price. It also covers the problem statement,
objectives, and limitations.

Chapter 2 presents the literature survey, covering the works that were referred to in
developing the model.

Chapter 3 deals with the analysis part of development. Details of existing system,
its drawbacks, the proposed system, its advantages, the functional and non-functional as
well as the hardware and software requirements are specified.

Chapter 4 specifies the design details. Design is the process of establishing a
system that will satisfy the previously identified functional and non-functional
requirements. The chapter presents the system block diagram (the architecture) and
various diagrams such as the use case diagram, the sequence diagram, and the activity
diagram.

Chapter 5 covers the implementation. Implementation is the process of converting
the system design into an operational system. This phase starts after the completion of
the development phase and must be carefully planned and controlled, as it is a key stage.
The chapter includes a list of the main packages, some of the user-defined functions, and
some sample code.

Chapter 6 covers testing, which is an investigation conducted to provide
stakeholders with information about the quality of the product or service under test. It
also gives a business an opportunity to understand the risks of software implementation.
Test techniques include, but are not limited to, executing a program or application with
the intent of finding software bugs.


CHAPTER 2

LITERATURE SURVEY
I. Svalina et al. [1]: The stock market is one of the important parts of a country's
economy. It is the most important way for companies to raise capital. Not only
investors but also the general public find it useful as an investment tool. As the stock
market heavily influences both individual and national economies, predicting its future
values is an essential task when deciding whether to buy or sell a share [3]. However, it
has been very difficult to predict stock price trends [14] efficiently because many
factors, such as economics, politics, and the environment, act as deciding parameters.

The Adaptive Network-Based Fuzzy Inference System (ANFIS) has been used for
stock prediction on the Istanbul Stock Exchange [2]. The work in [1] also uses an
ANFIS-based model for stock price prediction. A three-stage stock market prediction
system is introduced in [13]. An integrated system is presented in [5], in which wavelet
transforms and a recurrent neural network (RNN) based on the artificial bee colony (ABC)
algorithm (called ABC-RNN) are combined for stock price forecasting. A review of data
mining techniques used for this purpose is presented in [10].

M. A. Boyacioglu et al. [2]: The financial market is a complex, evolutionary, and
non-linear dynamical system. The field of financial forecasting is characterized by data
intensity, noise, non-stationarity, unstructured nature, a high degree of uncertainty, and
hidden relationships [2]. Many factors interact in finance, including political events,
general economic conditions, and traders’ expectations. Therefore, predicting financial
market price movements is quite difficult. Increasingly, according to academic
investigations, movements in market prices are not random. Rather, they behave in a
highly non-linear, dynamic manner. The standard random walk assumption of futures
prices may merely be a veil of randomness that shrouds a noisy non-linear process.


Support vector machine (SVM) is a very specific type of learning algorithm
characterized by the capacity control of the decision function, the use of kernel
functions and the sparsity of the solution [6–8]. Established on the unique theory of the
structural risk minimization principle, which estimates a function by minimizing an upper
bound of the generalization error, SVM is shown to be very resistant to the over-fitting
problem, eventually achieving high generalization performance. Another key property
of SVM is that training an SVM is equivalent to solving a linearly constrained quadratic
programming problem, so that the solution of SVM is always unique and globally optimal,
unlike neural network training, which requires nonlinear optimization with the danger of
getting stuck at local minima.

E. F. Fama et al. [3]: Prediction of stock prices is a very challenging and
complicated process because price movement behaves like a random walk and varies over
time. In recent years, various researchers have used intelligent methods and techniques
in the stock market for trading decisions. Here, we present a brief review of some of the
significant research. A. Sheta [7] used the Takagi-Sugeno (TS) technique to develop
fuzzy models for two nonlinear processes: software effort estimation for NASA software
projects and prediction of the next week's S&P 500 index.

The development of the TS fuzzy model can be achieved in two steps: 1)
determination of the membership functions in the rule antecedents using the model
input data; 2) estimation of the consequence parameters. Least-squares estimation was
used to estimate these parameters, and the results were promising. M. H. Fazel Zarandi
et al. [8] developed a type-2 fuzzy rule-based expert system for stock price analysis.
The interval type-2 fuzzy logic system made it possible to model rule uncertainties, and
every membership value of an element was itself an interval. The proposed type-2 fuzzy
model used technical and fundamental indexes as input variables. S. Abdulsalami
Sulaiman Olaniyi et al. [11] proposed a linear regression method for analysing the
coupled behaviour of stocks in the market.

T.-J. Hsieh et al. [4]: The modelling functions of neural networks are being
applied to a widely expanding range of applications in addition to traditional areas
such as pattern recognition and control. Their non-linear learning and smooth
interpolation capabilities give neural networks an edge over standard computers and
expert systems for solving certain problems. Accurate stock market prediction is one
such problem. Several mathematical models have been developed, but the results have been
unsatisfactory. This application was chosen as a means to check whether neural networks
could produce a successful model in which their generalization capabilities could be used
for stock market prediction.

Fujitsu and Nikko Securities are working together to develop a buying and selling
prediction system for TOPIX. The input consists of several technical and economic
indexes. In this system, several modular neural networks learned the relationships
between the past technical and economic indexes and the timing for when to buy and sell.
The prediction system made up of modular neural networks was found to be accurate.
Simulation of buying and selling stocks using the prediction system showed an
excellent profit. Stock price fluctuation factors could be extracted by analysing the
networks.


CHAPTER 3

ANALYSIS

3.1 Existing System

3.1.1 Description

Linear regression is widely used throughout finance in a plethora of applications.
It is a method used to model the relationship between a dependent variable (y) and an
independent variable (x). With simple linear regression, there is only one independent
variable x; models with several independent variables fall under the category of
multiple linear regression. In this case, we have only one independent variable, which
is the date. The date is represented by an integer starting at 1 for the first date and
going up to the length of the vector of dates, which can vary depending on the time
series data. Our dependent variable, of course, is the price of a stock.
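
The set-up described above, with the date encoded as an integer index starting at 1 and the price as the dependent variable, can be sketched as an ordinary least squares fit. This is an illustrative sketch only and not the code of the existing system; the function names fit_line and predict_price and the price series are hypothetical.

```python
def fit_line(prices):
    """Least squares fit of price (y) against a 1-based date index (x)."""
    n = len(prices)
    if n < 2:
        raise ValueError("need at least two prices to fit a line")
    xs = range(1, n + 1)  # 1 for the first date, n for the last
    mean_x = sum(xs) / n
    mean_y = sum(prices) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, prices))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_price(slope, intercept, day_index):
    """Price predicted by the fitted line for a given date index."""
    return intercept + slope * day_index


closing_prices = [10.0, 12.0, 14.0, 16.0]  # made-up closing prices
slope, intercept = fit_line(closing_prices)
next_price = predict_price(slope, intercept, len(closing_prices) + 1)
print(slope, intercept, next_price)  # 2.0 8.0 18.0
```

Because the model is a single straight line, every forecast simply continues the fitted slope, which is the limitation discussed in the drawbacks.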

3.1.2 Drawbacks

The accuracy of prediction by linear regression is not high enough to support a
good decision on stock trading. Linear regression is limited to linear relationships:
the algorithm assumes that the system follows a straight line. In stock trading, however,
the values can rise, drop or remain constant, and the data values are scattered and
fluctuating. Apart from that, linear regression is not a complete description of the
relationships among variables. It only models the relationship between the mean of the
dependent variable and the independent variable, which does not suit the situation
encountered in the stock market. Hence, the prediction is limited by this constraint.


3.2 Proposed System

3.2.1 Description

Analysts making forecasts often have extensive domain knowledge about the
quantity they are forecasting, but limited statistical knowledge. In the Prophet model
specification, there are several places where analysts can alter the model to apply their
expertise and external knowledge without requiring any understanding of the underlying
statistics.

• Capacities: Analysts may have external data for the total market size and can
apply that knowledge directly by specifying capacities.
• Changepoints: Known dates of changepoints, such as dates of product changes, can
be directly specified.
• Holidays and seasonality: Analysts that we work with have experience with which
holidays impact growth in which regions, and they can directly input the relevant
holiday dates and the applicable time scales of seasonality.
• Smoothing parameters: By adjusting τ an analyst can select from within a range of
more global or locally smooth models.

The seasonality and holiday smoothing parameters allow the analyst to tell the
model how much of the historical seasonal variation is expected in the future. With good
visualization tools, analysts can use these parameters to improve the model fit. When the
model fit is plotted over historical data, it is quickly apparent if there were changepoints
that were missed by the automatic changepoint selection.
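
As a rough sketch of how these analyst inputs map onto the library, the helper below collects them into keyword arguments for the Prophet constructor. The helper name prophet_settings and the example date are hypothetical. The keyword names (growth, changepoints, changepoint_prior_scale, seasonality_prior_scale, holidays_prior_scale) are Prophet constructor arguments as we understand them and should be checked against the installed version. Holidays are not shown here, since Prophet takes them as a data frame with holiday and ds columns.

```python
def prophet_settings(changepoint_dates=None, trend_flexibility=0.05,
                     seasonality_strength=10.0, holiday_strength=10.0,
                     has_capacity=False):
    """Collect analyst choices as keyword arguments for Prophet (sketch)."""
    settings = {
        # Capacities: a logistic trend also needs a 'cap' column in the data.
        "growth": "logistic" if has_capacity else "linear",
        # Smoothing of the trend (tau): larger values give a more flexible,
        # locally smooth trend; smaller values a more global one.
        "changepoint_prior_scale": trend_flexibility,
        # Smoothing parameters for the seasonality and holiday effects.
        "seasonality_prior_scale": seasonality_strength,
        "holidays_prior_scale": holiday_strength,
    }
    if changepoint_dates:
        # Changepoints: dates the analyst already knows about.
        settings["changepoints"] = list(changepoint_dates)
    return settings


# Hypothetical example: one known changepoint and a more flexible trend.
settings = prophet_settings(changepoint_dates=["2016-06-24"],
                            trend_flexibility=0.2)
# With the library installed: model = fbprophet.Prophet(**settings)
```

Keeping the choices in a plain dictionary makes it easy to record which settings produced a given forecast.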

3.2.2 Advantages

• Our analyst-in-the-loop modeling approach is an alternative approach that
attempts to blend the advantages of statistical and judgmental forecasts by
focusing analyst effort on improving the model when necessary rather than directly
producing forecasts through some unstated procedure.
• We find that our approach closely resembles the “transform-visualize-model” loop
proposed by Wickham & Grolemund (2016), where the human domain knowledge
is codified in an improved model after some iteration. Typical scaling of
forecasting would rely on fully automated procedures, but judgmental forecasts
have been shown to be highly accurate in many applications (Lawrence et al.
2006).
• Our proposed approach lets analysts apply judgment to forecasts through a small
set of intuitive model parameters and options, while retaining the ability to fall
back on fully automated statistical forecasting when necessary.

3.3 Requirements Specifications

The direct result of requirements analysis is the requirements specification.
Hardware requirements specifications list the necessary hardware for the proper
functioning of the project. A software requirements specification is a description of a
software system to be developed, laying out functional and non-functional requirements,
and may include a set of use cases that describe the interactions the users will have
with the software. In software engineering, a functional requirement defines a function
of a system and its components. A function is described as a set of inputs, the
behaviour, and outputs. A non-functional requirement specifies the criteria that can be
used to judge the operation of a system, rather than specific behaviour.

3.3.1 Functional Requirements

The functional requirements specification documents the operations and activities
that a system must be able to perform.

Functional requirements include:

• Descriptions of how data is collected and stored.
• Descriptions of data cleaning and pre-processing methods.
• Descriptions of work-flows performed by the system.
• Descriptions of outputs.
• How the system meets applicable regulatory requirements.

3.3.2 Non-Functional Requirements

The non-functional requirement specifies the criteria that can be used to judge the
operation of a system, rather than specific behaviours.


• User Interfaces: The external users are the clients. All the clients can use
this software for indexing and searching.
• Hardware Interfaces: The external hardware interface used for indexing
and searching is the personal computers of the clients. The PCs may be laptops
with wireless LAN, as the internet connections provided will be wireless.
• Software Interfaces: The operating system can be any version of
Windows.
• Performance Requirements: The PCs used must have at least an i5
processor so that the product can deliver optimum performance.

3.3.3 Hardware Requirements

Hardware requirements specifications list the necessary hardware for the
proper functioning of the project.

• Hard Disk : 10 GB
• RAM : 8 GB
• Processor : Multicore processor, i5-i7
• Processor speed : 2.6 GHz and above

3.3.4 Software Requirements

A software requirements specification is a description of a software system to be
developed, laying out functional and non-functional requirements, and may include a set
of use cases that describe the interactions the users will have with the software.

• Operating System : Windows 10 or Linux
• Programming Language : Python 3.7
• IDE : Jupyter Notebook
• APIs : Quandl


CHAPTER 4
DESIGN

4.1 SYSTEM ARCHITECTURE


Figure 4.1 gives the overall system architecture of our project. Data is extracted
from the global dataset and undergoes data pre-processing. The filtered data is then
sent to the Prophet model, which produces the test results.

[Block diagram with the components: Global Dataset, Quandl API, Data Frame, Data
Pre-processing, Testing Set, Prophet Model, Test Results]

Figure 4.1 System Architecture


4.2 USE CASE DIAGRAM


In Figure 4.2, the actor is the user, represented by a stick figure. The use case
diagram shows the interaction between the user and the system. The user inputs the date
from which the stock market attributes are fetched from the Quandl API. The user can
also input the prediction range.

Figure 4.2 Use Case Diagram

4.3 PROPHET MODEL


Prophet is optimized for the business forecast tasks that were encountered at
Facebook. Prophet’s default settings are chosen to produce forecasts that are often as
accurate as those produced by skilled forecasters, with much less effort. With Prophet,
you are not stuck with the results of a completely automatic procedure if the forecast is
not satisfactory: an analyst with no training in time series methods can improve or tweak
forecasts using a variety of easily interpretable parameters. We have found that by
combining automatic forecasting with analyst-in-the-loop forecasts for special cases, it
is possible to cover a wide variety of business use cases. The following diagram
illustrates the forecasting process we have found to work at scale.


Figure 4.3 Prophet Model

4.4 SEQUENCE DIAGRAM

A sequence diagram is structured so that it represents a timeline that begins at
the top and descends gradually to mark the sequence of interactions. Each object has a
column, and the messages exchanged between them are represented by arrows.

As illustrated in Figure 4.4, the uncleaned training data is sent to the data
processing module, which produces parsed, processed data. Machine learning techniques
are then applied and the data is sent to the training model. The results are predicted
and, with the help of the Matplotlib library, the product attribute graph is obtained.


Figure 4.4 Sequence Diagram

4.5 ACTIVITY DIAGRAM

Activity diagrams are graphical representations of workflows of stepwise
activities and actions with support for choice, iteration and concurrency. In the Unified
Modelling Language, activity diagrams are intended to model both computational and
organizational processes as well as the data flows intersecting with the related
activities. Although activity diagrams primarily show the overall flow of control, they can
also include elements showing the flow of data between activities through one or more
data stores.


Figure 4.5 Activity Diagram


CHAPTER 5

IMPLEMENTATION

5.1 Main Modules

The main modules are:

• Data Collection
• Data Pre-processing
• Normalization
• Model Fitting
• Testing/Validation

5.1.1 Data Collection

Data collection is the process of gathering and measuring information on targeted
variables in an established system, which then enables one to answer relevant questions
and evaluate outcomes. Data collection is a component of research in all fields of study
including physical and social sciences, humanities, and business. While methods vary by
discipline, the emphasis on ensuring accurate and honest collection remains the same. The
goal for all data collection is to capture quality evidence that allows analysis to lead to the
formulation of convincing and credible answers to the questions that have been posed.

Quandl's data products come in many forms and contain various objects, including
time-series and tables. Through its APIs and various tools (R, Python, Excel, etc.), users
can access the premium data to which they have subscribed. (Quandl's free data can be
accessed by anyone who has registered for an API key.)


5.1.2 Data Pre-processing

Data pre-processing is a data mining technique that transforms raw data into an
understandable format. Raw (real-world) data is often incomplete and cannot be passed
directly to a model, as that would cause errors. That is why data must be pre-processed
before it is sent through a model.

5.1.2.1 Steps in Data Pre-processing


1. Import libraries
2. Read data
3. Check for missing values
4. Check for categorical variables
5. Standardize the data
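
Steps 3 and 5 above can be sketched as follows. This is an illustrative sketch using only the Python standard library, not the project's pre-processing code. Forward filling is an assumed way of handling missing values (the report only states that they are checked), and the function names and sample values are hypothetical.

```python
from statistics import mean, pstdev


def count_missing(values):
    """Step 3: number of missing entries, represented here as None."""
    return sum(1 for v in values if v is None)


def forward_fill(values):
    """Replace each gap with the last known value; leading gaps are dropped."""
    filled, last = [], None
    for v in values:
        if v is None:
            if last is None:
                continue  # nothing to carry forward yet
            v = last
        filled.append(v)
        last = v
    return filled


def standardize(values):
    """Step 5: rescale to zero mean and unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


raw = [10.0, None, 14.0, 12.0]  # made-up closing prices with one gap
print(count_missing(raw))       # 1
clean = forward_fill(raw)       # [10.0, 10.0, 14.0, 12.0]
scaled = standardize(clean)
```
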

5.1.3 Normalization
Normalization is a technique often applied as part of data preparation for machine
learning. The goal of normalization is to change the values of numeric columns in the
dataset to a common scale, without distorting differences in the ranges of values. Not
every dataset requires normalization for machine learning. It is required only when
features have different ranges.
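
One common way to bring columns to a common scale is min-max scaling, sketched below. The report does not state which normalization method the project uses, so this choice is an assumption, and the function name min_max_scale and the sample columns are hypothetical.

```python
def min_max_scale(values):
    """Rescale values linearly so the smallest maps to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant column: no spread to preserve
    return [(v - lo) / (hi - lo) for v in values]


# Two made-up features with very different ranges.
prices = [100.0, 150.0, 200.0]
volumes = [1_000_000, 3_000_000, 2_000_000]

print(min_max_scale(prices))   # [0.0, 0.5, 1.0]
print(min_max_scale(volumes))  # [0.0, 1.0, 0.5]
```

After scaling, both features lie in the range 0 to 1, so neither dominates purely because of its units.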

5.1.4 Model Fitting


Keras is a library that allows us to build state-of-the-art models in a few
lines of understandable Python code. Although other neural network libraries may be
faster or allow more flexibility, Keras is hard to beat for development time and ease of
use.
With the training and validation data prepared and the model built, the model is then
fitted to the training data.

5.1.5 Testing/Validation
Test Dataset: The sample of data used to provide an unbiased evaluation of a final
model fit on the training dataset. The test dataset provides the gold standard used to
evaluate the model. It is only used once a model is completely trained (using the train and
validation sets).

Validation Dataset: The sample of data used to provide an unbiased evaluation of
a model fit on the training dataset while tuning model hyperparameters. The evaluation
becomes more biased as skill on the validation dataset is incorporated into the model
configuration.
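
For time series such as stock prices, the three datasets are usually taken as consecutive blocks in date order so that later data is never used to fit a model evaluated on earlier data. The sketch below illustrates this; the 70/15/15 proportions and the function name chronological_split are assumptions and are not taken from the project.

```python
def chronological_split(rows, train_frac=0.7, val_frac=0.15):
    """Split date-ordered rows into train, validation and test blocks."""
    if not 0 < train_frac < 1 or not 0 <= val_frac < 1:
        raise ValueError("fractions must lie between 0 and 1")
    if train_frac + val_frac >= 1:
        raise ValueError("no rows would be left for the test set")
    n = len(rows)
    train_end = round(n * train_frac)
    val_end = train_end + round(n * val_frac)
    # Slices keep the original order: oldest rows train, newest rows test.
    return rows[:train_end], rows[train_end:val_end], rows[val_end:]


days = list(range(1, 21))  # stand-in for 20 trading days in date order
train, val, test = chronological_split(days)
print(len(train), len(val), len(test))  # 14 3 3
```
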

5.2 Main Packages

Some of the main packages used in this project are as mentioned below:

1. Fbprophet
Prophet follows the sklearn model API. An instance of the Prophet class is created,
and then its fit and predict methods are called.
2. pytrends
Unofficial API for Google Trends (fork). It provides a simple interface for automating
the downloading of reports from Google Trends. Its main feature is to allow the script to
log in to Google on your behalf to enable a higher rate limit.
3. Pandas
Pandas is a software library written for the Python programming language for data
manipulation and analysis. In particular, it offers data structures and operations for
manipulating numerical tables and time series.
4. NumPy
NumPy is a Python package for scientific computing. It provides N-dimensional
arrays and numerical operations on them.
5. from pytrends.request import TrendReq
TrendReq is used to connect to Google.
6. matplotlib.pyplot as plt
Matplotlib is a Python 2D plotting library which produces publication quality figures
in a variety of hardcopy formats and interactive environments across platforms.


7. Quandl
The Quandl API is used to retrieve the stock data, and it can also deliver more complex
datasets. For example, a single call can return the quarterly percentage change in AAPL
stock between 1985 and 1997, closing prices only, in JSON format.
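
A request of the kind described for Quandl can be sketched by building the query URL. The sketch only constructs the URL and makes no network call. The base URL and the parameter names (column_index, start_date, end_date, collapse, transform) follow the Quandl time-series API as we recall it and should be treated as assumptions; the exact dates are illustrative, an api_key parameter would normally be added, and the WIKI database may no longer be served. The function name quandl_dataset_url is hypothetical.

```python
from urllib.parse import urlencode


def quandl_dataset_url(database, dataset, fmt="json", **params):
    """Build a time-series request URL (sketch; base URL is an assumption)."""
    base = "https://www.quandl.com/api/v3/datasets"
    url = f"{base}/{database}/{dataset}.{fmt}"
    query = urlencode(sorted(params.items()))
    return f"{url}?{query}" if query else url


url = quandl_dataset_url(
    "WIKI", "AAPL",
    column_index=4,           # closing price column (assumed position)
    start_date="1985-05-01",  # illustrative dates within 1985 to 1997
    end_date="1997-07-01",
    collapse="quarterly",     # one observation per quarter
    transform="rdiff",        # row-on-row percentage change
)
print(url)
```
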

5.3 Main User Defined Functions

A User-Defined Function (UDF) is a function provided by the user of a program or
environment, in a context where the usual assumption is that functions are built into
the program or environment. A user-defined function is a programmed routine that has its
parameters set by the user of the system. The user-defined functions used are listed
below.

Function Name     create_prophet_model()
Syntax            create_prophet_model(self, days=0, resample=False)
Description       Method to fit the dataframe to an fbprophet forecasting
                  model, predict future values for ‘days’ number of days and
                  plot the values.
Parameters        days, resample
Called Function   create_model(), fit(), predict()
Return Value      model, predicted data

Table 5.1 Function create_prophet_model () details

Function Name     evaluate_prediction()
Syntax            evaluate_prediction(self, start_date=None, end_date=None,
                  nshares=None)
Description       Method to evaluate the accuracy of the predictions of the
                  model.
Parameters        start_date, end_date, nshares
Called Function   make_future_dataframe(), fit(), predict()
Return Value      void

Table 5.2 Function evaluate_prediction () details
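
The report does not list the measures computed by evaluate_prediction(). As an illustration only, two simple measures of prediction quality are sketched below: the mean absolute error and the share of days on which the predicted direction of movement matches the actual one. The function names and the sample figures are hypothetical and are not taken from the project code.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted prices."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("series must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def direction_accuracy(actual, predicted):
    """Share of days where predicted and actual prices move the same way."""
    if len(actual) != len(predicted) or len(actual) < 2:
        raise ValueError("need two equally long series of at least 2 points")
    hits = 0
    for i in range(1, len(actual)):
        actual_up = actual[i] > actual[i - 1]
        predicted_up = predicted[i] > predicted[i - 1]
        hits += actual_up == predicted_up
    return hits / (len(actual) - 1)


actual = [100.0, 102.0, 101.0, 105.0]     # made-up observed prices
predicted = [101.0, 103.0, 104.0, 106.0]  # made-up model output

print(mean_absolute_error(actual, predicted))  # 1.5
print(direction_accuracy(actual, predicted))   # 2 of 3 moves match
```
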


Function Name     changepoint_prior_analysis()
Syntax            changepoint_prior_analysis(self, changepoint_priors=[0.001,
                  0.05, 0.1, 0.2], colors=['b', 'r', 'grey', 'gold'])
Description       Method to analyse the effect of different changepoint values
                  and plot a graph to depict the same.
Parameters        changepoint_priors, colors
Called Function   make_future_dataframe(), fit(), predict()
Return Value      void

Table 5.3 Function changepoint_prior_analysis() details

Function Name     buy_and_hold()
Syntax            buy_and_hold(self, start_date=None, end_date=None,
                  nshares=1)
Description       Predicts the value of ‘n’ number of shares.
Parameters        start_date, end_date, nshares
Called Function   handle_dates()
Return Value      void

Table 5.4 Function buy_and_hold() details

Function Name     predict_future()
Syntax            predict_future(self, days=30)
Description       Predicts the future values for ‘days’ number of days and
                  plots a graph for the same.
Parameters        days
Called Function   create_model(), fit(), predict()
Return Value      void

Table 5.5 Function predict_future() details


Function Name     changepoint_prior_validation()
Syntax            changepoint_prior_validation(self, start_date=None,
                  end_date=None, changepoint_priors=[0.001, 0.05, 0.1,
                  0.2])
Description       Evaluates the prediction errors and accuracy.
Parameters        start_date, end_date, changepoint_priors
Called Function   make_future_dataframe(), fit(), predict()
Return Value      void

Table 5.6 Function changepoint_prior_validation () details

5.4 Main Built-In Functions

A built-in function is a function that is built into an application or library and can
be accessed by end-users. Some of the built-in functions used in the code are:

Function name     get()
Syntax            quandl.get('%s/%s' % (exchange, ticker))
Description       Imports the data from the Quandl repository
Parameters        exchange, ticker
Return value      dataframe

Table 5.7 Function get () details

Function name     Prophet()
Syntax            model = fbprophet.Prophet()
Description       Creates an fbprophet model
Parameters        none
Return value      model

Table 5.8 Function Prophet () details


Function name     fit()
Syntax            model.fit(train)
Description       Fits the fbprophet model to the ‘train’ dataframe.
Parameters        dataframe for training
Return value      ----

Table 5.9 Function fit () details

Function name make_future_dataframe()


Syntax model.make_future_dataframe(periods=180, freq='D')
Description Make dataframe with future dates fro forecasting
Parameters periods, freq
Return value dataframe

Table 5.10 Function make_future_dataframe () details

Function name predict()


Syntax model.predict(future)
Description Predict the values for ‘future’ given a model ‘model’
Parameters dataframe
Return value dataframe

Table 5.11 Function predict () details

5.5 Sample Code


file.py

import fbprophet
import pytrends
import pandas as pd
import numpy as np
from pytrends.request import TrendReq
import matplotlib.pyplot as plt
import matplotlib
import quandl

class Stocker():

    # Initialization requires a ticker symbol
    def __init__(self, ticker, exchange='WIKI'):

        # Enforce capitalization
        ticker = ticker.upper()

        # Symbol is used for labeling plots
        self.symbol = ticker

        # Retrieve the financial data
        try:
            stock = quandl.get('%s/%s' % (exchange, ticker))
        except Exception as e:
            print('Error Retrieving Data.')
            print(e)
            return

        # Set the index to a column called Date
        stock = stock.reset_index(level=0)

        # Columns required for prophet
        stock['ds'] = stock['Date']

        if ('Adj. Close' not in stock.columns):
            stock['Adj. Close'] = stock['Close']
            stock['Adj. Open'] = stock['Open']

        stock['y'] = stock['Adj. Close']
        stock['Daily Change'] = stock['Adj. Close'] - stock['Adj. Open']

        # Data assigned as class attribute
        self.stock = stock.copy()

        # Minimum and maximum date in range
        self.min_date = min(stock['Date'])
        self.max_date = max(stock['Date'])

        # Find max and min prices and dates on which they occurred
        self.max_price = np.max(self.stock['y'])
        self.min_price = np.min(self.stock['y'])

        self.min_price_date = self.stock[self.stock['y'] == self.min_price]['Date']
        self.min_price_date = self.min_price_date[self.min_price_date.index[0]]
        self.max_price_date = self.stock[self.stock['y'] == self.max_price]['Date']
        self.max_price_date = self.max_price_date[self.max_price_date.index[0]]

        # The starting price (starting with the opening price)
        # .loc is used instead of the removed .ix indexer; the index runs 0..n-1 after reset_index
        self.starting_price = float(self.stock.loc[0, 'Adj. Open'])

        # The most recent price
        self.most_recent_price = float(self.stock.loc[len(self.stock) - 1, 'y'])

        # Whether or not to round dates
        self.round_dates = True

        # Number of years of data to train on
        self.training_years = 3

        # Prophet parameters
        # Default prior from library
        self.changepoint_prior_scale = 0.05
        self.weekly_seasonality = True
        self.daily_seasonality = True
        self.monthly_seasonality = True
        self.yearly_seasonality = True
        self.changepoints = None

        print('{} Stocker Initialized. Data covers {} to {}.'.format(self.symbol,
                                                                     self.min_date.date(),
                                                                     self.max_date.date()))

    def handle_dates(self, start_date, end_date):
        """
        Make sure start and end dates are in the range and can be
        converted to pandas datetimes. Returns dates in the correct format.
        """

        # Default start and end date are the beginning and end of data
        if start_date is None:
            start_date = self.min_date
        if end_date is None:
            end_date = self.max_date

        try:
            # Convert to pandas datetime for indexing dataframe
            start_date = pd.to_datetime(start_date)
            end_date = pd.to_datetime(end_date)
        except Exception as e:
            print('Enter valid pandas date format.')
            print(e)
            return

        valid_start = False
        valid_end = False

        # User will continue to enter dates until valid dates are met.
        # Loop while either date is still invalid, so a corrected date is re-checked.
        while not (valid_start and valid_end):
            valid_end = True
            valid_start = True

            if end_date.date() < start_date.date():
                print('End Date must be later than start date.')
                start_date = pd.to_datetime(input('Enter a new start date: '))
                end_date = pd.to_datetime(input('Enter a new end date: '))
                valid_end = False
                valid_start = False
            else:
                if end_date.date() > self.max_date.date():
                    print('End Date exceeds data range')
                    end_date = pd.to_datetime(input('Enter a new end date: '))
                    valid_end = False

                if start_date.date() < self.min_date.date():
                    print('Start Date is before date range')
                    start_date = pd.to_datetime(input('Enter a new start date: '))
                    valid_start = False

        return start_date, end_date

    def make_df(self, start_date, end_date, df=None):
        """
        Return the dataframe trimmed to the specified range.
        """

        # Default is to use the object stock data
        # ('if not df' would raise an error when a dataframe is passed in)
        if df is None:
            df = self.stock.copy()

        start_date, end_date = self.handle_dates(start_date, end_date)

        # keep track of whether the start and end dates are in the data
        start_in = True
        end_in = True

        # If user wants to round dates (default behavior)
        if self.round_dates:
            # Record if start and end date are in df
            if (start_date not in list(df['Date'])):
                start_in = False
            if (end_date not in list(df['Date'])):
                end_in = False

            # If both are not in dataframe, round both
            if (not end_in) & (not start_in):
                trim_df = df[(df['Date'] >= start_date.date()) &
                             (df['Date'] <= end_date.date())]
            else:
                # If both are in dataframe, round neither
                if (end_in) & (start_in):
                    trim_df = df[(df['Date'] >= start_date.date()) &
                                 (df['Date'] <= end_date.date())]
                else:
                    # If only start is missing, round start
                    if (not start_in):
                        trim_df = df[(df['Date'] > start_date.date()) &
                                     (df['Date'] <= end_date.date())]
                    # If only end is missing, round end
                    elif (not end_in):
                        trim_df = df[(df['Date'] >= start_date.date()) &
                                     (df['Date'] < end_date.date())]

        else:
            valid_start = False
            valid_end = False

            # Loop while either date is still missing from the data
            while not (valid_start and valid_end):
                start_date, end_date = self.handle_dates(start_date, end_date)

                # No round dates, if either date is not in the data, print message and ask again
                if (start_date in list(df['Date'])):
                    valid_start = True
                if (end_date in list(df['Date'])):
                    valid_end = True

                # Check to make sure dates are in the data
                # (input() accepts the prompt only as a positional argument)
                if (start_date not in list(df['Date'])):
                    print('Start Date not in data (either out of range or not a trading day.)')
                    start_date = pd.to_datetime(input('Enter a new start date: '))
                elif (end_date not in list(df['Date'])):
                    print('End Date not in data (either out of range or not a trading day.)')
                    end_date = pd.to_datetime(input('Enter a new end date: '))

            # Dates are not rounded
            trim_df = df[(df['Date'] >= start_date.date()) &
                         (df['Date'] <= end_date.date())]

        return trim_df

    # Basic Historical Plots and Basic Statistics
    def plot_stock(self, start_date=None, end_date=None, stats=['Adj. Close'],
                   plot_type='basic'):

        self.reset_plot()

        if start_date is None:
            start_date = self.min_date
        if end_date is None:
            end_date = self.max_date

        stock_plot = self.make_df(start_date, end_date)

        colors = ['r', 'b', 'g', 'y', 'c', 'm']

        for i, stat in enumerate(stats):

            stat_min = min(stock_plot[stat])
            stat_max = max(stock_plot[stat])
            stat_avg = np.mean(stock_plot[stat])

            date_stat_min = stock_plot[stock_plot[stat] == stat_min]['Date']
            date_stat_min = date_stat_min[date_stat_min.index[0]].date()
            date_stat_max = stock_plot[stock_plot[stat] == stat_max]['Date']
            date_stat_max = date_stat_max[date_stat_max.index[0]].date()

            print('Maximum {} = {:.2f} on {}.'.format(stat, stat_max, date_stat_max))
            print('Minimum {} = {:.2f} on {}.'.format(stat, stat_min, date_stat_min))
            print('Current {} = {:.2f} on {}.\n'.format(stat, self.stock.loc[len(self.stock) - 1, stat],
                                                         self.max_date.date()))

            # Percentage y-axis
            if plot_type == 'pct':
                # Simple Plot
                plt.style.use('fivethirtyeight')
                if stat == 'Daily Change':
                    plt.plot(stock_plot['Date'], 100 * stock_plot[stat],
                             color=colors[i], linewidth=2.4, alpha=0.9, label=stat)
                else:
                    plt.plot(stock_plot['Date'], 100 * (stock_plot[stat] - stat_avg) / stat_avg,
                             color=colors[i], linewidth=2.4, alpha=0.9, label=stat)

                plt.xlabel('Date'); plt.ylabel('Change Relative to Average (%)')
                plt.title('%s Stock History' % self.symbol)
                plt.legend(prop={'size': 10})
                plt.grid(color='k', alpha=0.4)

            # Stat y-axis
            elif plot_type == 'basic':
                plt.style.use('fivethirtyeight')
                plt.plot(stock_plot['Date'], stock_plot[stat], color=colors[i], linewidth=3,
                         label=stat, alpha=0.8)
                plt.xlabel('Date'); plt.ylabel('US $'); plt.title('%s Stock History' % self.symbol)
                plt.legend(prop={'size': 10})
                plt.grid(color='k', alpha=0.4)

        plt.show()

    # Reset the plotting parameters to clear style formatting
    # Not sure if this should be a static method
    @staticmethod
    def reset_plot():

        # Restore default parameters
        matplotlib.rcParams.update(matplotlib.rcParamsDefault)

        # Adjust a few parameters to liking
        matplotlib.rcParams['figure.figsize'] = (8, 5)
        matplotlib.rcParams['axes.labelsize'] = 10
        matplotlib.rcParams['xtick.labelsize'] = 8
        matplotlib.rcParams['ytick.labelsize'] = 8
        matplotlib.rcParams['axes.titlesize'] = 14
        matplotlib.rcParams['text.color'] = 'k'

    # Method to linearly interpolate prices on the weekends
    def resample(self, dataframe):

        # Change the index and resample at daily level
        # (asfreq() converts the resampler back into a dataframe, leaving NaN on missing days)
        dataframe = dataframe.set_index('ds')
        dataframe = dataframe.resample('D').asfreq()

        # Reset the index and interpolate nan values
        dataframe = dataframe.reset_index(level=0)
        dataframe = dataframe.interpolate()

        return dataframe

    # Remove weekends from a dataframe
    def remove_weekends(self, dataframe):

        # Reset index so that row labels match row positions
        dataframe = dataframe.reset_index(drop=True)

        weekends = []

        # Find all of the weekends (weekday() returns 5 for Saturday and 6 for Sunday)
        for i, date in enumerate(dataframe['ds']):
            if date.weekday() in (5, 6):
                weekends.append(i)

        # Drop the weekends
        dataframe = dataframe.drop(weekends, axis=0)

        return dataframe

    # Calculate and plot profit from buying and holding shares for specified date range
    def buy_and_hold(self, start_date=None, end_date=None, nshares=1):

        self.reset_plot()

        start_date, end_date = self.handle_dates(start_date, end_date)

        # Find starting and ending price of stock
        start_price = float(self.stock[self.stock['Date'] == start_date]['Adj. Open'])
        end_price = float(self.stock[self.stock['Date'] == end_date]['Adj. Close'])

        # Make a profit dataframe and calculate profit column
        profits = self.make_df(start_date, end_date)
        profits['hold_profit'] = nshares * (profits['Adj. Close'] - start_price)

        # Total profit
        total_hold_profit = nshares * (end_price - start_price)

        print('{} Total buy and hold profit from {} to {} for {} shares = ${:.2f}'.format(
            self.symbol, start_date.date(), end_date.date(), nshares, total_hold_profit))

        # Plot the total profits
        plt.style.use('dark_background')

        # Location for number of profit
        text_location = (end_date - pd.DateOffset(months=1)).date()

        # Plot the profits over time
        plt.plot(profits['Date'], profits['hold_profit'], 'b', linewidth=3)
        plt.ylabel('Profit ($)'); plt.xlabel('Date')
        plt.title('Buy and Hold Profits for {} {} to {}'.format(
            self.symbol, start_date.date(), end_date.date()))

        # Display final value on graph
        plt.text(x=text_location,
                 y=total_hold_profit + (total_hold_profit / 40),
                 s='$%d' % total_hold_profit,
                 color='g' if total_hold_profit > 0 else 'r', size=14)

        plt.grid(alpha=0.2)
        plt.show()

    # Create a prophet model without training
    def create_model(self):

        # Make the model
        model = fbprophet.Prophet(daily_seasonality=self.daily_seasonality,
                                  weekly_seasonality=self.weekly_seasonality,
                                  yearly_seasonality=self.yearly_seasonality,
                                  changepoint_prior_scale=self.changepoint_prior_scale,
                                  changepoints=self.changepoints)

        if self.monthly_seasonality:
            # Add monthly seasonality
            model.add_seasonality(name='monthly', period=30.5, fourier_order=5)

        return model

    # Graph the effects of altering the changepoint prior scale (cps)
    def changepoint_prior_analysis(self, changepoint_priors=[0.001, 0.05, 0.1, 0.2],
                                   colors=['b', 'r', 'grey', 'gold']):

        # Training and plotting with specified years of data
        train = self.stock[(self.stock['Date'] > (max(self.stock['Date']) -
                                                  pd.DateOffset(years=self.training_years)).date())]

        # Iterate through all the changepoints and make models
        for i, prior in enumerate(changepoint_priors):
            # Select the changepoint
            self.changepoint_prior_scale = prior

            # Create and train a model with the specified cps
            model = self.create_model()
            model.fit(train)
            future = model.make_future_dataframe(periods=180, freq='D')

            # Make a dataframe to hold predictions
            if i == 0:
                predictions = future.copy()

            future = model.predict(future)

            # Fill in prediction dataframe
            predictions['%.3f_yhat_upper' % prior] = future['yhat_upper']
            predictions['%.3f_yhat_lower' % prior] = future['yhat_lower']
            predictions['%.3f_yhat' % prior] = future['yhat']

        # ...

        # Make and predict for next year with future dataframe
        future = model.make_future_dataframe(periods=days, freq='D')
        future = model.predict(future)

        if days > 0:
            # Print the predicted price
            print('Predicted Price on {} = ${:.2f}'.format(
                future.loc[len(future) - 1, 'ds'].date(), future.loc[len(future) - 1, 'yhat']))

            title = '%s Historical and Predicted Stock Price' % self.symbol
        else:
            title = '%s Historical and Modeled Stock Price' % self.symbol

        # Set up the plot
        fig, ax = plt.subplots(1, 1)

        # Plot the actual values
        ax.plot(stock_history['ds'], stock_history['y'], 'ko-', linewidth=1.4, alpha=0.8,
                ms=1.8, label='Observations')

        # Plot the predicted values
        ax.plot(future['ds'], future['yhat'], 'forestgreen', linewidth=2.4, label='Modeled')

        # Plot the uncertainty interval as ribbon
        ax.fill_between(future['ds'].dt.to_pydatetime(), future['yhat_upper'],
                        future['yhat_lower'], alpha=0.3, facecolor='g', edgecolor='k',
                        linewidth=1.4, label='Confidence Interval')

        # Plot formatting
        plt.legend(loc=2, prop={'size': 10}); plt.xlabel('Date'); plt.ylabel('Price $')
        plt.grid(linewidth=0.6, alpha=0.6)
        plt.title(title)
        plt.show()

        return model, future

    # Evaluate prediction model for one year
    def evaluate_prediction(self, start_date=None, end_date=None, nshares=None):

        # Default start date is one year before end of data
        # Default end date is end date of data
        if start_date is None:
            start_date = self.max_date - pd.DateOffset(years=1)
        if end_date is None:
            end_date = self.max_date

        start_date, end_date = self.handle_dates(start_date, end_date)

        # Training data starts self.training_years years before start date and goes up to start date
        train = self.stock[(self.stock['Date'] < start_date.date()) &
                           (self.stock['Date'] > (start_date - pd.DateOffset(years=self.training_years)).date())]

        # Testing data is specified in the range
        test = self.stock[(self.stock['Date'] >= start_date.date()) &
                          (self.stock['Date'] <= end_date.date())]

        # ...

        # Calculate percentage of time actual value within prediction range
        test['in_range'] = False

        for i in test.index:
            if (test.loc[i, 'y'] < test.loc[i, 'yhat_upper']) & (test.loc[i, 'y'] > test.loc[i, 'yhat_lower']):
                test.loc[i, 'in_range'] = True

        in_range_accuracy = 100 * np.mean(test['in_range'])

        if not nshares:

            # Date range of predictions
            print('\nPrediction Range: {} to {}.'.format(start_date.date(),
                                                         end_date.date()))

            # Final prediction vs actual value
            print('\nPredicted price on {} = ${:.2f}.'.format(max(future['ds']).date(),
                                                              future.loc[len(future) - 1, 'yhat']))
            print('Actual price on {} = ${:.2f}.\n'.format(max(test['ds']).date(),
                                                           test.loc[len(test) - 1, 'y']))

            print('Average Absolute Error on Training Data = ${:.2f}.'.format(train_mean_error))
            print('Average Absolute Error on Testing Data = ${:.2f}.'.format(test_mean_error))

            # Direction accuracy (the format arguments are not shown in this listing)
            print('When the model predicted an increase, the price increased {:.2f}% of the time.'.format(...))
            print('When the model predicted a decrease, the price decreased {:.2f}% of the time.'.format(...))

            print('The actual value was within the {:d}% confidence interval {:.2f}% of the time.'.format(
                int(100 * model.interval_width), in_range_accuracy))

            # Reset the plot
            self.reset_plot()

            # Set up the plot
            fig, ax = plt.subplots(1, 1)

            # Plot the actual values
            ax.plot(train['ds'], train['y'], 'ko-', linewidth=1.4, alpha=0.8, ms=1.8,
                    label='Observations')
            ax.plot(test['ds'], test['y'], 'ko-', linewidth=1.4, alpha=0.8, ms=1.8,
                    label='Observations')

            # Plot the predicted values
            ax.plot(future['ds'], future['yhat'], 'navy', linewidth=2.4, label='Predicted')

            # Plot the uncertainty interval as ribbon (between the upper and lower bounds)
            ax.fill_between(future['ds'].dt.to_pydatetime(), future['yhat_upper'],
                            future['yhat_lower'],
                            facecolor='gold', edgecolor='k', linewidth=1.4, label='Confidence Interval')

            # Put a vertical line at the start of predictions
            plt.vlines(x=min(test['ds']).date(), ymin=min(future['yhat_lower']),
                       ymax=max(future['yhat_upper']), colors='r', linestyles='dashed',
                       label='Prediction Start')

            # Plot formatting
            plt.legend(loc=2, prop={'size': 8}); plt.xlabel('Date'); plt.ylabel('Price $')
            plt.grid(linewidth=0.6, alpha=0.6)
            plt.title('{} Model Evaluation from {} to {}.'.format(self.symbol,
                                                                  start_date.date(), end_date.date()))

            # ...

    def changepoint_prior_validation(self, start_date=None, end_date=None,
                                     changepoint_priors=[0.001, 0.05, 0.1, 0.2]):

        # Default start date is two years before end of data
        # Default end date is one year before end of data
        if start_date is None:
            start_date = self.max_date - pd.DateOffset(years=2)
        if end_date is None:
            end_date = self.max_date - pd.DateOffset(years=1)

        # Convert to pandas datetime for indexing dataframe
        start_date = pd.to_datetime(start_date)
        end_date = pd.to_datetime(end_date)

        start_date, end_date = self.handle_dates(start_date, end_date)

        # Select self.training_years number of years
        train = self.stock[(self.stock['Date'] > (start_date - pd.DateOffset(years=self.training_years)).date()) &
                           (self.stock['Date'] < start_date.date())]

        # ...

        plt.legend(prop={'size': 10})
        plt.show()

        # Plot of training and testing average uncertainty
        self.reset_plot()

        plt.plot(results['cps'], results['train_range'], 'bo-', ms=8, label='Train Range')
        plt.plot(results['cps'], results['test_range'], 'r*-', ms=8, label='Test Range')
        plt.xlabel('Changepoint Prior Scale'); plt.ylabel('Avg. Uncertainty ($)')
        plt.title('Uncertainty in Estimate as Function of CPS')
        plt.grid(color='k', alpha=0.3)
        plt.xticks(results['cps'], results['cps'])
        plt.legend(prop={'size': 10})
        plt.show()


CHAPTER 6

TESTING
This chapter gives an overview of the testing carried out to verify the correctness and
functionality of the system. Software testing is the process of analysing a software item to
detect differences between existing and required conditions and to evaluate the features
of that item. It is an activity that should be carried out throughout the development
process, and it detects defects by comparing a program's expected results with its actual
results for a given set of inputs.

The aim of the testing phase is to discover defects or errors by testing individual
program components. During system testing, these components are integrated to form a
complete system. At this stage, testing focused on establishing that the system meets its
functional requirements and does not behave in unexpected ways. The test data were
inputs devised to exercise the system, and the expected outputs were predicted from these
inputs on the assumption that the system operates according to its specification. Testing
was also done to examine the behaviour of the components as a cohesive system. The test
cases were selected so that the system's behaviour could be examined under all possible
combinations of conditions.

Accordingly, the expected behaviour of the system under different combinations of
inputs was specified. Test cases were therefore selected whose inputs produce outputs
along expected lines. Invalid inputs, for which suitable messages have to be shown, and
inputs that occur infrequently were treated as special cases.

Test Environment

A testing environment is the setup of software and hardware on which the testing
team performs the testing of the newly built software product. This setup consists of the
physical setup, which includes the hardware, and the logical setup, which includes the
server operating system, client operating system, database server, front-end running
environment, browser (for a web application), and any other software components required
to run the product. The testing setup has to be built at both the server and the client end.


Test Case
A set of test inputs, execution conditions, and expected results was developed for a
particular objective, such as exercising a particular program path or verifying compliance
with a specific requirement. Each test case included the following:
• Features to be tested
• Items to be tested
• Purpose of testing
• Pass/Fail criteria

6.1 Testing in Machine Learning

A data science or machine learning career has primarily been associated with
building models that make numerical or class-related predictions. This is unlike
conventional software development, which is associated with both developing and
"testing" the software, with the corresponding career profiles of software developer/engineer
and test engineer/QA professional. In machine learning, the career profile is the data
scientist, and the word "testing" primarily refers to measuring model performance in terms
of the accuracy/precision of the model. The word "testing" therefore means different
things in conventional software development and in machine learning model development.

Consequently, traditional unit and integration testing is not sufficient for machine
learning models; the model is instead evaluated on the accuracy of its predictions.

Accuracy is one metric for evaluating classification models. Informally, accuracy is the
fraction of predictions the model got right. Formally, accuracy has the following
definition:

Accuracy = Number of correct predictions / Total number of predictions

For binary classification, accuracy can also be calculated in terms of positives and
negatives as follows:

Accuracy = (TP + TN) / (TP + TN + FP + FN)

Where TP = True Positives, TN = True Negatives, FP = False Positives, and FN =
False Negatives.
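
As a small worked example of the binary-classification formula, the sketch below computes accuracy from the four counts. The function name and the counts are hypothetical and serve only to illustrate the arithmetic; note that the numerator and the denominator are each summed before dividing.

```python
def accuracy(tp, tn, fp, fn):
    """Fraction of predictions that were correct."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("no predictions to score")
    # Correct predictions are the true positives plus the true negatives
    return (tp + tn) / total


# 40 true positives, 45 true negatives, 5 false positives, 10 false negatives
print(accuracy(tp=40, tn=45, fp=5, fn=10))  # 0.85
```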

Forecasting models are evaluated by comparing the values they predict with the
values actually observed. For stock market forecasting, we divided the data into a training
set and a testing set, and further split the training set into a training dataset and a
validation dataset. We train the model on the training dataset and use the validation
dataset to check the trained model. A validation dataset is a sample of data held back
from training the model that is used to give an estimate of model skill while tuning the
model's hyperparameters.
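
For time series such as stock prices, the split is normally made in date order so that the model is never trained on data that comes after the data it is checked against. The following is a minimal sketch of such a split in plain Python; the function name, the split sizes and the sample prices are assumptions made for illustration and are not taken from the project code.

```python
def chronological_split(series, n_val, n_test):
    """Split a time-ordered sequence into train, validation and test parts.

    The most recent n_test points form the test set and the n_val points
    before them form the validation set; nothing is shuffled.
    """
    if n_val < 0 or n_test < 0 or n_val + n_test >= len(series):
        raise ValueError("validation and test sizes must leave some training data")
    train_end = len(series) - n_val - n_test
    val_end = len(series) - n_test
    return series[:train_end], series[train_end:val_end], series[val_end:]


# Ten hypothetical closing prices, oldest first
prices = [101, 102, 100, 103, 105, 104, 106, 108, 107, 110]
train, val, test = chronological_split(prices, n_val=2, n_test=2)
print(train)  # [101, 102, 100, 103, 105, 104]
print(val)    # [106, 108]
print(test)   # [107, 110]
```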

A test dataset is a dataset that is independent of the training dataset but follows the
same probability distribution as the training dataset. If a model fit to the training dataset
also fits the test dataset well, minimal overfitting has taken place.

Figure: 5.1 Model Evaluation

In the graph above, the dashed vertical line marks the point from which the prediction
starts, the blue line shows the predicted stock values, and the black line shows the
observed values. Comparing the predicted values with the observed values shows how well
the model works.
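
One way to put a number on the comparison of predicted and observed values is direction accuracy: how often the price actually rose when the model predicted a rise, and how often it fell when the model predicted a fall. The sketch below is an illustrative plain-Python version under the assumption that both series are lists of daily values in date order; the function name and the data are hypothetical.

```python
def direction_accuracy(actual, predicted):
    """Return (increase_pct, decrease_pct) for day-over-day changes.

    increase_pct: % of predicted rises on which the actual price also rose
    decrease_pct: % of predicted falls on which the actual price also fell
    A value of None means the model never predicted that direction.
    """
    actual_diff = [b - a for a, b in zip(actual, actual[1:])]
    pred_diff = [b - a for a, b in zip(predicted, predicted[1:])]

    # Keep only the days on which the model predicted the given direction
    up = [a > 0 for a, p in zip(actual_diff, pred_diff) if p > 0]
    down = [a < 0 for a, p in zip(actual_diff, pred_diff) if p < 0]

    increase_pct = 100.0 * sum(up) / len(up) if up else None
    decrease_pct = 100.0 * sum(down) / len(down) if down else None
    return increase_pct, decrease_pct


actual = [10, 11, 10, 12, 11]
predicted = [10, 10.5, 11, 11.5, 11]
inc, dec = direction_accuracy(actual, predicted)
print(round(inc, 2), dec)  # 66.67 100.0
```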


6.2 System Testing

System testing is the testing conducted on a complete, integrated system to
evaluate the system's compliance with its specified requirements. System testing involves
putting the new program in many different environments to ensure that the program
works in typical customer environments with various versions and types of operating
systems and/or applications.

System testing is actually a series of different tests whose primary purpose is to
fully exercise the computer-based system. Although each test has a different purpose, the
main purpose is to verify that all the system elements have been properly integrated and
perform the allocated functions.


CONCLUSION AND FUTURE ENHANCEMENTS

Conclusion

The popularity of stock market trading is growing rapidly, which encourages
researchers to find new methods of prediction using new techniques. Forecasting
techniques help not only researchers but also investors and anyone else dealing with the
stock market. To help predict stock indices, a forecasting model with good accuracy is
required. In this work, we have used one of the most precise forecasting techniques, based
on a Recurrent Neural Network with Long Short-Term Memory units, which helps
investors, analysts, or anyone interested in investing in the stock market by giving them a
well-informed picture of the future situation of the stock market.

Future Enhancements

In the future, the stock market prediction system can be further improved by
using a much bigger dataset than the one used currently, which would help to increase
the accuracy of the prediction models. Furthermore, other machine learning models could
be studied to compare the accuracy they achieve.


SNAPSHOTS

Snapshot 1: Data Frame

Snapshot 2: Stock values

Snapshot 3: Prediction without training

Snapshot 4: Stock history

Snapshot 5: Prediction without training

Snapshot 6: Stock prediction

Snapshot 7: Changepoint Prior Scale

Snapshot 8: Testing and Training curves

Snapshot 9: Hold and buy v/s prediction

Snapshot 10: Predicted increase/decrease

Snapshot 11: Accuracy of prediction

Snapshot 12: Prediction with range

Snapshot 13: Enter company name

Snapshot 14: Predicted stock values


ANNEXURE A

GLOSSARY

Accuracy

Accuracy is a metric for examining how good a machine learning model is.

Assets

Everything a company or person owns, including money, securities, equipment and real
estate. Assets include everything that is owed to the company or person.

Bar Chart

A bar chart is a type of graph used to display and compare numbers, frequencies, or
other measures.

Classification

The identification of which of two or more categories an item falls under; a classic
machine learning task. Deciding whether an email message is spam or not classifies it
among two categories, and analysis of data about movies might lead to classification of
them among several genres.

Confidence Interval

A range specified around an estimate to indicate margin of error, combined with a
probability that a value will fall in that range. The field of statistics offers specific
mathematical formulas to calculate confidence intervals.

Covariance

A measure of the relationship between two variables whose values are observed at the
same time; specifically, the average value of the product of the two variables diminished
by the product of their average values.


Capital Stock

All shares representing ownership of a company, including preferred and common shares.

Close Price

The price of the last board lot trade executed at the close of trading.

Dependent variable

The value of a dependent variable “depends” on the value of the independent variable. If
you're measuring the effect of different sizes of an advertising budget on total sales, then
the advertising budget figure is the independent variable and total sales is the dependent
variable.

Reinforcement Learning

A class of machine learning algorithms in which the process is not given specific goals to
meet but, as it makes decisions, is instead given indications of whether it’s doing well or
not.

Root Mean Squared Error

Also, RMSE. The square root of the Mean Squared Error. This is more popular than Mean
Squared Error because taking the square root of a figure built from the squares of the
observation value errors gives a number that’s easier to understand in the units used to
measure the original observations.

Recurrent Neural Networks

A recurrent neural network is a class of artificial neural network where connections
between nodes form a directed graph along a temporal sequence.

Stock Price Index

A statistical measure of the state of the stock market, based on the performance of certain
stocks. Examples include the S&P/TSX Composite Index and the S&P/TSX Venture
Composite Index.


ANNEXURE B

ACRONYMS

ARIMA    Autoregressive Integrated Moving Average

ANFIS    Adaptive Network-Based Fuzzy Inference System

ABC-RNN  Artificial Bee Colony (ABC) algorithm with Recurrent Neural Network

API      Application Programming Interface

IDE      Integrated Development Environment

JSON     JavaScript Object Notation

RNN      Recurrent Neural Network

RMSE     Root Mean Square Error

SVM      Support Vector Machine

S&P      Standard & Poor's


BIBLIOGRAPHY

[1] I. Svalina, V. Galzina, R. Luji, and G. Imunovi, "An adaptive network- based fuzzy
inference system (ANFIS) for the forecasting: The case of close price indices,"
Expert systems with applications, vol. 40, no. 15, pp. 60556063, 2013.

[2] M. A. Boyacioglu and D. Avci, "An adaptive network-based fuzzy inference system
(ANFIS) for the prediction of stock market return: the case of the Istanbul stock
exchange," Expert Systems with Applications, vol. 37, no. 12, pp. 79087912, 2010.

[3] E. F. Fama and K. R. French, "Common risk factors in the returns on stocks and
bonds," Journal of financial economics, vol. 33, no. 1, pp. 356, 1993.

[4] T.-J. Hsieh, H.-F. Hsiao, and W.-C. Yeh, "Forecasting stock markets using wavelet
transforms and recurrent neural networks: An integrated system based on artificial
bee colony algorithm," Applied soft comput- ing, vol. 11, no. 2, pp. 25102525,
2011.

[5] Hall JW. Adaptive selection of US stocks with neural nets. In: Deboeck GJ, editor.
Trading on the edge: neural, genetic, and fuzzy systems for chaotic financial
markets. New York: Wiley; 1994. p. 45–65.

[6] Tay FEH, Cao LJ. Application of support vector machines in 1nancial time series
forecasting. Omega 2001; 29:309–17.

[7] Eugene F. Fama “The Behavior of Stock Market Prices”, the Journal of Business,
Vol 2, No. 2, pp. 7–26, January 1965.

[8] Cao LJ, Tay FEH. Financial forecasting using support vector machines. Neural
Computing Applications 2001; 10:184–92.

Dept. of I.S.E., S.C.E. 2018-19 57


Stock Market Portfolio Optimization

[9] Zhen Hu, Jibe Zhu, and Ken Tse “Stocks Market Prediction Using Support
Vector Machine”, International Conference on Information Management,
Innovation Management and Industrial Engineering, 2013.M.

[10] Wei Huang, Yoshiteru Nakamori, Shou-Yang Wang, “Forecasting stock


market movement direction with support vector machine”, Computers &
Operations Research, Volume 32, Issue 10, October 2005, Pages 2513–2522.

[11] N. Ancona, Classification Properties of Support Vector Machines for Regression,


Technical Report, RIIESI/CNR- Nr. 02/99.

[12] K. jae Kim, “Financial time series forecasting using support vector
machines,” Neurocomputing, vol. 55, 2003.

[13] Debashish Das and Mohammad shorif uddin data mining and neural network
techniques in stock market prediction: a methodological review, international
journal of artificial intelligence & applications, vol.4, no.1, January 2013

Dept. of I.S.E., S.C.E. 2018-19 58
