Python A.I. Stock Prediction

DSS
ALGO-TRADE
2018
APRIL 17
PUNYAWEE POSRI(KUKKUI)
Table of Contents
DSS ALGO TRADING
INTRODUCTION
DATA GATHERING
SPLIT DATA FOR TEST AND TRAIN
  Overfitting
  Underfitting
  Validation
KNOWLEDGE IMPLEMENTATION
MODEL SELECTION
TRAINING THE MODEL & MAKE DECISIONS
TUTORIAL
DSS ALGO TRADING
- PYTHON 3+ / JS
- TENSORFLOW
- NUMPY
- SCIKIT-LEARN
- PANDAS
- DATAFRAME
- JUPYTER NOTEBOOK
- ETC.
INTRODUCTION
The prediction of stock prices has always been a challenging task. It has been
observed that the stock price of a company does not necessarily depend on the
economic situation of its country; it is no longer directly linked to the economic
development of the country or of a particular area. Stock price prediction has therefore
become even more difficult than before. Today stock prices are affected by many factors,
such as company-related news, political events, and natural disasters. Improved
technology and communication systems process these events quickly, which causes
stock prices to fluctuate very fast. As a result, many banks, financial institutions,
large-scale investors, and stock brokers have to buy and sell stocks within the shortest
possible time, and a time span of only a few hours between buying and selling is not unusual.
DATA GATHERING
Backtesting offers analysts, traders, and investors a way to evaluate and optimize
their trading strategies and analytical models before implementing them. The notion is that
a strategy that would have worked poorly in the past will probably work poorly in the future,
and vice versa. A key part of backtesting, however, is the risky assumption that
past performance predicts future performance.
Let's assume you devise a model that you think consistently predicts the future value
of the S&P 500. By using historical data, you can backtest the model to see whether it would
have worked in the past. By comparing the predicted results of the model against the actual
historical results, backtesting can determine whether the model has predictive value.
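The comparison of predicted against actual historical values can be sketched as follows. This is a minimal illustration, not the author's backtesting code: the function name `backtest`, the two metrics (mean absolute error and directional hit rate), and the sample numbers are all assumptions made for the example.

```python
import numpy as np

def backtest(predicted, actual):
    """Compare model predictions with the actual historical values."""
    predicted = np.asarray(predicted, dtype=float)
    actual = np.asarray(actual, dtype=float)
    if predicted.shape != actual.shape:
        raise ValueError("predicted and actual must have the same shape")

    # Average size of the prediction error, in the same units as the data.
    mean_abs_error = float(np.mean(np.abs(predicted - actual)))

    # Fraction of steps where the predicted direction of change (relative to
    # the previous actual value) matched the actual direction of change.
    predicted_move = np.sign(predicted[1:] - actual[:-1])
    actual_move = np.sign(actual[1:] - actual[:-1])
    hit_rate = float(np.mean(predicted_move == actual_move))

    return {"mean_abs_error": mean_abs_error, "hit_rate": hit_rate}

# Made-up index levels, for illustration only (not real S&P 500 data).
actual = [100.0, 102.0, 101.0, 105.0, 104.0]
predicted = [100.0, 101.0, 103.0, 104.0, 106.0]
report = backtest(predicted, actual)
print(report)
```

A low error together with a hit rate well above one half would suggest some predictive value on the historical data, subject to the assumption above that the past resembles the future.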
Nearly any method for predicting anything can be backtested. For example, an analyst
can backtest his or her methods for predicting a company's net income, the degree of
volatility of a particular stock, key ratios, or return percentages. Technical traders are the
most common users of backtesting, and most backtesting today is done with computer
software.
SPLIT DATA FOR TEST AND TRAIN
Overfitting
Overfitting is more common than underfitting, but both should be avoided because they
hurt the predictive ability of the model. Overfitting can happen when the model is too
complex. It means that the model has been trained “too well” and fits the training
dataset too closely. If it fits so well, why is that a problem? The problem is
that the accuracy on the training data does not carry over to unseen or new data. To
avoid it, the data should not have too many features/variables compared to the number of
observations.
Underfitting
Underfitting can happen when the model is too simple; it means that the model does
not fit even the training data well. To avoid it, the data needs enough
predictors/independent variables. Validation, covered in the following subsection, is how
both problems are detected.
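The contrast between a model that is too simple and one that is too complex can be illustrated with a small experiment. The sketch below is an assumption-laden toy, not part of the original project: it fits polynomials of degree 1, 3, and 9 to 15 noisy samples of a sine curve (synthetic data) and reports the error on the training points and on fresh points.

```python
import numpy as np
from numpy.polynomial import Polynomial

rng = np.random.default_rng(0)

def noisy_samples(n):
    """Draw n synthetic points from a sine curve with added noise."""
    x = np.sort(rng.uniform(0.0, 1.0, n))
    y = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.2, n)
    return x, y

x_train, y_train = noisy_samples(15)    # small training set
x_test, y_test = noisy_samples(200)     # data the model never saw

def mse(model, x, y):
    """Mean squared error of a fitted polynomial on the points (x, y)."""
    return float(np.mean((model(x) - y) ** 2))

errors = {}
for degree in (1, 3, 9):
    # Least-squares polynomial fit of the given degree.
    model = Polynomial.fit(x_train, y_train, degree)
    errors[degree] = (mse(model, x_train, y_train), mse(model, x_test, y_test))
    print(f"degree {degree}: train MSE {errors[degree][0]:.4f}, "
          f"test MSE {errors[degree][1]:.4f}")
```

The training error can only go down as the degree grows, so a low training error by itself says little. An underfit model (degree 1 here) has a high error on both sets, while an overfit model typically shows a very low training error next to a noticeably higher test error.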
Validation
Cross-validation is when the data is split into k subsets and the model is trained on
k-1 of those subsets. The remaining subset is used for testing, and the process is
repeated so that each subset serves as the test set once. The following libraries are
commonly used for training and testing:
• Pandas: used to load the data file as a Pandas data frame and analyze it.
• scikit-learn (sklearn): used to import the datasets module, load a sample dataset, and
run a linear regression.
• Matplotlib: its pyplot module is used to plot graphs of the data.
• Split the data into training, validation, and test sets by a chosen ratio.
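A train/validation/test split and a k-fold split can be sketched with scikit-learn as follows. The 60/20/20 ratio, k = 5, the placeholder arrays, and the choice of `shuffle=False` (to keep time order for price data) are assumptions for illustration; the original does not state which ratio was used.

```python
import numpy as np
from sklearn.model_selection import KFold, train_test_split

# Hypothetical data: 100 observations with 3 features each.
X = np.arange(300, dtype=float).reshape(100, 3)
y = np.arange(100, dtype=float)

# Step 1: hold out the last 20% as the test set (no shuffling, so the
# test set is the most recent part of the series).
X_rest, X_test, y_rest, y_test = train_test_split(
    X, y, test_size=0.2, shuffle=False)

# Step 2: split the remaining 80% into training (60% of all data)
# and validation (20% of all data).
X_train, X_val, y_train, y_val = train_test_split(
    X_rest, y_rest, test_size=0.25, shuffle=False)

print(len(X_train), len(X_val), len(X_test))

# k-fold cross-validation on the non-test data: each of the k subsets
# is used once for testing while the other k-1 are used for training.
kfold = KFold(n_splits=5)
fold_sizes = [(len(train_idx), len(test_idx))
              for train_idx, test_idx in kfold.split(X_rest)]
print(fold_sizes)
```

The validation set is used to compare models and tune settings, and the test set is touched only once at the end, so that the final score reflects data the model has never influenced.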
KNOWLEDGE IMPLEMENTATION
Implementing a machine learning algorithm gives you a deep and practical
appreciation of how the algorithm works. This knowledge can also help you internalize
the mathematical description of the algorithm by thinking of the vectors and matrices as
arrays and by building computational intuition for the transformations on those structures.
There are numerous micro-decisions required when implementing a machine
learning algorithm, and these decisions are often missing from the formal algorithm
descriptions. Learning and parameterizing these decisions can quickly catapult you to
an intermediate or advanced level of understanding of a given method, as relatively few
people make the time to implement some of the more complex algorithms as a learning
exercise.
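As a small illustration of the kind of micro-decisions meant here, the sketch below implements linear regression by gradient descent with NumPy arrays. The algorithm choice, the function name `fit_linear_regression`, and the default values are assumptions for the example; the original does not name a specific algorithm in this section.

```python
import numpy as np

def fit_linear_regression(X, y, learning_rate=0.1, epochs=2000):
    """Fit y ~ X @ weights + bias by gradient descent on mean squared error."""
    n_samples, n_features = X.shape
    # Micro-decision: how to initialise the parameters (zeros here).
    weights = np.zeros(n_features)
    bias = 0.0
    # Micro-decisions: the step size and when to stop (fixed epoch count here).
    for _ in range(epochs):
        error = X @ weights + bias - y            # vector of residuals
        grad_w = (2.0 / n_samples) * (X.T @ error)
        grad_b = (2.0 / n_samples) * error.sum()
        weights -= learning_rate * grad_w
        bias -= learning_rate * grad_b
    return weights, bias

# Toy data generated from y = 2x + 1, so the fit can be checked by eye.
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([1.0, 3.0, 5.0, 7.0])
weights, bias = fit_linear_regression(X, y)
print(weights, bias)
```

None of the three commented choices appears in the textbook formula for linear regression, yet each one affects whether and how quickly the implementation converges.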
MODEL SELECTION
A plain recurrent neural network gradually loses information from its earliest inputs.
In the case of predicting stocks, that means that as we feed the network more and
more years of data, it eventually forgets the earliest data and cannot get
as much value from it. The solution to this is called LSTM (Long Short-Term Memory),
a technique used in artificial intelligence that adds “memory cells” to neural
networks so that they can retain information over long sequences.
Each LSTM layer is made up of multiple mini neural networks (gates) that are trained to
use memory optimally to make accurate predictions.
- The memory gate (usually called the input gate) collects the possible outputs the
network can come up with and stores the relevant ones for later use.
- The selection gate (usually called the output gate) uses the memories to select a
final output from all the possibilities that the network comes up with.
- The ignore/forget gate decides which memories are irrelevant to the decision-
making process and gets rid of them.
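The interplay of the three gates can be sketched as a single LSTM time step in NumPy. This follows the standard LSTM cell equations rather than any code from the original project; the mapping of the names above to the forget, input, and output gates is this sketch's reading, and the sizes and random weights are placeholders.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM time step for a single example.

    x: input vector (m,), h_prev: previous output (n,),
    c_prev: previous memory cell (n,), W: (4n, n + m), b: (4n,).
    """
    n = h_prev.shape[0]
    z = W @ np.concatenate([h_prev, x]) + b
    forget = sigmoid(z[0:n])             # ignore/forget gate: what to drop
    store = sigmoid(z[n:2 * n])          # memory (input) gate: what to store
    candidate = np.tanh(z[2 * n:3 * n])  # candidate values to store
    select = sigmoid(z[3 * n:4 * n])     # selection (output) gate: what to emit
    c = forget * c_prev + store * candidate  # updated memory cell
    h = select * np.tanh(c)                  # output for this time step
    return h, c

# Placeholder sizes and random weights, for illustration only.
rng = np.random.default_rng(0)
n_hidden, n_input = 4, 1
W = rng.normal(0.0, 0.5, (4 * n_hidden, n_hidden + n_input))
b = np.zeros(4 * n_hidden)

h = np.zeros(n_hidden)
c = np.zeros(n_hidden)
for price in [0.10, 0.12, 0.11]:         # a short made-up input sequence
    h, c = lstm_step(np.array([price]), h, c, W, b)
print(h)
```

Because the memory cell `c` is updated by element-wise scaling and addition rather than being overwritten, information from early time steps can survive many steps when the forget gate stays close to 1.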
LSTMs are especially valuable for stock prediction because they can use historical
trends and data to estimate how a stock will move.
TRAINING THE MODEL & MAKE DECISIONS
1. The data I used to train my model came directly from NASDAQ’s website. I got 5
years of Facebook’s historical price data (2012–2017).
2. My network has 3 LSTM layers with 50 nodes each, meaning that each layer
learns 50 internal features.
3. As the network processes a sequence, it uses its memory cells to carry forward
parts of what it saw earlier, which helps it make better decisions.
4. My network ends with an output layer with 1 node that outputs the AI’s guess for
the opening price of Facebook’s stock the next day.
5. It measures its inaccuracy using a loss function called “mean squared error”. This
measures how far off the predictions were from the actual values, squares those
differences, and then averages the squares.
6. The neural network uses a popular optimizer called “adam” to reduce the error
and make the network more accurate.
7. After training the neural network for 100 training cycles (called epochs), I tested my
model on stock data that the network had never seen before to simulate how it
would perform in the real world.
8. The network was told to buy the stock if it predicted a drop beyond a certain
threshold and to sell it if it predicted an increase beyond a certain threshold.
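The architecture described in steps 2 to 6 can be sketched with the Keras API that ships with TensorFlow. This is a reconstruction from the description above, not the author's code: the window length, the helper names `make_windows` and `build_model`, and the synthetic prices are assumptions.

```python
import numpy as np

def make_windows(prices, window):
    """Turn a 1-D price series into (samples, window, 1) inputs
    and the next-step price as the target for each window."""
    prices = np.asarray(prices, dtype=float)
    X = np.stack([prices[i:i + window] for i in range(len(prices) - window)])
    y = prices[window:]
    return X[..., np.newaxis], y

def build_model(window):
    """Three LSTM layers of 50 units and a single-node output layer."""
    # Imported here so that make_windows can be used without TensorFlow.
    from tensorflow import keras
    model = keras.Sequential([
        keras.Input(shape=(window, 1)),
        keras.layers.LSTM(50, return_sequences=True),
        keras.layers.LSTM(50, return_sequences=True),
        keras.layers.LSTM(50),
        keras.layers.Dense(1),            # next-day price estimate
    ])
    # Mean squared error loss and the adam optimizer, as in steps 5 and 6.
    model.compile(optimizer="adam", loss="mean_squared_error")
    return model

# Synthetic prices for illustration (not Facebook's real data).
prices = np.arange(10.0, 21.0)            # 10, 11, ..., 20
X, y = make_windows(prices, window=3)
print(X.shape, y.shape)
```

Training would then look like `model = build_model(3)` followed by `model.fit(X, y, epochs=100)`, with the 100 epochs taken from step 7. The first two LSTM layers need `return_sequences=True` so that each passes a full sequence to the next LSTM layer.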
This graph shows the performance of my neural network over a year. The blue line
is my AI’s prediction and the purple line is what the stock price actually was. The green
dots represent buy decisions and the red dots represent sell decisions. Though the neural
network still has much room for improvement, it was able to generate a 36.22% return on
investment.
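The buy/sell rule and the return-on-investment figure can be sketched as a small simulation. The rule follows the wording above literally (buy on a predicted drop, sell on a predicted rise); the 2% threshold, the all-in position sizing, the function names, and the sample prices are assumptions, and the example does not reproduce the 36.22% result.

```python
def decide(current_price, predicted_price, threshold=0.02):
    """Return 'buy', 'sell', or 'hold' from the predicted relative change."""
    change = (predicted_price - current_price) / current_price
    if change <= -threshold:
        return "buy"     # predicted drop beyond the threshold
    if change >= threshold:
        return "sell"    # predicted increase beyond the threshold
    return "hold"

def simulate(prices, predictions, threshold=0.02, cash=1000.0):
    """Trade all-in on each signal and return the return on investment.

    predictions[t] is the model's forecast for the day after prices[t].
    """
    initial = cash
    shares = 0.0
    for price, predicted in zip(prices, predictions):
        action = decide(price, predicted, threshold)
        if action == "buy" and cash > 0:
            shares, cash = cash / price, 0.0
        elif action == "sell" and shares > 0:
            cash, shares = shares * price, 0.0
    # Value any shares still held at the last known price.
    final = cash + shares * prices[-1]
    return (final - initial) / initial

# Made-up prices and forecasts, for illustration only.
prices = [100.0, 100.0, 125.0]
predictions = [97.0, 100.0, 130.0]
roi = simulate(prices, predictions)
print(f"return on investment: {roi:.2%}")
```

In this toy run the simulation buys at 100 and sells at 125. A real evaluation would also need to account for trading costs, which this sketch ignores.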
TUTORIAL