
DeepThought 1.4.2

Machine Learning for Financial Trading Systems

Deep Thought Software (NZ) Ltd
www.deep-thought.co

© September 2014

Contents

1 Introduction   1
  1.1 Software Requirements   1
  1.2 Data   2
  1.3 Configuration   2
  1.4 Output Files   2

2 Data   3
  2.1 Importing Historical Data from MT4   3
      2.1.1 Exporting data as CSV from MT4   3
      2.1.2 Importing MT4 CSV data into a DeepThought Database   4
  2.2 Importing from Dukascopy   4
  2.3 Importing from HistData   4

3 Terminology   5

4 Machine Learning   6
  4.1 Support Vector Machines (SVM)   6
  4.2 Gradient Boosted Trees (GBT)   7
  4.3 Random Forests   7
  4.4 Extremely Randomised Trees   7
  4.5 Multi-Layer Perceptron   8
  4.6 Ensembles   8
  4.7 Continuous Features   9
      4.7.1 Feature Normalisation   9
  4.8 Categorical Features   9

5 Backtesting   11
  5.1 Backtesting Setup   11
  5.2 Recording and Using Recorded Signals   12
  5.3 Order Fill Simulation   12
  5.4 Paper Trading   13
  5.5 Files Produced During Backtesting and Paper Trading   13

6 Genetic Algorithm for Parameter Search   14
  6.1 Configuration   15
  6.2 Running the Genetic Algorithm   16
      6.2.1 Database   18
  6.3 Genetic Algorithm Results   18
  6.4 Using Recorded Results   18
  6.5 The Condor Submit File   18
  6.6 Trouble Shooting   19

7 Live and Paper Trading   20
  7.1 Manual Trading   20
  7.2 Automated Trading   21
  7.3 Trouble Shooting   22

8 Python Scripting   24
  8.1 Python Installation   24
  8.2 Python Feature   25
  8.3 Python Target   28
  8.4 Python Predictor   30
  8.5 Python Signal Generation   32
  8.6 The deep thought intf Interface Object   33

9 Configuration Details   35
  9.1 bar-series   35
      9.1.1 Renko Bars   38
      9.1.2 Summary of bar-series Options   40
  9.2 bar-series-collection   40
  9.3 model   41
  9.4 Features   42
      9.4.1 hour-of-day   43
      9.4.2 day-of-week   44
      9.4.3 bar-diff   45
      9.4.4 bar-attribute   47
      9.4.5 moving-average   49
      9.4.6 csv-feature   51
      9.4.7 python-script   53
  9.5 Targets   55
      9.5.1 bars-in-future   55
      9.5.2 python-script   56
  9.6 Predictors   58
      9.6.1 svm-predictor   58
      9.6.2 linear-svm-predictor   62
      9.6.3 gbt-predictor   64
      9.6.4 random-forest-predictor   65
      9.6.5 extremely-randomised-trees-predictor   66
      9.6.6 multi-layer-perceptron-predictor   66
      9.6.7 python-predictor   68
  9.7 predictor-ensemble   68
  9.8 signal-generator   70
  9.9 trader   72
  9.10 backtest   73
  9.11 genetic-algo   74

10 Commandline Tools   77
  10.1 Candle Statistics (--stats)   78
  10.2 Generate Bars (--generate-bars)   79
  10.3 Generating a Manual Signal   80
  10.4 Generating Feature Statistics (--generate-feature-stats)   80
  10.5 Extracting a Training Set (--extract-training-set)   81
  10.6 SVM Grid Search (--svm-param-search-c)   81
  10.7 GBT Grid Search (--gbt-param-search-c)   82
  10.8 Printing XML Configuration Documentation (--print-config)   83

11 Fundamental Indicators (Experimental)   92
  11.1 Fundamental Feature   92

12 Tutorial: Preparing the Commandline   94
  12.1 Step 1: Open the commandline   94
  12.2 Step 2: Open the defaults window   95
  12.3 Step 3: Change the font   95
  12.4 Step 4: Change the default window size   96

13 Tutorial: Backtesting in DeepThought and MT4   97
  13.1 Step 1: Edit the configuration   97
  13.2 Step 2: Start the DeepThought backtest   98
  13.3 Step 3: Copy files to Metatrader   99
  13.4 Step 4: Modify the EA   99
  13.5 Step 5: Running Metatrader Strategy Tester   99
  13.6 Step 6: Optimisation with MT Strategy Tester   100
  13.7 Step 7: Analyse the Results   102

A Sample Configuration   104

B Condor Setup and Operation   108
  B.1 Installation   108
      B.1.1 Adding a Condor User   112
  B.2 Useful Commands   113
      B.2.1 condor_status   113
      B.2.2 condor_q   113
      B.2.3 condor_rm   114

List of Figures

4.1  Overfitting   7
4.2  Multi-Layer Perceptron with 3 inputs, 5 hidden and 2 output neurons.   8

9.1  Type 1 Renko Bars   38
9.2  Type 2 Renko Bars   38

12.1 Opening the DeepThought Commandline   94
12.2 Opening the Defaults Window   95
12.3 Changing the Font   95
12.4 Changing the Window Size/Layout   96

13.1  Editing the Configuration   98
13.2  Starting the Backtest   98
13.3  The Completed Backtest   99
13.4  Metatrader tester setup   100
13.5  The Completed Backtest   100
13.6  Enabling the Genetic Optimisation in Metatrader   101
13.7  Selecting which Parameters to Optimise   101
13.8  Enabling Optimisation in the Strategy Tester   102
13.9  List of the Best Results of the Metatrader Optimiser   102
13.10 Report of the Optimum Settings   102
13.11 Graph of a Test With Optimum Settings   103

B.1  Condor Setup 1   108
B.2  Condor Setup 2   109
B.3  Condor Setup 3   109
B.4  Condor Setup 4   110
B.5  Condor Setup 5   110
B.6  Condor Setup 6   111
B.7  Condor Setup 7   111
B.8  Condor Setup 8   112
B.9  Condor Setup 9   112

List of Tables

4.1  Normalisation and Scaling Schemes   9
4.2  Binarising categorical variables.   10

5.1  Files Produced During Backtesting/Paper Trading   13

6.1  parameter configuration options for the genetic-algo configuration section.   16

7.1  DeepThought parameters for the Metatrader EA.   22

8.1  Summary of the deep thought intf interface object   34

9.1  Sections in the XML configuration file   36
9.2  The effect of the delay-minutes-offset parameter on intraday candles.   37
9.3  bar-series configuration options.   40
9.4  Features used as independent inputs to machine learning models.   42
9.5  hour-of-day feature.   43
9.6  day-of-week feature.   44
9.7  Price difference examples for the bar-diff feature.   45
9.8  bar-diff feature parameter options.   46
9.9  bar-attribute feature parameter options.   47
9.10 moving-average feature parameter options.   50
9.11 csv-feature parameter options.   52
9.12 python-script feature parameter settings.   53
9.13 bars-in-future target.   55
9.14 python-script target.   57
9.15 svm-predictor configuration options.   60
9.16 params configuration options for the svm-predictor.   61
9.17 linear-svm-predictor configuration options.   63
9.18 params configuration options for the linear-svm-predictor.   63
9.19 params configuration options for the gbt-predictor.   64
9.20 gbt-predictor options.   64
9.21 params configuration options for the random-forest-predictor.   65
9.22 Random Forest random-forest-predictor options.   65
9.23 Multi-layer Perceptron multi-layer-perceptron-predictor options.   67
9.24 params configuration options for the multi-layer-perceptron-predictor.   67
9.25 python-predictor parameter settings.   69
9.26 retrain-period options for the predictor-ensemble.   69
9.27 signal-generator configuration options   70
9.28 trader configuration options.   72
9.29 backtest options.   73
9.30 genetic-algo options.   75
9.31 parameter configuration options for the genetic-algo configuration section.   76

10.1 Column meanings using the --stats commandline option.   79
10.2 --generate-bars parameters.   80

11.1 title values for the fundamental-indicator feature.   93

B.1  The Job States in Condor.   114

Listings

8.1  Python feature configuration.   25
8.2  Python script example defining a feature.   27
8.3  Python target configuration.   28
8.4  Python script example defining a target.   29
8.5  Python predictor configuration example.   30
8.6  Python script example for a predictor.   31
8.7  Python signal generator configuration example.   32
8.8  Python script example for the signal generator.   33

Chapter 1

Introduction
DeepThought is a sophisticated software package for creating trading systems utilising state-of-the-art
machine learning algorithms. Currently supported are Support Vector Machines (SVM),
Linear Support Vector Machines (LSVM), Gradient Boosted Trees (GBT) and Random Forests.
Other methods will be added over time if there is a potential benefit to trading.
This software tool is designed for people who are serious about their trading. It does have a
learning curve, so be prepared to spend some time understanding and researching before rushing
to live trading. If you are looking for a $99 get-rich-quick EA[1] and do not want to spend any
time developing your own system, then this probably isn't the tool for you. If you believe that
$99 get-rich-quick EAs actually exist and do what they claim, then this definitely isn't the tool
for you.
Predicting financial markets is a difficult problem. The patterns we are attempting to forecast are
extremely weak. There are many academic papers that discuss which machine learning algorithm
is best: SVM versus Neural Networks versus Random Forests, etc. We have found that the
features that make up an observation used for forecasting are much more important
than the actual algorithm. If you have a feature set which does not contain any patterns, then
whichever technique you use will not work. Thus it is better to spend the majority of your
time working on feature selection and engineering rather than fussing about SVM versus Neural
Networks.
DeepThought is a command line tool that operates on XML configuration files. A DLL version
integrates with Metatrader for live trading. Both the DLL and the command line EXE are built
from the same source code. The configuration file contains all the settings necessary for both
backtesting and live trading, thus once a good configuration has been found, the DLL can use
the same configuration without modification. A genetic algorithm can be used for parameter
search.

1.1 Software Requirements

Scripts in Python are provided to perform analysis on backtested configurations, and to import
data into DeepThought databases. It is suggested that Python(x,y) is used as it provides a
full Python development environment, including common scientific, mathematical and plotting
libraries. It can be downloaded from code.google.com/p/pythonxy/

[1] EA stands for Expert Advisor, Metatrader's terminology for a script which trades automatically without
human intervention.


1.2 Data

DeepThought needs access to historical data for backtesting (when not connected to a trading platform),
live trading and paper-trading, so data is stored in separate Sqlite databases. These can be
inspected using any Sqlite tool such as Sqliteman, available for free from sqliteman.com.
If you have access to reliable historical data, then you should import this into the database.
Python scripts have been supplied for this purpose: see sections 2.1.2 and 2.3 for more details.
This database is also used for live trading. When Metatrader is running, the DeepThought EA
collects market ticks and passes them to the DLL. The ticks are used to create 1 minute candles,
which are then stored in the database and used for signal generation. Metatrader is only
used as an order placement/management system; all signal logic is contained in the DLL.
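The candle-building step is handled inside the DLL and is not exposed to the user. Purely as an illustration of the idea, a minimal Python sketch of aggregating ticks into 1 minute candles might look like the following; the tick format and function name are assumptions, not DeepThought's actual interface:

from collections import OrderedDict

def ticks_to_m1_candles(ticks):
    """Aggregate (epoch_seconds, bid) ticks into 1 minute OHLC candles.

    Illustrative only: the DeepThought DLL does this internally before
    writing the candles to the Sqlite database.
    """
    candles = OrderedDict()
    for ts, bid in ticks:
        minute = ts - (ts % 60)          # truncate to the start of the minute
        c = candles.get(minute)
        if c is None:
            candles[minute] = {"open": bid, "high": bid, "low": bid, "close": bid}
        else:
            c["high"] = max(c["high"], bid)
            c["low"] = min(c["low"], bid)
            c["close"] = bid
    return candles

# Example: three ticks falling inside the same minute
print(ticks_to_m1_candles([(1393632000, 1.3740), (1393632030, 1.3744), (1393632059, 1.3738)]))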

1.3 Configuration

This is the heart of the system. Use a text editor such as Notepad++ to edit XML configuration
files. A few samples are supplied in the examples file. Configuration files must be in their own
unique directory with the filename config.xml or config.xml. This is because you will likely
be working on several config files at the same time, or want to keep previous config files with
their output for reference purposes. It is easier to keep all related files together in the same
directory as it minimises clutter.

1.4 Output Files

During backtesting, paper-trading and live trading, various files are produced recording the
signals generated, the log, the PnL and the feature statistics. These are listed in table 5.1 on page
13.

Chapter 2

Data
Data is stored in Sqlite databases. Each instrument has its own database. These are generally
stored in the directory C:\FX Database (in Windows). These databases are used both for
backtesting and for live trading. There are certain limits with EA access to data in MT4
which can only be overcome by not using MT4 for historical data access. Data can also be
imported from other sources such as www.histdata.com, and a Python script is provided for
this purpose.
The DeepThought EA running in MT4 builds 1 minute candles from ticks passed from MT4.
These are automatically stored in the database as they are created.

2.1 Importing Historical Data from MT4

Data can be exported as CSV files from MT4. The first task is to ensure MT4 has the maximum
amount of data available from the broker.

2.1.1 Exporting data as CSV from MT4

You can export data from MT4 to a CSV file in the following way:
- Open a 1 minute chart on the instrument that you want data for.
- Select Tools → Options and, in the Charts tab, make sure Max bars in history and Max bars in chart are set to something huge. If not, enter something like 9999999999999.
- Make sure auto-scrolling is off by checking Charts → Auto Scroll.
- Press and hold the Page Up key. This forces MT4 to download older data and is more reliable than Tools → History Centre → Download. This can take a while and the amount of data available depends on your broker.
- Once the chart stops downloading data, select Tools → History Center and navigate to the 1 Minute (M1) option of the desired instrument. Click on Export and save as a CSV file.


2.1.2 Importing MT4 CSV data into a DeepThought Database

The python directory contains scripts to import historical data. To import data from a CSV file exported from MT4, use the following command:
python import_mt4_csv.py -d C:\FX_database\EURUSD.db -c EURUSDm1.csv -n
The above command, run from the python directory, will create a database EURUSD.db in the
C:\FX Database directory and import the data in EURUSDm1.csv. The script assumes the file
EURUSDm1.csv is in the same directory as the script. The -n parameter will create a new
database. If you have an existing database, omit this parameter and the new data will be
merged with existing data. It will not overwrite any conflicts, but will fill in the gaps of any
missing data.
It is useful to run the above script once a week, maybe at the weekend, to ensure any data gaps
caused due to network outages, or other interruptions to the DeepThought EA running in MT4,
are filled.

2.2 Importing from Dukascopy

Dukascopy (www.dukascopy.com) makes historical tick data available for free. This can be
downloaded using a free tool at www.strategyquant.com/tickdatadownloader/. Note that
DeepThought is not associated or affiliated with Dukascopy or Strategyquant in any way.
Once the tick data has been downloaded using the above tool, it can be imported using the
command:
deepthought --import-dukascopy-csv D:\TickDataDownloader\tickdata\EURUSD.csv
--dbname C:\FX Database\EURUSD.db

where the tick downloader has downloaded and created a single CSV file in
D:\TickDataDownloader\tickdata\EURUSD.csv. A new database will be created if it doesn't
exist; otherwise the new data will be merged with an existing database. When merging, the old
data will not be overwritten.

2.3 Importing from HistData

In the python directory there is a script for importing historical data files downloaded from
www.histdata.com. Run this script in a similar way to the MT4 import script:
python import_histdata.py --db <db file> --dir <mt4 csv file> --createdb --unzip
The --createdb and --unzip parameters are optional. If the --createdb option is present, a
new database will be created. The process will fail if a database with the same name already
exists to prevent accidental overwriting. HistData files are downloaded in zip format. You can
unzip these manually yourself, or supply the --unzip option to have the script do this before
importing.

Chapter 3

Terminology
We define a Feature as a type of information used in the training/forecasting sets. Examples
for features are Hour of Day and Close price difference between two candles. Each
feature has at least one attribute. An attribute is the actual number or value used in the
training/forecasting set. The Hour of Day feature has one attribute, the hour, and the Close
price difference between two candles feature can have as many attributes as defined.
A feature vector is a series of feature attributes that form an observation. This could comprise,
for example, Hour of Day, Day of Week, 30 differences in close price and 10 differences in moving
averages, thus the feature vector would have 42 attributes.
An attribute is classed as either a continuous or a categorical variable. Continuous variables
are variables whose values are real numbers such as a change in price. Categorical variables
are variables that can only take specific values, such as day of week, which can only be one of
{sun,mon,tue,wed,thu,fri,sat}.
A label is the forecast variable, i.e. the thing we are trying to predict. When used during the
(supervised) training phase, each of the feature vectors used for training must be assigned a label. The
set of feature vectors together with their labels is the training set.
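To make the terminology concrete, a single labelled observation and a tiny training set could be written out as follows. The attribute values are invented for illustration only; DeepThought assembles these vectors internally from the configured features:

# One observation: a feature vector (hour of day, day of week, close-price
# differences, ...) paired with a label (+1 = close was higher one candle later).
feature_vector = [14, 2, 0.0012, -0.0003, 0.0021]   # illustrative attribute values
label = 1

# A training set is simply a collection of such (feature_vector, label) pairs.
training_set = [
    ([14, 2, 0.0012, -0.0003, 0.0021], 1),
    ([15, 2, -0.0007, 0.0004, -0.0015], -1),
]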
The current version of DeepThought focuses mainly on classification problems. That is, the
labels are 1 (for true) and -1 (for false). Regression problems attempt to forecast magnitude
as well as direction. There is limited support for regression problems, and this will be enhanced
in future versions. We have found that it is hard enough to predict whether the market will
move up or down, let alone by how much.
A label is typically something like "the close price is higher/lower at the end of the next candle
in the future". A label of 1 would indicate higher, and a label of -1 would indicate lower.
A model is the collection of parameters that define a feature vector. This would include
parameters such as how many previous close price differences to include, and how to scale the
values.
A predictor is a self-contained forecaster, such as an individual SVM or GBT.
After the training phase, a final model is built for each predictor. Note that this is a different usage of
the term model to the one given above. Currently these are stored in memory as retraining is
frequent. The model is used to forecast a label for an unlabelled feature vector.

Chapter 4

Machine Learning
It is beyond the scope of this manual to describe each of the machine learning algorithms in
detail. The interested reader should consider the Stanford and/or Caltech machine learning
course offered via iTunesU (for free).
Machine learning problems tend to be divided into two main approaches: classification, where
the goal is to forecast discrete classes; and regression, where the goal is to forecast a real
number.
DeepThought supports both classification and regression. For trading systems, it is probably
best to focus on classification as it is difficult enough to forecast whether the market will move up or down, let alone
by how much. Most classification problems are two-class. Multi-class problems are generally handled by
reducing the problem to several two-class problems, or a one-versus-all setting.
Often the process of applying machine learning to trading systems involves an offline step of
building a model, then deploying the model to the trading system. DeepThought takes a different
approach by enabling the system to continuously retrain. While it is possible to build a single
model and then forecast using only this model, the preferred mode of operation is to retrain after
the forecast and signal has been sent to the market. The sequence of events is:
1. At system spin-up, train all predictors.
2. New candle (or Renko bar) received and saved to database.
3. Forecasts made by ensemble of predictors and combined into a single signal.
4. Signal sent to trading platform (e.g. Metatrader).
5. All predictors retrained, ready for the next candle to complete to trigger the next forecast.
Thus your system can continuously adapt to the market.
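The sequence above can be pictured as a simple loop. The following is only a schematic Python sketch of that flow; the object and function names are invented and the real logic lives inside the DeepThought EXE/DLL:

def combine_forecasts(forecasts):
    # Majority vote: +1 if more predictors say "up" than "down", else -1.
    return 1 if sum(forecasts) > 0 else -1

def run_trading_loop(predictors, candle_feed, trading_platform):
    # 1. At system spin-up, train all predictors on the available history.
    for p in predictors:
        p.train()

    for candle in candle_feed:          # 2. a new candle (or Renko bar) completes
        candle.save_to_database()

        # 3. each predictor forecasts; the ensemble combines them into one signal
        forecasts = [p.forecast(candle) for p in predictors]
        signal = combine_forecasts(forecasts)

        trading_platform.send(signal)   # 4. signal sent to e.g. Metatrader

        for p in predictors:            # 5. retrain, ready for the next candle
            p.train()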

4.1 Support Vector Machines (SVM)

The parameters associated with SVMs are: the kernel type, C (penalty), g (gamma, used with the radial basis function
kernel) and e (epsilon, only for regression). Generally the radial basis function kernel is used with classification, so the only parameters to select are C and g. The DeepThought command line tool has an
option to perform a grid search using 5-fold cross validation. This means the results provided
are for out-of-sample data, avoiding over-fitting. See 10.6 on page 81 for details on how to do a
grid search.
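DeepThought performs this grid search itself from the command line. Purely to illustrate what a 5-fold cross-validated search over an exponential C/gamma grid looks like, here is a sketch using scikit-learn on synthetic data; scikit-learn is not used by DeepThought and the grid bounds are arbitrary:

import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Synthetic two-class data standing in for a real feature matrix and labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 10))
y = np.where(X[:, 0] + 0.5 * X[:, 1] > 0, 1, -1)

# Exponential grid for C and gamma, evaluated with 5-fold cross validation,
# so the reported score is estimated on out-of-sample folds.
param_grid = {"C": [2.0 ** k for k in range(-2, 6)],
              "gamma": [2.0 ** k for k in range(-8, 0)]}
search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=5)
search.fit(X, y)
print(search.best_params_, search.best_score_)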


DeepThought supports linear SVMs and kernel SVMs. Linear SVMs are faster than kernel
SVMs, but may not perform well if the problem is non-linear; i.e. the dependent variable
(the thing we are forecasting) is not a linear combination of the inputs. Kernel SVMs use a
kernel function such as a Gaussian to map inputs into a higher-dimensional space, then a linear
algorithm is applied to these higher dimensional features. This enables them to model non-linear
relationships. The trade-off is that they can be prone to overfitting and care must be taken to
avoid this by properly evaluating on out-of-sample test data. Overfitting is where the model has
very accurately fitted the training data but does not generalise the underlying relationship well.
This is illustrated in figure 4.1.

Figure 4.1: Overfitting where the green line has overfitted the training data. A better fit is the
black line where the underlying function has been modelled by allowing a few mis-classifications
in the training data.

4.2 Gradient Boosted Trees (GBT)

GBT is a decision-tree process. It works by creating an initial decision tree, then creating successive
trees that are trained on the errors of the previous trees. This is termed a greedy algorithm. The
more trees the better, and this method is good at avoiding overfitting. An advantage of decision-tree
based methods, including the Random Forests detailed below, is that no normalisation or
outlier removal is required. Two parameters are required by the GBT: the number of trees and the tree
depth.
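As a rough illustration of the boosting idea (each new tree fitted to the errors of the trees before it), the toy regression sketch below uses scikit-learn decision trees. It is not DeepThought's GBT implementation, and the learning rate is an extra detail not discussed above:

import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(1)
X = rng.uniform(-3, 3, size=(300, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.1, size=300)

trees, learning_rate = [], 0.1
prediction = np.zeros_like(y)
for _ in range(200):                       # the "number of trees" parameter
    residual = y - prediction              # errors of the ensemble so far
    t = DecisionTreeRegressor(max_depth=2) # the "tree depth" parameter
    t.fit(X, residual)                     # each tree is trained on the residuals
    prediction += learning_rate * t.predict(X)
    trees.append(t)

print("training MSE:", float(np.mean((y - prediction) ** 2)))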

4.3 Random Forests

Random forests work by combining the results of many weak predictors to form a stronger
predictor. The weak predictor is a decision tree, and each decision tree is built on a random
subset of features and samples. When forecasting, the final prediction is the most common class
for classification, or an average of each tree's prediction for regression.

4.4 Extremely Randomised Trees

This is a variation of Random Forests where a different method is used to select the features/data for each tree.


4.5 Multi-Layer Perceptron

Also known as Neural Networks, this is probably the most widely familiar form of machine
learning. DeepThought supports two methods of training a multi-layer perceptron: back-propagation and Rprop. For details on back-propagation see http://en.wikipedia.org/wiki/Backpropagation, and for Rprop see http://en.wikipedia.org/wiki/Rprop.
A multi-layer perceptron is made up of a number of neurons, connected in layers. The first
layer takes the input so the number of neurons in this layer is always equal to the number
of attributes in the input. The output layer is where we read the forecast so the number of
neurons is equal to the number of attributes that make up the forecast variable. For a two-class
classification problem there will be two output neurons, and for a regression problem there will
only be one.
The multi-layer perceptron can also contain hidden layers. Normally there is only one hidden
layer, but we can have more than one. The topology is illustrated in figure 4.2.

Figure 4.2: Multi-Layer Perceptron with 3 inputs, 5 hidden and 2 output neurons.
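To connect figure 4.2 to the arithmetic, the forward pass of a 3-5-2 network can be sketched in a few lines of numpy. The weights here are random, whereas in practice they are learned by back-propagation or Rprop:

import numpy as np

rng = np.random.default_rng(2)
W1, b1 = rng.normal(size=(5, 3)), np.zeros(5)   # input layer -> 5 hidden neurons
W2, b2 = rng.normal(size=(2, 5)), np.zeros(2)   # hidden layer -> 2 output neurons

def forward(x):
    hidden = np.tanh(W1 @ x + b1)     # hidden-layer activations
    return W2 @ hidden + b2           # two outputs, one per class

x = np.array([0.2, -1.0, 0.5])        # one feature vector with 3 attributes
print(forward(x))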

4.6 Ensembles

Ensembles have been described as the closest thing to a free lunch in machine learning. Much
effort has been put into the implementation of ensembles in DeepThought. Each of the predictor
types can have one or more sets of parameters. One of the drawbacks of SVMs is that hyper-parameters must be selected. For classification these are C and gamma. As the patterns we are
attempting to predict are extremely weak, we can never be sure that a single set of specific
values will perform as well as indicated during cross-validation. A way around this is simply to
use an ensemble of all values and use the majority vote as the final prediction. This is what
each predictor does.
We can also mix different predictor types, and have multiple models per predictor type. The
number and variety of predictors is limited only by computational power. DeepThought
will use all available cores on your PC during backtesting, but it can still be slow when large
ensembles are used. A future version will be able to spread a single backtest across several
machines. Note that the genetic algorithm can use an unlimited cluster of machines by utilising
the Condor system.
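The exact combination rule is governed by the configuration (each predictor carries a forecast-weight, see chapter 9). As a hedged sketch of the general idea only, a weighted majority vote over two-class forecasts could look like this:

def ensemble_vote(forecasts, weights=None):
    """Combine two-class forecasts (+1/-1) into a single signal.

    Illustrative weighted majority vote, not DeepThought's exact rule.
    """
    if weights is None:
        weights = [1.0] * len(forecasts)
    score = sum(w * f for w, f in zip(weights, forecasts))
    if score > 0:
        return 1
    if score < 0:
        return -1
    return 0   # tied vote: no signal

# e.g. three SVMs with different (C, gamma) pairs and one GBT
print(ensemble_vote([1, 1, -1, 1], weights=[1.0, 1.0, 1.0, 0.5]))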


4.7 Continuous Features

Continuous features are features whose value is a floating point number, i.e. they can take any
value. An example is the difference between two close prices. Different features can have different
ranges. For example, comparing a feature of the difference between two close prices with a feature
of the difference between moving average values 100 bars apart, we can see that the latter will
have a greater range than the former. To adjust for this, feature values must be normalised
to bring them into more-or-less the same range. This prevents a feature with a large range
squashing or overwhelming features with smaller ranges. DeepThought has several methods of
approaching this.

4.7.1 Feature Normalisation

Normalisation is the process that scales the values of each feature in the same range. The
parameters found during normalisation are used to scale the feature vector used for forecasting.
DeepThought supports several techniques for normalising training/forecasting features, listed in
table 4.1.
Table 4.1: Normalisation and Scaling Schemes

Scaling Type   Description
min-max        scale all values between -1 and 1 using the minimum and maximum values for the feature value
zscore         for each feature value, subtract the mean and divide by the standard deviation. The resulting scaled feature has a mean of zero and a standard deviation of one.
div-sd         divide each feature by the standard deviation
div-max        divide each feature by the maximum of the absolute values of the maximum and minimum
log10          take the base-10 logarithm of each feature value

Which scheme works best is a task for trial-and-error, but starting with min-max and zscore is
recommended.
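All of the schemes in table 4.1 are simple element-wise transforms. As a sketch of the first two (the scaling parameters found on the training data are reused to scale the forecasting vector, as described above):

import numpy as np

def min_max_scale(train_values, x):
    """Scale into [-1, 1] using the training minimum and maximum (min-max)."""
    lo, hi = float(np.min(train_values)), float(np.max(train_values))
    return 2.0 * (x - lo) / (hi - lo) - 1.0

def zscore_scale(train_values, x):
    """Subtract the training mean and divide by the training standard deviation (zscore)."""
    mu, sd = float(np.mean(train_values)), float(np.std(train_values))
    return (x - mu) / sd

train = np.array([-0.004, 0.001, 0.003, -0.002, 0.002])  # e.g. close-price differences
print(min_max_scale(train, 0.001), zscore_scale(train, 0.001))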

4.8 Categorical Features

Categorical features have specific values. An example is day of week. We could map days of
week to an integer in the range 0...6 and use that as a continuous value and treat it as above, or
we could binarise it into multiple attributes. DeepThought supports both approaches. When a
feature is binarised, it is mapped into multiple attributes that can take values of zero or one. A
set of attributes for the feature can have only one attribute with the 1 value and all others are
0. This is sometimes called a one-hot vector approach. Table 4.2 illustrates this process.


Table 4.2: Binarising categorical variables.

Day of Week   Binarised encoding
Sunday        1000000
Monday        0100000
Tuesday       0010000
Wednesday     0001000
Thursday      0000100
Friday        0000010
Saturday      0000001
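The encoding in table 4.2 is straightforward to reproduce; a small sketch, with the day ordering matching the table:

DAYS = ["sun", "mon", "tue", "wed", "thu", "fri", "sat"]

def binarise_day(day):
    """Return the one-hot ('binarised') encoding of a day-of-week value."""
    vec = [0] * len(DAYS)
    vec[DAYS.index(day)] = 1
    return vec

print(binarise_day("wed"))   # [0, 0, 0, 1, 0, 0, 0]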

Chapter 5

Backtesting
You will probably spend a lot of time in backtesting as there is a lot of trial-and-error involved in
creating a system. The built-in backtester is capable of simulating market and limit orders, take
profit, stop loss and move to break even. It operates on 1 minute candles. The backtester also
operates when in live trading mode so you can compare actual results with simulated ones in real
time. It also functions as a paper-trader when running in live trading mode and the EA is
set to not place any actual trades.
Chapter 9 on page 35 describes configuration settings in detail. This chapter focuses on the
process of backtesting.

5.1 Backtesting Setup

The first step is to create a unique directory that contains a configuration file. This has the
filename either config.xml or config.xml. The latter filename ensures the configuration
file is always at the top of a directory listing. Each configuration file is kept in a separate
directory as files are created during the backtest for later analysis. If you are working on several
configurations at the same time, or want to keep previous configuration files with their results,
then having each in a separate directory avoids clutter.
The configuration file contains a section named backtest. A typical setup is shown below:
<backtest>
    <start-date>2013-01-01</start-date>
    <stop-date>2013-12-08</stop-date>
    <use-recorded-signals>True</use-recorded-signals>
    <display-progress>False</display-progress>
    <execute-when-complete>python "C:\DeepThought\python\analyse_backtest_results.py" %CONFIG_LOCATION%</execute-when-complete>
</backtest>

The display-progress setting turns on/off the display of trades as they are closed in the console.
Windows display of text in the console is slow (compared to Linux), so if you are using recorded
signals as described below, turning the progress display off can speed up the backtest further. You
don't lose anything by turning the progress off as all results, including progress, are logged in the
various output files.
The actual backtest is started by the command:
deepthought --backtest C:\configs\EURUSD MA TEST

where C:\configs\EURUSD MA TEST is the directory where the config.xml is located.

5.2 Recording and Using Recorded Signals

During the backtesting (and paper-trading) process, the signals are recorded to a file and stored
in the same directory as the configuration file. If you are using a large ensemble of machine
learning predictors, a backtest over a year can take hours or even days. Sometimes you may
not be changing the machine learning settings, but experimenting with other settings such as
the take profit or trigger. In this situation you can run the backtest once to generate and record
the signals. Before running the next backtest, set the use-recorded-signals setting to True;
the next time the backtest is run, the machine learning training and forecasting will be bypassed
and the signals looked up from the recorded signals file. This dramatically shortens the time to
run a backtest, provided none of the machine learning settings have been altered.
Another use of the recorded signals file is for backtesting in Metatrader. An EA has been
provided which uses these signals in Metatrader's strategy tester. The recorded signals file is
named recorded.signals.csv and must be copied to:
<Metatrader-install-dir>\tester\files
For example, if your broker was InterbankFX this directory would be
C:\Program Files (x86)\InterbankFX\tester\files
This is a restriction by Metatrader as EAs run in the strategy tester cannot access files outside
this location. The source has been provided for this EA so you could adapt an existing system
to utilise the signals.

5.3 Order Fill Simulation

At the close of each 1 minute candle, the simulator looks at the high and low prices and decides
if order prices have been hit. Orders can have optional take-profit and stop-loss prices.
There is also an optional break-even setting. If this has been set, a stop-loss is automatically
set with a price equal to entry price plus 1 pip for a buy order, and entry price minus 1 pip for
a sell order. The typical sequence of events in order fill simulation is:
1. Signal indicates an order is to be placed.
2. Order is placed. If it is a market order, it is immediately filled at the last known bid price
for a sell, and the last known bid + spread for a buy. If it is a limit order, the price is
checked at the end of the next 1 minute candle.
3. At the end of the 1 minute candle, limit orders are checked for fills by looking at the candle's
high and low prices.
4. Check take-profit, stop-loss and break-even. If the take-profit or stop-loss has
been hit, close the position. If break-even has been set and the price has been reached,
set a stop-loss at break-even +1 pip.
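As a rough sketch of the fill checks just described (candle highs and lows deciding whether a limit price, take-profit or stop-loss was touched), the logic could look like the following; the prices are illustrative and this is not the simulator's actual code:

def limit_filled(order_price, side, candle_high, candle_low):
    """A buy limit fills if the candle traded down to the price; a sell limit if it traded up."""
    if side == "buy":
        return candle_low <= order_price
    return candle_high >= order_price

def check_exit(position_side, take_profit, stop_loss, candle_high, candle_low):
    """Return 'tp', 'sl' or None for an open position after a 1 minute candle closes."""
    if position_side == "buy":
        if take_profit and candle_high >= take_profit:
            return "tp"
        if stop_loss and candle_low <= stop_loss:
            return "sl"
    else:  # sell
        if take_profit and candle_low <= take_profit:
            return "tp"
        if stop_loss and candle_high >= stop_loss:
            return "sl"
    return None

# Break-even: once the break-even price is touched, the stop-loss is moved to entry +/- 1 pip.
print(check_exit("buy", take_profit=1.3750, stop_loss=1.3680,
                 candle_high=1.3755, candle_low=1.3710))  # 'tp'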

5.4 Paper Trading

The provided MT4 EA has a setting named do live trade. When this is set to false, no live
trades are placed, but DeepThought will continue to simulate trades using live market data, and
populate the database. It is worthwhile to always run a paper-trader for the purpose of keeping
the database for each instrument you use up to date. The files produced during paper-trading
are identical to the files produced during backtesting as the same process is used.

5.5 Files Produced During Backtesting and Paper Trading

Various files are produced by the trade simulator during backtesting and paper trading. These
are detailed in table 5.1.
Table 5.1: Files Produced During Backtesting/Paper Trading

backtest.log                 The log file detailing all events during backtesting. Useful for debugging.
daily.returns.csv            The daily returns of the backtest. The trade open date-time is used to group trades to the same day.
pnl.csv                      A record of each individual trade.
recorded.signals.csv         The signals generated by the ensemble. Used in playback during subsequent backtests, and used by MT4 in the strategy tester.
statistics.h4-features.csv   The statistics (min, max, mean, stddev) of each attribute in a model. Useful to spot data errors as values should be reasonably stable over time. Any sharp or large changes should be investigated. This filename example is for a model named h4-features in the configuration.
svm-c-rbf.forecasts.csv      For each predictor, a file is generated detailing the forecasts it made. Useful for external analysis. This filename example is for a predictor named svm-c-rbf in the configuration.
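The supplied analyse_backtest_results.py script works on these files. Purely as a hedged illustration (the exact column layout of daily.returns.csv is not documented in this chapter, so the column name below is an assumption), the daily returns could be summarised like this:

import csv, math

def summarise_daily_returns(path, column="return"):
    """Compute mean, standard deviation and a naive annualised Sharpe-style ratio.

    The column name is an assumption for illustration; inspect the CSV header first.
    """
    with open(path, newline="") as f:
        returns = [float(row[column]) for row in csv.DictReader(f)]
    mean = sum(returns) / len(returns)
    var = sum((r - mean) ** 2 for r in returns) / len(returns)
    sd = math.sqrt(var)
    return {"days": len(returns), "mean": mean, "stddev": sd,
            "sharpe": mean / sd * math.sqrt(252) if sd else float("nan")}

# print(summarise_daily_returns("daily.returns.csv"))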

Chapter 6

Genetic Algorithm for Parameter Search
A genetic algorithm is a way of searching a large search space using methods inspired by biology.
In DeepThought the genetic algorithm is used for parameter selection. We could use a brute-force
approach and test every combination of parameters available; however, the (usually) very
large number of combinations makes this infeasible.
A genome defines a list of parameters. This list of parameters is tested in a backtest to produce
a score which is used to rank the parameter set. In genetic algorithm terminology the backtest
generates the objective function which is specified in the configuration as Sharpe Ratio,
Accuracy or Profit (in Pips). You could potentially run separate genetic algorithms and
optimise on all objective functions and combine the results in an ensemble.
DeepThought uses the Condor high performance computing clustering system. It is available for
download from http://research.cs.wisc.edu/htcondor/. Condor is a system that clusters
individual computers together to form a high performance cluster. It can operate on a single
computer, so you can still run the genetic algorithm if you only have access to a single computer.
Condor is a system for running many jobs in parallel, so it has uses well beyond our use of it for
genetic algorithms. Although it is beyond the scope of this document to provide detailed information
on installing and using Condor, we explain the parts relevant to DeepThought. Further detail is
given in appendix B.
The genetic algorithm in DeepThought operates in the following way:
1. DeepThought is started as a GA Server.
2. A random population of genomes is created. This is the first generation.
3. A configuration file is produced from a template for each genome.
4. A Condor submit file is produced and the population submitted to Condor. Each individual configuration file is run on one core of the cluster in parallel with other configuration
files.
5. DeepThought listens on a TCP port for backtests to finish and send a summary of results.
6. As each backtest completes, a summary is transmitted via UDP to DeepThought GA
Server. The log files and other outputs are sent back to the server and stored in individual
directories for later analysis if required.
7. After a configurable timeout has been reached, all running jobs are terminated. This step
is skipped if all jobs complete before the timeout.

8. The backtest results produced by the genomes are assessed and, using the parameters detailed
in section 9.11, the next generation of genomes is produced.
9. Steps 3 to 8 are repeated until the number of generations specified in the configuration
has been reached. Alternatively, the GA will stop if all possible combinations have been
tested.
DeepThought keeps a list of all genomes and their results. This is to prevent the same parameter
combination being tested more than once. This file is persisted to disk after each generation;
it also operates as a save point and can be used to resume a genetic algorithm run in the event that
it was interrupted.

6.1 Configuration

A configuration file is supplied to DeepThought in the same format as for backtesting and
live/paper trading. It is used as a template: a configuration file is created for each parameter
combination (genome) to be tested. The configuration file must contain a genetic-algo section
similar to the configuration snippet below:
<genetic-algo>
    <ga-server>tcp://wraith</ga-server>
    <ga-server-port>55566</ga-server-port>
    <genome-id>-1</genome-id>
    <objective-function>sortino</objective-function>
    <timeout-minutes>360</timeout-minutes>
    <population-size>20</population-size>
    <mutation-probability>10</mutation-probability>
    <num-breeders-percent>30</num-breeders-percent>
    <min-num-breeders>30</min-num-breeders>
    <num-new-random-genomes>2</num-new-random-genomes>
    <num-generations>10</num-generations>
    <parameter id="stop-loss"   type="integer" low="10" high="200" step="5" />
    <parameter id="take-profit" type="integer" low="10" high="200" step="5" />
    <parameter id="time-of-day" type="categorical" values="h1,h4,single,none" />
    <parameter id="SVM-Penalty" type="exp-2" low="1" high="15" step="1" />
    <parameter id="SVM-gamma"   type="exp-2" low="-8" high="2" step="1" />
</genetic-algo>

See section 9.11 for a detailed explanation of the options. To have the genetic algorithm modify
values in a configuration file, the file must have the XML attribute ga-subst defined on each value that
can vary, where the value of ga-subst is equal to the parameter id defined in the genetic-algo
section. The example below illustrates this:
<feature>
    <type>hour-of-day</type>
    <period ga-subst="time-of-day">h4</period>
</feature>
<feature>
    <type>bar-attribute</type>
    <attribute-type>average-close</attribute-type>
    <number ga-subst="average-close-num">30</number>
    <value-type>diff</value-type>
    <bar-series>EURUSDh4</bar-series>
    <scale-type>min-max</scale-type>
</feature>
...
<svm-predictor>
    <identifier>svm-c-rbf</identifier>
    <model>h4-features</model>
    <continuous-tune>false</continuous-tune>
    <params> <!-- 48.6% -->
        <penalty ga-subst="SVM-penalty">8</penalty>
        <gamma ga-subst="SVM-gamma">0.015625</gamma>
        <forecast-weight>1.0</forecast-weight>
        <svm-type>SVC</svm-type>
        <kernel>rbf</kernel>
    </params>
    <num-training-observations>2000</num-training-observations>
    <num-training-skip>1</num-training-skip>
</svm-predictor>
...
<signal-generator>
    <entry-times>
        <hour>all</hour>
        <day-of-week>all</day-of-week>
    </entry-times>
    <entry-threshold>0.0</entry-threshold>
    <forecast-type>SVC</forecast-type>
    <take-profit ga-subst="take-profit">0.0</take-profit>
    <stop-loss ga-subst="stop-loss">0.0</stop-loss>
    <break-even>20.0</break-even>
    <exit-all-hour>-1</exit-all-hour>
    <trade-bar-series>EURUSDm1</trade-bar-series>
    <reverse-all>False</reverse-all>
</signal-generator>
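Behind the scenes, the GA server writes a concrete configuration file per genome by filling in every element that carries a ga-subst attribute. A minimal sketch of that substitution using Python's ElementTree is shown below; this is not DeepThought's actual code, and the output filename and genome values are invented:

import xml.etree.ElementTree as ET

def apply_genome(template_path, output_path, genome):
    """Replace the text of every element whose ga-subst id appears in the genome dict."""
    tree = ET.parse(template_path)
    for elem in tree.getroot().iter():
        param_id = elem.get("ga-subst")
        if param_id in genome:
            elem.text = str(genome[param_id])
    tree.write(output_path)

# e.g. one hypothetical genome produced by the genetic algorithm:
# apply_genome("config.xml", "config-genome-3003.xml",
#              {"stop-loss": 50, "take-profit": 120, "time-of-day": "h4",
#               "SVM-penalty": 2 ** 7, "SVM-gamma": 2 ** -8})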

The parameter types are defined in table 6.1.


Table 6.1: parameter configuration options for the genetic-algo configuration section.

Option        Description
integer       Used when the parameter can be modelled as an integer. The options available for the integer type are:
              low    The lowest value that the integer can take.
              high   The highest value that the integer can take.
              step   The value to increment/decrement for different values of this parameter.
categorical   Used when the parameter can only take certain (string) values. The options available for the categorical type are:
              values Comma separated list of values that this parameter can take.
exp-2         Used when the parameter is best suited to an exponential grid search. For example SVM penalty, SVM gamma and SVM epsilon are best searched using an exponential grid search. This means that rather than use values that are linearly spaced such as 5, 10, 15, 20, ..., we use values such as 2^1, 2^2, 2^3, 2^4, .... This results in final values of 2, 4, 8, 16, .... Note that negative numbers can be used and result in the final values being less than 1, e.g. 2^-5, 2^-4, 2^-3, 2^-2, ... become 0.03125, 0.0625, 0.125, 0.25, .... The options available for the exp-2 type are:
              low    The lowest value that the exponent can take.
              high   The highest value that the exponent can take.
              step   The value to increment/decrement the exponent.
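The exp-2 values are easy to enumerate. For instance, the SVM-gamma parameter above (low=-8, high=2, step=1) expands to the following grid (a quick check in Python, not DeepThought code):

low, high, step = -8, 2, 1          # the SVM-gamma parameter from the example
grid = [2.0 ** e for e in range(low, high + 1, step)]
print(grid)   # [0.00390625, 0.0078125, ..., 2.0, 4.0]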

6.2 Running the Genetic Algorithm

The genetic algorithm is started with the following command:


DeepThought --genetic-algo C:\configs\EURUSD MA TEST


This will use the file config.xml or \config.xml in the directory C:\configs\EURUSD MA TEST
in the same way as for backtesting described in section 5.1.
The progress of the genetic algorithm is printed to the console similar to the example below. In
this example, we are using a population of 20 on a single machine with 8 cores. As each backtest
completes, a summary of the results is displayed. The beginning of each line contains three
numbers: the first is the generation number, the second the genome number and the last the
number of genomes in a population. The Compute host is the name of the machine that the
backtest ran on. This is useful for monitoring a cluster of machines to see which machines are quicker
and running more backtests.
C:\DeepThought>DeepThought --genetic-algo C:\DeepThought_Configs\EURUSD_GA
DeepThoughtLib::GeneticAlgo::SubmitToCluster
Submitting job(s)....................
20 job(s) submitted to cluster 41.
Submitted to Condor cluster 41
2014-Jan-12 17:08:59.307337 Info: DeepThoughtLib::GeneticAlgo::WaitForResults Waiting for (20) results for generation 3. Num of jobs is 20. Max wait time is 06:00:00
3/1/20 3003: Obj=-0.102615 Sharpe=-0.102615 PnL=-259.2 dd=-2357.7 num=1616 %=50.5569
Compute host=Slartibartfast svm-gamma=-8 svm-penalty=7 . Time left is 05:38:49.
3/2/20 3006: Obj=-0.474801 Sharpe=-0.474801 PnL=-1489 dd=-4545.9 num=1616 %=49.1337
Compute host=Slartibartfast svm-gamma=-2 svm-penalty=4 . Time left is 05:38:47.
3/3/20 3002: Obj=-1.028 Sharpe=-1.028 PnL=-2684.7 dd=-5193.9 num=1378 %=48.4761
Compute host=Slartibartfast svm-gamma=0 svm-penalty=12 . Time left is 05:38:31.
3/4/20 3004: Obj=-1.37488 Sharpe=-1.37488 PnL=-5296.8 dd=-6997.9 num=1268 %=46.6088
Compute host=Slartibartfast svm-gamma=0 svm-penalty=0 . Time left is 05:38:30.
3/5/20 3014: Obj=-1.34507 Sharpe=-1.34507 PnL=-3374 dd=-4027.3 num=1619 %=48.7338
Compute host=Slartibartfast svm-gamma=-6 svm-penalty=5 . Time left is 05:38:09.
3/6/20 3008: Obj=-1.21692 Sharpe=-1.21692 PnL=-2878.6 dd=-3719.6 num=1619 %=48.2397
Compute host=Slartibartfast svm-gamma=-6 svm-penalty=8 . Time left is 05:34:31.
3/7/20 3001: Obj=-1.3953 Sharpe=-1.3953 PnL=-3371.4 dd=-3995.2 num=1619 %=47.8073
Compute host=Slartibartfast svm-gamma=-8 svm-penalty=11 . Time left is 05:26:46.
3/8/20 3038: Obj=-0.365059 Sharpe=-0.365059 PnL=-1207.7 dd=-4145.4 num=1600 %=51.625
Compute host=Slartibartfast svm-gamma=-5 svm-penalty=0 . Time left is 05:17:44.
3/9/20 3021: Obj=-0.102615 Sharpe=-0.102615 PnL=-259.2 dd=-2357.7 num=1616 %=50.5569
Compute host=Slartibartfast svm-gamma=-8 svm-penalty=7 . Time left is 05:17:22.
3/10/20 3018: Obj=-1.57754 Sharpe=-1.57754 PnL=-4522.9 dd=-4974.1 num=1618 %=48.8257
Compute host=Slartibartfast svm-gamma=-4 svm-penalty=3 . Time left is 05:17:21.
3/11/20 3043: Obj=-0.577919 Sharpe=-0.577919 PnL=-1692.1 dd=-2551.6 num=1596 %=48.7469
Compute host=Slartibartfast svm-gamma=-1 svm-penalty=4 . Time left is 05:17:01.
3/12/20 3022: Obj=-1.028 Sharpe=-1.028 PnL=-2684.7 dd=-5193.9 num=1378 %=48.4761
Compute host=Slartibartfast svm-gamma=0 svm-penalty=4 . Time left is 05:16:59.
3/13/20 3009: Obj=-1.21359 Sharpe=-1.21359 PnL=-3082.6 dd=-4291.5 num=1618 %=48.7021
Compute host=Slartibartfast svm-gamma=-8 svm-penalty=12 . Time left is 05:16:48.
3/14/20 3044: Obj=-0.0356163 Sharpe=-0.0356163 PnL=-83 dd=-3019.5 num=1619 %=50.8956
Compute host=Slartibartfast svm-gamma=-6 svm-penalty=3 . Time left is 05:14:11.
3/15/20 3052: Obj=-0.365059 Sharpe=-0.365059 PnL=-1207.7 dd=-4145.4 num=1600 %=51.625
Compute host=Slartibartfast svm-gamma=-5 svm-penalty=0 . Time left is 05:00:50.
3/16/20 3046: Obj=-0.469054 Sharpe=-0.469054 PnL=-1470.5 dd=-4545.9 num=1616 %=49.1337
Compute host=Slartibartfast svm-gamma=-2 svm-penalty=3 . Time left is 05:00:43.
3/17/20 3048: Obj=-1.57754 Sharpe=-1.57754 PnL=-4522.9 dd=-4974.1 num=1618 %=48.8257
Compute host=Slartibartfast svm-gamma=-4 svm-penalty=3 . Time left is 05:00:30.
3/18/20 3062: Obj=-0.474801 Sharpe=-0.474801 PnL=-1489 dd=-4545.9 num=1616 %=49.1337
Compute host=Slartibartfast svm-gamma=-2 svm-penalty=5 . Time left is 05:00:16.
3/19/20 3056: Obj=-0.469054 Sharpe=-0.469054 PnL=-1470.5 dd=-4545.9 num=1616 %=49.1337
Compute host=Slartibartfast svm-gamma=-2 svm-penalty=3 . Time left is 05:00:15.
3/20/20 3045: Obj=-1.21359 Sharpe=-1.21359 PnL=-3082.6 dd=-4291.5 num=1618 %=48.7021
Compute host=Slartibartfast svm-gamma=-8 svm-penalty=12 . Time left is 04:53:48.
Received enough results (20) for generation 3
All jobs in cluster 41 have been marked for removal
***********************************
Best 20 results for generation 3
***********************************
2024: Obj=1.04909 Sharpe=1.04909 PnL=2929.9 dd=-1373 num=1619 %=53.7986
Compute host=Slartibartfast svm-gamma=-8 svm-penalty=3
2025: Obj=1.04909 Sharpe=1.04909 PnL=2929.9 dd=-1373 num=1619 %=53.7986
Compute host=Slartibartfast svm-gamma=-8 svm-penalty=3
1002: Obj=0.167274 Sharpe=0.167274 PnL=298.4 dd=-1526.8 num=1620 %=60.1852
Compute host=Slartibartfast svm-gamma=-8 svm-penalty=5
1003: Obj=0.0850919 Sharpe=0.0850919 PnL=170.2 dd=-1966.8 num=1620 %=60.6173
Compute host=Slartibartfast svm-gamma=-8 svm-penalty=4
3044: Obj=-0.0356163 Sharpe=-0.0356163 PnL=-83 dd=-3019.5 num=1619 %=50.8956
Compute host=Slartibartfast svm-gamma=-6 svm-penalty=3
3003: Obj=-0.102615 Sharpe=-0.102615 PnL=-259.2 dd=-2357.7 num=1616 %=50.5569
Compute host=Slartibartfast svm-gamma=-8 svm-penalty=7
3021: Obj=-0.102615 Sharpe=-0.102615 PnL=-259.2 dd=-2357.7 num=1616 %=50.5569
Compute host=Slartibartfast svm-gamma=-8 svm-penalty=7
3038: Obj=-0.365059 Sharpe=-0.365059 PnL=-1207.7 dd=-4145.4 num=1600 %=51.625
Compute host=Slartibartfast svm-gamma=-5 svm-penalty=0
3052: Obj=-0.365059 Sharpe=-0.365059 PnL=-1207.7 dd=-4145.4 num=1600 %=51.625
Compute host=Slartibartfast svm-gamma=-5 svm-penalty=0
3046: Obj=-0.469054 Sharpe=-0.469054 PnL=-1470.5 dd=-4545.9 num=1616 %=49.1337


Compute host=Slartibartfast svm-gamma=-2 svm-penalty=3


3056: Obj=-0.469054 Sharpe=-0.469054 PnL=-1470.5 dd=-4545.9 num=1616 %=49.1337
Compute host=Slartibartfast svm-gamma=-2 svm-penalty=3
2002: Obj=-0.474801 Sharpe=-0.474801 PnL=-1489 dd=-4545.9 num=1616 %=49.1337
Compute host=Slartibartfast svm-gamma=-2 svm-penalty=12
2034: Obj=-0.474801 Sharpe=-0.474801 PnL=-1489 dd=-4545.9 num=1616 %=49.1337
Compute host=Slartibartfast svm-gamma=-2 svm-penalty=8
3006: Obj=-0.474801 Sharpe=-0.474801 PnL=-1489 dd=-4545.9 num=1616 %=49.1337
Compute host=Slartibartfast svm-gamma=-2 svm-penalty=4
3062: Obj=-0.474801 Sharpe=-0.474801 PnL=-1489 dd=-4545.9 num=1616 %=49.1337
Compute host=Slartibartfast svm-gamma=-2 svm-penalty=5
2036: Obj=-0.508462 Sharpe=-0.508462 PnL=-1231.8 dd=-3003.5 num=1620 %=49.6296
Compute host=Slartibartfast svm-gamma=-8 svm-penalty=8
2018: Obj=-0.570008 Sharpe=-0.570008 PnL=-1397.2 dd=-2994.2 num=1618 %=50.1236
Compute host=Slartibartfast svm-gamma=-6 svm-penalty=4
2031: Obj=-0.577919 Sharpe=-0.577919 PnL=-1692.1 dd=-2551.6 num=1596 %=48.7469
Compute host=Slartibartfast svm-gamma=-1 svm-penalty=3
3043: Obj=-0.577919 Sharpe=-0.577919 PnL=-1692.1 dd=-2551.6 num=1596 %=48.7469
Compute host=Slartibartfast svm-gamma=-1 svm-penalty=4
1004: Obj=-0.630181 Sharpe=-0.630181 PnL=-1419.7 dd=-2004.3 num=1616 %=57.5495
Compute host=Slartibartfast svm-gamma=-1 svm-penalty=5

6.2.1 Database

A copy of the database must exist on all machines in the cluster in an identical location. Normally
the database(s) are in C:\FX Database and this should be copied to all machines in the cluster.
This is so that the genetic algorithm does not need to send a copy of the data to each node
(there would be 8 copies of the same data on a single machine with 8 cores).

6.3 Genetic Algorithm Results

While the genetic algorithm is running, results are accumulated in the directory given on the command line. In the above example this is C:\configs\EURUSD MA TEST. A separate directory is created for each generation, named generation-1, generation-2, etc. For each individual genome a results.zip file is created containing the generated configuration and all output files. This filename is prefixed with the genome-id.
In the directory containing the configuration template, a file genetic-algo-cache.xml is created and updated each time a backtest completes on a Condor node. This contains the genome-id along with a summary of results and a list of values assigned to the parameters that the genetic algorithm is optimising. A sample of this file is given below. The best results are always at the top. This file is also used as a save point in the event that the genetic algorithm is interrupted.

6.4 Using Recorded Results

If use-recorded-results is set to True in the backtest configuration section, you must ensure that a file named recorded.signals.csv is in the same directory as the configuration. This file is generated by a backtest as explained in section 5.2.

6.5 The Condor Submit File

Condor operates on submit files. These are plain text files that list the jobs to be run on the
cluster. DeepThought generates submit files for each generation. These are created in the same
directory as the genetic algorithm configuration file. You should not normally need to view


these files, and altering them will have no effect as they are always generated by the genetic
algorithm.

6.6 Trouble Shooting

Most problems occur because of a problem with the configuration file. First check the backtest.log file for errors. Other things to check are dates: are the backtest start/stop dates contained within the data? Also check database filenames, and that the database exists in the same directory on all machines in the cluster and is populated. Normally the database(s) are in C:\FX Database and this should be copied to all machines in the cluster.
The genetic algorithm is slightly harder to debug as there is less direct access to what is happening, and there is a reliance on a third-party component (Condor). In the results.zip file of a genome, located in the generation-n directory for generation n, check the backtest.log file for errors. If it is empty, or the problem is not evident, try directly backtesting the configuration file.
If you are using a multi-machine cluster, try disabling firewalls and other things that may prevent network access. You can also check the Condor log file. Although this tends to be a little cryptic, it may provide clues about where to start looking.

Chapter 7

Live and Paper Trading


Once you have been through the research and development process and have found a configuration that you are happy with, the next stage is to paper trade. We strongly suggest doing this before live trading to ensure that paper trading results match (in a statistical sense) your backtested results.
The process for live and paper trading is identical, except that in paper trading orders are not placed in a live market.

7.1 Manual Trading

DeepThought can be traded manually. One use of manual trading is for end-of-day systems
where it is feasible for a human to make every trade manually.
One challenge with manual trading is populating the database. To do this we suggest using
the provided Metatrader EA as described below, but setting the parameter do-live-trade to
false. This will populate the database while placing no trades. The EA can be left running as
it is possible for more than one program to access the database at any one time.
Manual trading is done with two commands. The first (--manual-trade-train-and-persist) will train all models in the configuration and save them in the configuration directory. The command below is an example of manually training from the configuration in C:\DeepThought Configs\EURUSD Strategy 1:
deepthought --manual-trade-train-and-persist C:\DeepThought Configs\EURUSD Strategy 1
The output should simply show Ok if the training could be done. If not, check the log file in the configuration directory for hints on what went wrong. Once the models have been trained the
forecasts can be generated using the --manual-trade-generate-signal option:
deepthought --manual-trade-generate-signal C:\DeepThought Configs\EURUSD Strategy 1
The output will be similar to the following:
DeepThought built on Jan 7 2014 at 16:48:35
BUY
Consensus=25
NumberOfPredictors=45
The output is formatted in this way to make it easy for other scripts to parse the output if
DeepThought manual signals form part of a larger trading strategy. When generating signals
manually the normal sequence of events should be:
1. Run --manual-trade-train-and-persist to generate the initial model.
2. Wait for the candle to complete on the time-frame that you are forecasting on.
3. When the candle completes run --manual-trade-generate-signal and act on the forecast.
4. After the forecast has been processed, re-run --manual-trade-train-and-persist to
re-generate the models on the latest data.
The events are sequenced in this way as forecasting can take a while if you are using a large ensemble. Using the sequence above, there is as much time as it takes a candle to complete in which to train the models.
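As an illustration only, the sketch below shows one way another script could run the signal command and parse this output. The configuration path is the example used above, and the assumption that a SELL signal is printed in the same single-word form as BUY is ours; adapt both to your own setup.

import subprocess

# Example configuration directory (from the command shown above); assumes the
# deepthought executable is on the PATH.
config_dir = r"C:\DeepThought Configs\EURUSD Strategy 1"

# Run the manual signal generation and capture the console output.
output = subprocess.check_output(
    ["deepthought", "--manual-trade-generate-signal", config_dir]).decode("utf-8")

signal = None
fields = {}
for line in output.splitlines():
    line = line.strip()
    if line in ("BUY", "SELL"):          # the one-word signal line
        signal = line
    elif "=" in line:                     # e.g. Consensus=25, NumberOfPredictors=45
        name, _, value = line.partition("=")
        fields[name] = value

print(signal)
print(fields.get("Consensus"))
print(fields.get("NumberOfPredictors"))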

7.2 Automated Trading

DeepThought is able to auto-trade with Metatrader 4 using the supplied Expert Advisor. Links to other trading platforms will be added over time. Please contact us with a request to create a link to the platform you are using (if not Metatrader); the more requests a platform receives, the higher its implementation priority will become.
An Expert Advisor (EA), Metatrader's term for an automated trading script, is provided which accesses the DeepThought DLL. The source of the EA is provided so you can add your own trading logic to the signals generated by DeepThought; it performs only basic trading of those signals and is intended as a starting point for your own trading rules. You can backtest any trading logic using recorded signals following the process described in section 5.2 on page 12.
The EA, named DeepThought.mq4, is in the Metatrader directory of the DeepThought installation. It must be placed in the experts directory where Metatrader is installed. For example, if your broker was InterbankFX this directory would be
C:\Program Files (x86)\InterbankFX\experts
The DLL, named DeepThought.Dll, located in the DeepThought installation directory, needs to be copied to the experts\library directory where Metatrader is installed. For example, if your broker was InterbankFX this directory would be
C:\Program Files (x86)\InterbankFX\experts\library
When you next start Metatrader the DeepThought EA should be available in the Experts folder in Metatrader. Add it to a chart in the normal way. It can be added to any time-frame, as ticks are used to generate candles; however, we suggest adding it to the 1 minute time frame.
You also need to place the licence file you received when purchasing DeepThought in the
same directory as the Metatrader executable. For example if your broker was InterbankFX this
directory would be
C:\Program Files (x86)\InterbankFX\
If you are trading several instruments, we suggest a separate Metatrader instance for each
instrument as Metatrader will likely crash when loading the same DLL into more than one EA.


To create a new instance of Metatrader, simply copy the Metatrader installation to another
location so each instance has a completely separate set of files.
Table 7.1: DeepThought parameters for the Metatrader EA.
files location (string)
    The directory where the XML configuration is located.

gmt offset (int)
    The hour offset from GMT of your broker. If your historical data is in UTC (i.e. GMT) time then you will need an offset to ensure there are no gaps in the data caused by timezone changes.

max trade duration seconds (int)
    Automatically close trades after this many seconds. Set to 0 to leave trades open (they will close with an opposite signal).

deep thought db (string, default EURUSDm1)
    The identifier of the 1 minute bar-series in DeepThought that price ticks will be sent to, to build 1 minute candles.

trade lot size (double, default 0.1)
    The trade size in lots.

do limit orders (bool, default true)
    Use limit orders. If set to false, market orders will be used.

limit order offset (double, default 0.0002)
    The price offset to use for placing limit orders.

magic number (int, default 1600)
    The number that Metatrader inserts with trade info. Enables you to track which trades came from what system if you are running multiple systems.

do live trade (bool, default false)
    Set to true for live trading; set to false for paper trading.

add to position (bool, default true)
    If set to true, new positions will be added to existing positions, sometimes known as pyramiding.

7.3 Trouble Shooting

Metatrader can be unstable when it is working with external DLLs. It can be particularly bad
when changing parameters in the EA and a Metatrader crash is unfortunately all too common.
We hope that these stability issues will be fixed in Metatrader 5.
If you are having problems changing parameters in the DeepThought EA, follow these steps:
1. Save the EA parameters.
2. Delete the EA from the chart.
3. Exit Metatrader.
4. Open the Windows Task Manager and check whether Terminal is still running. If it is, highlight it and click End Process.
5. Restart Metatrader.
6. Add the EA back to the chart, load the parameters saved in step 1 and make the changes.
This may seem a bit odd, but since the stability of Metatrader is beyond our control this is all we can offer. If Metatrader still crashes, a reboot of the computer is probably required.

Chapter 8

Python Scripting
DeepThought uses the Python language for scripting. There are no restrictions on the Python scripts. DeepThought uses the Python system installed on your PC, so you are able to use whatever libraries and modules (e.g. scipy, numpy, pandas, etc.) you require. When a function is called from DeepThought, an interface object is passed to your script, enabling it to access elements in DeepThought such as candle data and to pass back values such as the forecast or training label value.
The use of embedded Python enables unlimited customisation in the following areas:
1. Features - custom features from any datasource accessible to your Python scripts.
2. Target and trigger - define a trigger for when forecasts are made and define the target that the predictors are forecasting.
3. Predictor - works with the built-in machine learning predictors, or supply your own. A predictor does not need to be machine learning based, essentially allowing DeepThought to be used as a standard algorithmic platform.
4. Signal Generator - combine forecasts from the predictors to produce buy/sell signals.
The use of Python is optional and it is entirely possible to produce a working system without
the use of Python.
There is a <python> section in the configuration file.
This contains one or more
<script-filename> entries. Each script can contain one or more functions, or all functions
can be in a single script file. If you are using multiple script files, they all operate in the same
namespace. An example <python> section is given below.
<python>
  <script-filename>target_num_pips.py</script-filename>
</python>

Your Python scripts reside in the same directory as the config.xml configuration file.

8.1 Python Installation

DeepThought uses Python version 2.7. It uses the 32-bit version on Windows. The installation process installs an optional Python distribution, MiniConda. This is a cut-down version of the
free Anaconda distribution available at http://store.continuum.io/cshop/anaconda/. You


can bypass the installation provided with DeepThought and install the full Anaconda distribution, or install another distribution. The only requirement is that it is version 2.7 32-bit
(Windows). The Linux and MacOS versions of DeepThought use 64-bit. We use the 2.7 version
rather than the 3.3 version as most large distributions (e.g. Anaconda, Python(x,y)) are still
based on 2.7.
We strongly recommend using the Numpy and Pandas libraries for numerical and time-series processing and Matplotlib for visualisation. These are installed by default with Anaconda and, if you are using Miniconda, can be installed with the following at a command prompt:
conda install pandas
conda install matplotlib

The Numpy library will be installed automatically as both Pandas and Matplotlib depend on it.
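As a quick check that the interpreter DeepThought will use matches these requirements, you can run a few lines of Python from a command prompt. This is only a sketch; the expected values are the ones described above.

import sys
import struct

print(sys.version)                  # expect a 2.7.x release
print(struct.calcsize("P") * 8)     # pointer size in bits: 32 on Windows, 64 on Linux/MacOS

# Confirm the recommended libraries are importable.
import numpy
import pandas
import matplotlib
print(numpy.__version__, pandas.__version__, matplotlib.__version__)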

8.2 Python Feature

A feature comprises one or more numerical values (attributes). You can have as many Python-generated features as you wish. Each feature must be generated using a unique function name.
To add a Python-generated feature to your model, add a feature of type python-script. A complete example configuration is given below.
<config>
  <bar-series>
    <identifier>EURUSDm1</identifier>
    <bar-series-type>const-time</bar-series-type>
    <source type="database">eurusd.db</source>
    <price-to-pip-multiplier>10000.0</price-to-pip-multiplier>
    <average-spread>0.0</average-spread>
    <bar-duration-minutes>1</bar-duration-minutes>
    <const-bar-price>0.0</const-bar-price>
  </bar-series>
  <bar-series>
    <identifier>EURUSDh4</identifier>
    <bar-series-type>const-time</bar-series-type>
    <source type="bar-series">EURUSDm1</source>
    <history-source-type>bar-series</history-source-type>
    <price-to-pip-multiplier>10000.0</price-to-pip-multiplier>
    <average-spread>0.0</average-spread>
    <bar-duration-minutes>240</bar-duration-minutes>
    <delay-minutes-offset>0</delay-minutes-offset>
  </bar-series>
  <bar-series-collection>
    <data-file-dir>C:\FX_Database</data-file-dir>
  </bar-series-collection>
  <python>
    <script-filename>ema_diff_feature.py</script-filename>
  </python>
  <model>
    <identifier>h4-features</identifier>
    <target>
      <type>bars-in-future</type>
      <identifier>target-1-bar-in-future</identifier>
      <bar-series>EURUSDh4</bar-series>
      <number>1</number>
      <price-type>up-down</price-type>
    </target>
    <feature>
      <type>hour-of-day</type>
      <period ga-subst="time-of-day">h4</period>
    </feature>
    <feature>
      <type>python-script</type>
      <set-parameter-value-func-name>SetParameterValue</set-parameter-value-func-name>
      <get-number-of-attributes-func-name>GetNumberOfAttributes</get-number-of-attributes-func-name>
      <get-features-func-name>GetFeatures</get-features-func-name>
      <parameter name="ma_short_period" type="int">20</parameter>
      <parameter name="ma_long_period" type="int">50</parameter>
      <identifier>python-test-1</identifier>
      <scale-type>min-max</scale-type>
    </feature>
    <feature>
      <type>bar-attribute</type>
      <attribute-type>average-close</attribute-type>
      <number ga-subst="average-close-num">30</number>
      <value-type>diff</value-type>
      <bar-series>EURUSDh4</bar-series>
      <scale-type>min-max</scale-type>
    </feature>
  </model>
  <svm-predictor>
    <identifier>svm-c-rbf</identifier>
    <model>h4-features</model>
    <continuous-tune>false</continuous-tune>
    <params> <!-- 56.1% -->
      <penalty ga-subst="svm-penalty">512</penalty>
      <gamma ga-subst="svm-gamma">0.0625</gamma>
      <forecast-weight>1.0</forecast-weight>
      <svm-type>SVC</svm-type>
      <kernel>rbf</kernel>
    </params>
    <num-training-observations>500</num-training-observations>
    <num-training-skip>1</num-training-skip>
  </svm-predictor>
  <predictor-ensemble>
    <retrain-period>Weekly</retrain-period>
  </predictor-ensemble>
  <signal-generator>
    <entry-times>
      <hour>all</hour>
      <day-of-week>all</day-of-week>
    </entry-times>
    <target-trigger>h4-features</target-trigger>
    <entry-threshold>0.0</entry-threshold>
    <take-profit>0.0</take-profit>
    <stop-loss>0.0</stop-loss>
    <break-even>0.0</break-even>
    <trade-bar-series>EURUSDm1</trade-bar-series>
    <reverse-all>False</reverse-all>
  </signal-generator>
  <trader>
    <hold-minutes>0</hold-minutes>
    <hold-bars>0</hold-bars>
    <max-drawdown>100000</max-drawdown>
    <close-at-weekend>False</close-at-weekend>
    <scale-out>False</scale-out>
    <max-position>100</max-position>
    <limit-orders offset="0.0">False</limit-orders>
  </trader>
  <backtest>
    <start-date>2013-01-01</start-date>
    <stop-date>2014-01-01</stop-date>
    <use-recorded-signals>False</use-recorded-signals>
    <display-progress>True</display-progress>
    <execute-when-complete>python C:\DeepThought\python\analyse_backtest_results.py %CONFIG_LOCATION%</execute-when-complete>
  </backtest>
</config>

Listing 8.1: Python feature configuration.


Here we have defined three functions:


SetParameterValue
We can set parameters in the configuration file which will be passed to this function once on
spin-up. These parameters can be controlled by the Genetic Algorithm described in Chapter
6 on page 14. The parameters are defined in the configuration using <parameter> entries as
shown in the example above.
GetNumberOfAttributes
A function that returns the number of attributes that make up the feature.
GetFeatures
A function that is responsible for generating the actual numerical attributes. A DeepThought
interface object is provided to this function to pass the attributes back to DeepThought.
An example script that implements the above functions is given below.
import pandas as pd
import numpy as np
import sys

ma_short_period = None # int
ma_long_period = None # int
number_of_diffs = 30
num_required_candles = None

def ExpMovingAverage(values, period):
    weights = np.exp(np.linspace(-1., 0., period))
    weights /= weights.sum()
    ema = np.convolve(values, weights)[:len(values)]
    ema[:period] = ema[period]
    return ema

def GetNumberOfAttributes(deep_thought_intf):
    deep_thought_intf.SetNumAttributes(2)

def SetParameterValue(param_name, param_value):
    global ma_short_period
    global ma_long_period
    global num_required_candles
    if param_name == "ma_short_period":
        ma_short_period = param_value
    elif param_name == "ma_long_period":
        ma_long_period = param_value
        num_required_candles = ma_long_period + number_of_diffs + 2
    else:
        print("Unknown parameter:", param_name)

def GetFeatures(deep_thought_intf):
    if (ma_short_period == None):
        print("Error: ma_short_period has not been set!")
        return -1

    csv_file_name = deep_thought_intf.GetLastBars(num_required_candles, "EURUSDh4")
    candles = pd.read_csv(csv_file_name, index_col=False)

    if len(candles.index) < num_required_candles:
        return -1

    close_values = candles['close'].values
    reversed_close_values = close_values[::-1]

    ema_short = ExpMovingAverage(reversed_close_values, ma_short_period)
    ema_long = ExpMovingAverage(reversed_close_values, ma_long_period)

    for i in range(1, number_of_diffs, 1):
        deep_thought_intf.SetAttribute(i-1, ema_short[-i] - ema_long[-i])

Listing 8.2: Python script example defining a feature.


This script uses the Python library pandas, which provides data analysis functions including a data-frame, and numpy for numeric analysis. An interface object deep_thought_intf is passed to the GetNumberOfAttributes() and GetFeatures() functions. This interface is the mechanism for passing data back and forth between DeepThought and your scripts.
The methods of the deep_thought_intf interface object are detailed in table 8.1 on page 34.

8.3 Python Target

The target script has two functions: to detect a forecast trigger and to label a training instance with a target. Detecting a forecast trigger can be as simple as forecasting each 4-hourly bar, or more complex, such as only forecasting when a pair of moving averages has crossed. If a trigger has been detected, your GetIsTargetTrigger() function returns True and a training sample is created. If the criteria have not been met, your script returns False.
Your script must also supply a GetTarget() function. This function is passed the candle at observation time for a sample where the target trigger was met, along with the current candle. Your function can then compare the two and decide whether the target criteria have been met.
The following examples should make this a little clearer.
This example is the setup for a system that forecasts whether a 20 pip target will be hit first by price moving up or price moving down. A new forecast is created every four hours, so that every four hours this system will enter a trade with a target of +20 pips for an up forecast and -20 pips for a down forecast. To do this the scripts use the 1 minute candles. The script is set up in the config in the <python> section. The target is configured in the <model> section. More detail on configuration is given in chapter 9 on page 35. Detail on the <model> configuration section is on page 41.
<python>
  <script-filename>target_num_pips.py</script-filename>
</python>
<model>
  <identifier>h4-features</identifier>
  <target>
    <type>python-script</type>
    <identifier>target-next-pip-movement</identifier>
    <bar-series>EURUSDm1</bar-series>
    <parameter name="pip-movement" type="double">20.0</parameter>
    <check-target-trigger-func-name>GetIsTargetTrigger</check-target-trigger-func-name>
    <get-target-func-name>GetTarget</get-target-func-name>
    <set-parameter-value-func-name>TargetSetParameterValue</set-parameter-value-func-name>
  </target>
  <feature>
    <type>hour-of-day</type>
    <period ga-subst="time-of-day">h4</period>
  </feature>
  <feature>
    <type>bar-attribute</type>
    <attribute-type>average-close</attribute-type>
    <number ga-subst="average-close-num">30</number>
    <value-type>diff</value-type>
    <bar-series>EURUSDh4</bar-series>
    <scale-type>min-max</scale-type>
  </feature>
  <feature>
    <type>moving-average</type>
    <ma-attribute-type>average-close</ma-attribute-type>
    <period>5</period>
    <number ga-subst="average-close-num">30</number>
    <selection-list>1,2,3,4,5,7,9,13,16,20,25,31,45,55,70,100</selection-list>
    <bar-series>EURUSDh4</bar-series>
    <scale-type>min-max</scale-type>
  </feature>
  <feature>
    <type>moving-average</type>
    <ma-attribute-type>average-close</ma-attribute-type>
    <period>10</period>
    <number ga-subst="average-close-num">30</number>
    <selection-list>1,2,3,4,5,7,9,13,16,20,25,31,45,55,70,100</selection-list>
    <bar-series>EURUSDh4</bar-series>
    <scale-type>min-max</scale-type>
  </feature>
  <feature>
    <type>moving-average</type>
    <ma-attribute-type>average-close</ma-attribute-type>
    <period>20</period>
    <number ga-subst="average-close-num">30</number>
    <selection-list>1,2,3,4,5,7,9,13,16,20,25,31,45,55,70,100</selection-list>
    <bar-series>EURUSDh4</bar-series>
    <scale-type>min-max</scale-type>
  </feature>
  <feature>
    <type>moving-average</type>
    <ma-attribute-type>average-close</ma-attribute-type>
    <period>50</period>
    <number ga-subst="average-close-num">30</number>
    <selection-list>1,2,3,4,5,7,9,13,16,20,25,31,45,55,70,100</selection-list>
    <bar-series>EURUSDh4</bar-series>
    <scale-type>min-max</scale-type>
  </feature>
  <feature>
    <type>moving-average</type>
    <ma-attribute-type>average-close</ma-attribute-type>
    <period>100</period>
    <number ga-subst="average-close-num">30</number>
    <selection-list>1,2,3,4,5,7,9,13,16,20,25,31,45,55,70,100</selection-list>
    <bar-series>EURUSDh4</bar-series>
    <scale-type>min-max</scale-type>
  </feature>
</model>

Listing 8.3: Python target configuration.


The following is the listing of the target_num_pips.py script defined in the configuration file. We have defined a single parameter pip-movement in the configuration. This is passed to the script using the TargetSetParameterValue() function when DeepThought starts up.
The function GetIsTargetTrigger() tests to see if the trigger criteria have been reached. As this script is called every time a 1 minute candle completes (set up in the config using the <bar-series> element in the <target> section), the script must check to see if a four hour candle has just completed.
The function GetTarget() sets the target to either -1.0 or 1.0 if a target has been reached. If no target has been reached, a target is not set.
import pandas as pd
import numpy as np
import sys
from datetime import datetime

num_pips = None # double
last_close_datetime = "none"

def TargetSetParameterValue(param_name, param_value):
    if param_name == "pip-movement":
        global num_pips
        num_pips = param_value/10000.0

def GetIsTargetTrigger(deep_thought_intf, latest_candle):
    # Check if we are at the close of an H4 candle. We need to do it this way as we are
    # triggering from M1 candles.
    csv_file_name = deep_thought_intf.GetLastBars(2, "EURUSDh4")
    candles = pd.read_csv(csv_file_name, index_col=False)
    if len(candles.index) < 1:
        return False
    candle_close_datetime = candles.iloc[0]['close_date_time']
    global last_close_datetime
    if (last_close_datetime != candle_close_datetime):
        last_close_datetime = candle_close_datetime
        return True
    return False

def GetTarget(deep_thought_intf, candle_at_observation, latest_candle):
    if (latest_candle.ClosePrice() - candle_at_observation.ClosePrice() >= num_pips):
        deep_thought_intf.SetTarget(1.0)
    elif (latest_candle.ClosePrice() - candle_at_observation.ClosePrice() <= -num_pips):
        deep_thought_intf.SetTarget(-1.0)

Listing 8.4: Python script example defining a target.

8.4 Python Predictor

A predictor is a component that takes training data, performs training, builds a model and, when given an unlabelled example, performs a forecast. There are built-in predictors in DeepThought which implement algorithms such as Support Vector Machines, Gradient Boosted Trees, Neural Networks, etc. You can use these in conjunction with your own predictor, or simply use your own predictor(s) by themselves.
Your predictor does not have to implement a machine learning algorithm. You could, for example, use a simple moving average cross as a predictor. You could also use Python's sklearn library (http://scikit-learn.org) to use other machine learning algorithms.
A complete configuration file using a single Python predictor is given below, followed by the Python script predictor.py. This example implements a moving-average cross predictor. It will trade for one bar when a fast moving average crosses a slow moving average. No machine learning is used; the example is for demonstration purposes only and we don't recommend the use of a moving average cross by itself. In the example below, the model does not require any features as the Python script only requires the bar-series data (close prices), which it gets from the deep_thought_intf object.
<config>
  <bar-series>
    <identifier>EURUSDm1</identifier>
    <bar-series-type>const-time</bar-series-type>
    <source type="database">eurusd.db</source>
    <load-from-date>2012-06-01</load-from-date>
    <price-to-pip-multiplier>10000.0</price-to-pip-multiplier>
    <average-spread>0.0</average-spread>
    <bar-duration-minutes>1</bar-duration-minutes>
    <const-bar-price>0.0</const-bar-price>
  </bar-series>
  <bar-series>
    <identifier>EURUSDh4</identifier>
    <bar-series-type>const-time</bar-series-type>
    <source type="bar-series">EURUSDm1</source>
    <price-to-pip-multiplier>10000.0</price-to-pip-multiplier>
    <average-spread>0.0</average-spread>
    <bar-duration-minutes>240</bar-duration-minutes>
    <delay-minutes-offset>0</delay-minutes-offset>
  </bar-series>
  <bar-series-collection>
    <data-file-dir>C:\FX_Database</data-file-dir>
  </bar-series-collection>
  <python>
    <script-filename>predictor.py</script-filename>
  </python>
  <model>
    <identifier>h4-features</identifier>
    <target>
      <type>bars-in-future</type>
      <identifier>target-1-bar-in-future</identifier>
      <bar-series>EURUSDh4</bar-series>
      <number>1</number>
      <price-type>up-down</price-type>
    </target>
  </model>
  <python-predictor>
    <model>h4-features</model>
    <identifier>python-predictor-h4</identifier>
    <predictor-weight>1.0</predictor-weight>
    <set-parameter-value-func-name>SetParameterValue</set-parameter-value-func-name>
    <predict-func>Predict</predict-func>
    <train-func>Train</train-func>
    <parameter name="ma-long" type="int">20</parameter>
    <parameter name="ma-short" type="int">5</parameter>
    <num-training-observations>25</num-training-observations>
    <num-training-skip>1</num-training-skip>
  </python-predictor>
  <predictor-ensemble>
    <retrain-each-bar>True</retrain-each-bar>
  </predictor-ensemble>
  <signal-generator>
    <entry-times>
      <hour>all</hour>
      <day-of-week>all</day-of-week>
    </entry-times>
    <target-trigger>h4-features</target-trigger>
    <entry-threshold>0.0</entry-threshold>
    <take-profit>0.0</take-profit>
    <stop-loss>0.0</stop-loss>
    <break-even>0.0</break-even>
    <exit-all-hour>-1</exit-all-hour>
    <trade-bar-series>EURUSDm1</trade-bar-series>
    <reverse-all>False</reverse-all>
  </signal-generator>
  <trader>
    <hold-minutes>0</hold-minutes>
    <hold-bars>0</hold-bars>
    <max-drawdown>100000</max-drawdown>
    <close-at-weekend>False</close-at-weekend>
    <scale-out>False</scale-out>
    <max-position>100</max-position>
    <limit-orders offset="0.0">False</limit-orders>
  </trader>
  <backtest>
    <start-date>2013-01-01</start-date>
    <stop-date>2014-01-01</stop-date>
    <use-recorded-signals>False</use-recorded-signals>
    <display-progress>True</display-progress>
  </backtest>
</config>

Listing 8.5: Python predictor configuration example.


import numpy as np
import pandas as pd

# global variables
ma_long_period = None #int
ma_short_period = None #int

def ExpMovingAverage(values, period):
    weights = np.exp(np.linspace(-1., 0., period))
    weights /= weights.sum()
    ema = np.convolve(values, weights)[:len(values)]
    ema[:period] = ema[period]
    return ema

def Sign(value):
    if (value >= 0):
        return (1)
    else:
        return (-1)

def SetParameterValue(param_name, param_value):
    global ma_long_period
    global ma_short_period
    if param_name == "ma-long":
        ma_long_period = param_value
    if param_name == "ma-short":
        ma_short_period = param_value

def Train(deep_thought_intf, training_csv):
    # This example does not need to train anything but we could use
    # the following line to read a training set into a Pandas data frame:
    # training_df = pd.read_csv(training_csv, index_col=False)
    return True

def Predict(deep_thought_intf, attributes_csv):
    # Get an array of close prices
    csv_file_name = deep_thought_intf.GetLastBars(ma_long_period + 4, "EURUSDh4")
    candles = pd.read_csv(csv_file_name, index_col=False)
    close_values = candles['close'].values
    reversed_close_values = close_values[::-1]

    # As numpy convolve (moving average) calculates from lowest index to highest,
    # we must reverse the array of values as a bar series has the most recent
    # values with the lowest index and we want to compute the moving average
    # moving forward in time (i.e. from the back of the array of close prices
    # forwards).
    ema_short = ExpMovingAverage(reversed_close_values, ma_short_period)
    ema_long = ExpMovingAverage(reversed_close_values, ma_long_period)

    # Calculate the difference between the moving averages at the most
    # recent candle, and the one before that
    diff_current = ema_short[-1] - ema_long[-1]
    diff_previous = ema_short[-2] - ema_long[-2]

    # Look for a cross. If we find one, predict in the direction of
    # the cross.
    if Sign(diff_current) != Sign(diff_previous):
        deep_thought_intf.SetForecast(Sign(diff_current))
    else:
        # Not strictly necessary as the forecast defaults to 0 if
        # not set, but set here for completeness.
        deep_thought_intf.SetForecast(0)

Listing 8.6: Python script example for a predictor.

8.5 Python Signal Generation

The signal generator is the component that transforms the forecasts produced by the predictors
into buy and sell signals. It also controls trading parameters such as take profit and stop loss.
More detail on the signal generator can be found in section 9.8 on page 70.
As there is only one signal generator, you only need to provide optional Python function names
to the signal-generator component. An example is given below.
<python>
  <script-filename>signal_generator.py</script-filename>
</python>
<signal-generator>
  <entry-times>
    <hour>all</hour>
    <day-of-week>all</day-of-week>
  </entry-times>
  <target-trigger>h4-features</target-trigger>
  <set-parameter-value-func-name>SetParameterValue</set-parameter-value-func-name>
  <combine-forecasts-func-name>CombineForecasts</combine-forecasts-func-name>
  <parameter name="threshold" type="double">20.0</parameter>
  <trade-bar-series>EURUSDm1</trade-bar-series>
</signal-generator>

Listing 8.7: Python signal generator configuration example.


# Simple demonstration of the signal generator calling a Python function.
#
# This example simply buys/sells if the combined forecasts of the
# predictor ensemble have exceeded a threshold given by the
# "threshold" parameter in the configuration file.

import pandas as pd
import numpy as np

# Globals
threshold = None # double

# We could make these parameters in the configuration, but hard code them here
# for the moment.
take_profit = 20.0
stop_loss = 25.0

def SetParameterValue(param_name, param_value):
    if param_name == "threshold":
        global threshold
        threshold = param_value
        print("set threshold to ", threshold)

def CombineForecasts(deep_thought_intf, predictions_csv):
    # read the predictions into a Pandas dataframe
    predictions = pd.read_csv(predictions_csv)

    # Remove all limit orders and tag them in the log file as "Missed"
    deep_thought_intf.DeleteLimitOrders("Missed")
    average_forecast = predictions.forecast.mean()

    if average_forecast >= threshold:
        deep_thought_intf.SendBuyOrder("EURUSDm1", take_profit, stop_loss)
    elif average_forecast <= -threshold:
        deep_thought_intf.SendSellOrder("EURUSDm1", take_profit, stop_loss)
    else:
        deep_thought_intf.CloseAllTrades("Threshold not reached")

Listing 8.8: Python script example for the signal generator.

8.6 The deep_thought_intf Interface Object

Some of the Python functions described above are passed a deep_thought_intf object. This is an interface that is used between DeepThought and your script to pass values back and forth. It is used to set values such as targets and feature values, as well as enabling your script to access historical candles, forecasts, etc. Table 8.1 summarises the methods provided on this object.
Note that not all methods would be used in a given function. For example, the functions used to set features should not set targets; in fact, if you do this the target will be ignored. You would need to set the target in the appropriate target function.


Table 8.1: Summary of the deep_thought_intf interface object

GetLastBars(num_bars, bar_series)
    Returns the file name of a CSV file containing the last num_bars candles of the bar series bar_series.

GetNumAttributes()
    Returns the number of attributes of this feature. Set using SetNumAttributes().

SetAttribute(index, value)
    Set the value of the attribute with the given index. Indexes are zero indexed.

SetNumAttributes(num_attributes)
    Set the number of attributes of this feature.

SetTarget(value)
    Set the target value if a target has been reached when a GetTarget() function is called.

SetForecast(value)
    Set the forecast value when a Predict() function has been called.

SendBuyOrder(bar_series_id, take_profit, stop_loss)
    Send a buy order to the bar series specified by bar_series_id at the current market price. Optionally set take_profit or stop_loss to be non-zero if required.

SendSellOrder(bar_series_id, take_profit, stop_loss)
    Send a sell order to the bar series specified by bar_series_id at the current market price. Optionally set take_profit or stop_loss to be non-zero if required.

CloseAllTrades(comment)
    Close all open trades with an optional comment. The comment appears in the log file.

DeleteLimitOrders(comment)
    Remove all unfilled limit orders with an optional comment. The comment appears in the log file.

Chapter 9

Configuration Details
DeepThought is driven by XML configuration files. A GUI will be available in a future version.
The XML is divided into sections as listed in table 9.1. Some sections are required, some
are optional and some can contain their own sections. Also, some sections can have only one
definition (e.g. bar-series-collection), while others can have as many as desired, e.g. model.
Each section is detailed individually in this chapter.
Where the option value is a string, e.g. True, False, RBF, the text is case insensitive. You can check the first part of the log file to see what default values were used for missing settings.
Table 9.1 details the configuration sections.

9.1 bar-series

A bar-series is the raw data. For Forex trading, DeepThought operates by using 1 minute candles to generate longer duration bars (candles). Renko bars can also be generated. An example configuration snippet that generates 4 hour (240 minute) candles is:
<bar-series>
  <identifier>EURUSDm1</identifier>
  <bar-series-type>const-time</bar-series-type>
  <source type="database">eurusd.db</source>
  <price-to-pip-multiplier>10000.0</price-to-pip-multiplier>
  <average-spread>1.5</average-spread>
  <bar-duration-minutes>1</bar-duration-minutes>
</bar-series>
<bar-series>
  <identifier>EURUSDh4</identifier>
  <bar-series-type>const-time</bar-series-type>
  <source type="bar-series">EURUSDm1</source>
  <price-to-pip-multiplier>10000.0</price-to-pip-multiplier>
  <average-spread>0.0</average-spread>
  <bar-duration-minutes>240</bar-duration-minutes>
  <delay-minutes-offset>0</delay-minutes-offset>
</bar-series>

In the above example, a pip as defined by the broker is a price change of 0.0001. Therefore we
must multiply by 10000 to use pips in a human readable form, so a price change of 0.0005 is 5
pips. This is controlled by the price-to-pip-multiplier setting. It exists entirely to make
things easier to read for humans. When this is set, all price movements in the configuration
must be set accordingly. So a take profit of 20 pips can be set as 20 rather than 0.002. We could
set price-to-pip-multiplier to 1.0 (the default if not defined) and we would need to enter a
20 pip take profit as 0.002. All output files use this multiplier.
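As a small worked illustration of this arithmetic (the prices below are made up):

price_to_pip_multiplier = 10000.0

# A broker pip of 0.0001 means a 0.0005 price change is 5 pips.
price_change = 0.0005
print(price_change * price_to_pip_multiplier)        # roughly 5.0 pips

# With the multiplier set, a 20 pip take profit is entered as 20 in the
# configuration; with the default multiplier of 1.0 it would be entered as 0.002.
take_profit_pips = 20
print(take_profit_pips / price_to_pip_multiplier)    # 0.002 in raw price terms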


Table 9.1: Sections in the XML configuration file


bar-series
    Requires at least one. Normally there are at least two: one to define the 1 minute candles, and one to define a longer duration candle series that is used by the predictor.

bar-series-collection
    Required, and only one definition allowed. Defines parameters common to all bar-series, e.g. database locations.

python
    Optional. Only required if using Python scripting for one or more other components.

model
    Requires at least one, and can have as many as desired. You would normally have several in an ensemble learning setting. Defines the features to be used by a predictor. Different predictors can use the same model.

svm-predictor
    Optional. Defines the parameters for a Support Vector Machine predictor.

linear-svm-predictor
    Optional. Defines the parameters for a Linear Support Vector Machine predictor.

gbt-predictor
    Optional. Defines the parameters for a Gradient Boosted Tree predictor.

random-forest-predictor
    Optional. Defines the parameters for a Random Forest predictor.

extremely-randomised-trees-predictor
    Optional. Defines the parameters for an Extremely Randomised Trees predictor.

predictor-ensemble
    Required and only one definition allowed. Defines parameters common to all predictors.

signal-generator
    Required and only one definition allowed. Defines how forecasts from individual predictors are combined to create a signal (buy, sell, do nothing).

trader
    Required and only one definition allowed. Defines trading parameters such as take-profit and stop-loss.

backtest
    Only required if backtesting. Defines parameters for backtesting such as start and stop dates.

genetic-algo
    Only required if running a genetic algorithm. Parameters which control a genetic algorithm for parameter search. Detailed in chapter 6 on page 14.

Each bar-series has an identifier parameter. This is used throughout the configuration by
other sections that need to use bar series data.
The first bar-series defined has a source type parameter of database. This means it will
use the database defined in source in the directory specified in the
bar-series-collection section.
The second bar-series defined has a bar-series-type of const-time. This means it will
generate constant duration candles with the number of minutes defined by the setting
bar-duration-minutes. In this example 240 minutes is used to generate H4 candles. We could equally use 90 to generate 90 minute candles, 20 to generate 20 minute candles, etc. We are not limited to the standard candle durations. Also defined is the source type of bar-series and the source of EURUSDm1. This means
it will source its data from the previously defined bar-series. The delay-minutes-offset is
set to 0 in this example. This means that the candle series will be generated from 00:00 on the
date of the first candle in the historical set. We could use a setting of 15 for example, meaning


that the candles will be generated 15 minutes later than the previous setting.
Table 9.2 illustrates the effect of this setting using 90 minute candles. It can be used as a way of not starting candles until news announcements have been absorbed by the market.
Table 9.2: The effect of the delay-minutes-offset parameter on intraday candles.
Candle Number   Offset 0                  Offset 15
                Open Time   Close Time    Open Time   Close Time
1               00:00       01:30         00:15       01:45
2               01:30       03:00         01:45       03:15
3               03:00       04:30         03:15       04:45
4               04:30       06:00         04:45       06:15
5               06:00       07:30         06:15       07:45
6               07:30       09:00         07:45       09:15
7               09:00       10:30         09:15       10:45
8               10:30       12:00         10:45       12:15
9               12:00       13:30         12:15       13:45
10              13:30       15:00         13:45       15:15
11              15:00       16:30         15:15       16:45
12              16:30       18:00         16:45       18:15
13              18:00       19:30         18:15       19:45
14              19:30       21:00         19:45       21:15
15              21:00       22:30         21:15       22:45
16              22:30       00:00         22:45       00:15
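The open times in Table 9.2 can be reproduced with a few lines of Python. This is only an illustration of how the offset shifts the candle boundaries; the start date used is arbitrary.

from datetime import datetime, timedelta

def candle_open_times(first_date, duration_minutes, offset_minutes, count):
    # Candles start at 00:00 on the first date plus the offset, then repeat
    # every duration_minutes.
    start = datetime(first_date.year, first_date.month, first_date.day) + timedelta(minutes=offset_minutes)
    return [start + timedelta(minutes=duration_minutes * i) for i in range(count)]

# 90 minute candles with a 15 minute offset, as in Table 9.2.
for open_time in candle_open_times(datetime(2014, 1, 6), 90, 15, 4):
    print(open_time.strftime("%H:%M"))   # 00:15, 01:45, 03:15, 04:45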

9.1.1 Renko Bars

DeepThought can generate Renko bars. These are bars where the price movement is constant and the duration is variable. Two types are available, the difference being when a new bar is generated. Figures
9.1 and 9.2 illustrate the difference between the two types.

Figure 9.1: Type 1 Renko Bars

Figure 9.2: Type 2 Renko Bars


For type 1 Renko bars, a new bar is created when the price moves up or down by const-bar-price pips. For type 2 bars a new bar is created when the price moves const-bar-price pips above the previous high, or const-bar-price pips below the previous low. Note that for Renko bars the high is equal to the open for a down bar and to the close for an up bar, and conversely for the low. For Renko type 2 bars the price must move twice const-bar-price pips when the bar reverses direction relative to the previous bar. This can have the effect of twice the loss (or gain) if a reversal forecast is wrong (or correct) compared with forecasting a Renko bar in the same direction as the previous bar. In figures 9.1 and 9.2 note the relative positions of the bars labelled 1 and 2.
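To make the type 1 rule concrete, the following sketch builds type 1 Renko bar closes from a made-up price series. It illustrates the bar-building rule only and is not DeepThought's implementation.

def renko_type1_closes(prices, bar_size):
    # A new bar closes every time the price moves bar_size away (up or down)
    # from the close of the previous bar.
    closes = [prices[0]]
    for price in prices[1:]:
        while abs(price - closes[-1]) >= bar_size:
            direction = 1 if price > closes[-1] else -1
            closes.append(round(closes[-1] + direction * bar_size, 4))
    return closes

# 20 pip bars (0.0020 in raw price terms) on a made-up EURUSD price path.
prices = [1.3000, 1.3012, 1.3025, 1.3041, 1.3030, 1.2995, 1.2979]
print(renko_type1_closes(prices, 0.0020))
# [1.3, 1.302, 1.304, 1.302, 1.3, 1.298]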
A configuration snippet that generates Renko bars with a 20 pip price movement is given below.
<bar-series>
  <identifier>EURUSDm1</identifier>
  <bar-series-type>const-time</bar-series-type>
  <source type="database">eurusd.db</source>
  <price-to-pip-multiplier>10000.0</price-to-pip-multiplier>
  <average-spread>1.5</average-spread>
  <bar-duration-minutes>1</bar-duration-minutes>
</bar-series>
<bar-series>
  <identifier>EURUSDh4</identifier>
  <bar-series-type>const-price-method-2</bar-series-type>
  <source type="bar-series">EURUSDm1</source>
  <price-to-pip-multiplier>10000.0</price-to-pip-multiplier>
  <average-spread>0.0</average-spread>
  <const-bar-price>20.0</const-bar-price>
</bar-series>

Fixed duration and Renko bars can be mixed.

9.1.2 Summary of bar-series Options

Table 9.3 summarises the options available when defining bar-series objects.
Table 9.3: bar-series configuration options.
identifier
    A unique identifier that identifies this bar series to other configuration sections.

bar-series-type
    One of:
    const-time            for fixed duration candles.
    const-price-method-1  for type 1 Renko bars.
    const-price-method-2  for type 2 Renko bars.

source type
    One of:
    database    if the bar series is 1 minute candles stored in a database.
    bar-series  if the bar series is generated from a 1 minute source.
    The source value defines which database or bar-series to use.

price-to-pip-multiplier
    The multiplier applied to the smallest price change to give 1 pip. For example, if a pip is defined (by the broker) to be a price change of 0.0001, the multiplier would be 10000 to get 1 pip.

average-spread
    The average spread in pips to use during backtesting and paper-trading.

bar-duration-minutes
    If the bar-series-type has been defined as const-time, this option is mandatory. It is the duration of a candle in minutes.

const-bar-price
    If the bar-series-type has been defined as either const-price-method-1 or const-price-method-2, this is the price movement, in pips, that defines a bar.

delay-minutes-offset
    Optional parameter if the bar-series-type has been defined as const-time. Specifies the offset in minutes as described above.

load-from-date
    Optional parameter. Used to load data from a specific date in the format YYYY-MM-DD. Useful for live trading to speed up spin-up time by not loading data that is not required. This should be set to a date no later than is required to create a training set.

9.2 bar-series-collection

This section specifies where Sqlite databases are located. The configuration snippet is given
below.
<bar-series-collection>
  <data-file-dir>C:\FX_Database</data-file-dir>
</bar-series-collection>

The data-file-dir option sets the directory of the Sqlite databases.

9.3 model

The model defines the features, target and when to forecast. It contains a single target section
which also defines the trigger (i.e. when to forecast) and as many feature sections as we need. A sample configuration snippet is given below:
<model>
  <identifier>h4-features</identifier>
  <target>
    <type>bars-in-future</type>
    <identifier>target-1-bar-in-future</identifier>
    <bar-series>EURUSDh4</bar-series>
    <number>1</number>
    <price-type>up-down</price-type>
  </target>
  <feature>
    <type>hour-of-day</type>
    <period>h4</period>
  </feature>
  <feature>
    <type>bar-attribute</type>
    <attribute-type>average-close</attribute-type>
    <number>30</number>
    <value-type>diff</value-type>
    <bar-series>EURUSDh4</bar-series>
    <scale-type>min-max</scale-type>
    <outlier-percentile>1</outlier-percentile>
  </feature>
  <feature>
    <type>moving-average</type>
    <ma-attribute-type>average-close</ma-attribute-type>
    <period>5</period>
    <selection-list>1,2,3,4,5,7,9,13,16,20,25,31,45,55,70,100</selection-list>
    <bar-series>EURUSDh4</bar-series>
    <scale-type>min-max</scale-type>
  </feature>
  <feature>
    <type>moving-average</type>
    <ma-attribute-type>average-close</ma-attribute-type>
    <period>10</period>
    <selection-list>1,2,3,4,5,7,9,13,16,20,25,31,45,55,70,100</selection-list>
    <bar-series>EURUSDh4</bar-series>
    <scale-type>min-max</scale-type>
  </feature>
  <feature>
    <type>moving-average</type>
    <ma-attribute-type>average-close</ma-attribute-type>
    <period>20</period>
    <selection-list>1,2,3,4,5,7,9,13,16,20,25,31,45,55,70,100</selection-list>
    <bar-series>EURUSDh4</bar-series>
    <scale-type>min-max</scale-type>
  </feature>
</model>

This example model comprises a target of whether the price will be higher or lower by 1 bar in the future, and the following features:
- The hour of day binarised into 6 attributes.
- The close price differences between the previous 30 bars.
- Exponential moving averages of periods 5, 10 and 20, with the attributes being 16 differences spread across the past 100 values.
All continuous features are normalised using the min-max scheme.

9.4 Features

The currently available feature list is given in table 9.4. There are plans to increase this list
in future releases and include the ability to create your own features using Python scripts. All
features must define a type which tells the system what the feature is. The configuration for
each feature is placed within a model section as a model comprises features and one target.
Table 9.4: Features used as independent inputs to machine learning models.
hour-of-day
    The hour of day. Can be continuous (0-23), categorical to nearest H4 (6 binarised attributes), or categorical to H1 (24 binarised attributes).

day-of-week
    Day of the week. Can be continuous (0-6) or categorical with 7 binarised attributes.

bar-diff
    The difference between two candle prices such as high-high or close-close.

bar-attribute
    Similar to bar-diff. Can be absolute values such as volume, or differences between candle attributes such as average price (the average close of the 1 minute candles contained within the candle).

moving-average
    The popular and ever-present moving average.

python-script
    Custom feature in a Python script.

Each individual feature and its options are detailed in the following sections.

9.4.1 hour-of-day

An example hour-of-day configuration snippet is given below and table 9.5 details the parameter options.
<feature>
  <type>hour-of-day</type>
  <period>h4</period>
</feature>

Table 9.5: hour-of-day feature.


type
    Must be hour-of-day.

period
    Defines the way in which the hour is encoded as attributes. Takes one of the following values:
    H1      Binarises with 24 attributes. Refer to section 4.8 for details on binarising features.
    H4      Discretises the hour to the most recent H4 open time so we have 6 possible values. Thus hours 0,1,2,3 become 1,0,0,0,0,0 and 4,5,6,7 become 0,1,0,0,0,0 and 20,21,22,23 become 0,0,0,0,0,1 etc.
    single  Treats the hour as a continuous variable, using min-max scaling.
    none    Disable this feature. Useful as a value for the genetic algorithm to turn this feature on and off.
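As an illustration of the H4 encoding described in the table, the following sketch maps an hour to the 6 binarised attributes (illustrative only; DeepThought does this internally):

def hour_of_day_h4(hour):
    # Discretise the hour to the most recent H4 open (0, 4, 8, 12, 16, 20)
    # and binarise into 6 attributes.
    attributes = [0, 0, 0, 0, 0, 0]
    attributes[hour // 4] = 1
    return attributes

print(hour_of_day_h4(2))    # [1, 0, 0, 0, 0, 0]  (hours 0-3)
print(hour_of_day_h4(5))    # [0, 1, 0, 0, 0, 0]  (hours 4-7)
print(hour_of_day_h4(23))   # [0, 0, 0, 0, 0, 1]  (hours 20-23)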

9.4.2 day-of-week

The day-of-week configuration snippet is:


<feature>
  <type>day-of-week</type>
  <representation>binary</representation>
</feature>

Table 9.6 details the options for the day-of-week feature.


Table 9.6: day-of-week feature.
type
    Must be day-of-week.

representation
    Defines the way in which the day is encoded as attributes. Takes one of the following values:
    binary  Binarises with 7 attributes. Refer to section 4.8 for details on binarising features.
    single  Treats the day as a continuous variable, using min-max scaling.
    none    Disable this feature. Useful as a value for the genetic algorithm to turn this feature on and off.

9.4.3 bar-diff

An example bar-diff snippet that extracts price differences between close prices is:
<feature>
  <type>bar-diff</type>
  <diff-type>close</diff-type>
  <bar-series>EURUSDh4</bar-series>
  <min-max-clamp>0.015</min-max-clamp>
  <scale-type>none</scale-type>
  <outlier-percentile>1</outlier-percentile>
  <selection-list>1,2,3,5,7,13,20,55</selection-list>
</feature>

This example creates eight attributes with the values given in table 9.7.
Table 9.7: Price difference examples for the bar-diff feature.
Attribute 1    The price difference between the close price of the last completed candle at the forecast/sample time and the close price of 1 candle before.
Attribute 2    The price difference between 1 and 2 candles before the forecast/sample time.
Attribute 3    The price difference between 2 and 3 candles before the forecast/sample time.
Attribute 4    The price difference between 3 and 5 candles before the forecast/sample time.
Attribute 5    The price difference between 5 and 7 candles before the forecast/sample time.
Attribute 6    The price difference between 7 and 13 candles before the forecast/sample time.
Attribute 7    The price difference between 13 and 20 candles before the forecast/sample time.
Attribute 8    The price difference between 20 and 55 candles before the forecast/sample time.
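The attribute values in Table 9.7 are simply close-price differences taken at the listed candle offsets. The sketch below shows that calculation on a made-up list of close prices (most recent candle first); it is illustrative only and not DeepThought's implementation.

def bar_diff_attributes(closes, selection_list):
    # closes[0] is the last completed candle at forecast time, closes[1] the
    # candle before it, and so on. The attributes are the differences between
    # consecutive entries of the selection list, starting from candle 0.
    indexes = [0] + list(selection_list)
    return [round(closes[indexes[i]] - closes[indexes[i + 1]], 5)
            for i in range(len(indexes) - 1)]

# A made-up series of close prices, most recent first.
closes = [round(1.30 + 0.0001 * i, 4) for i in range(60)]
print(bar_diff_attributes(closes, [1, 2, 3, 5, 7, 13, 20, 55]))   # eight attributes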

These price differences are set by the selection-list option. We can also use the number
option instead of selection-list. For example:
<feature>
  <type>bar-diff</type>
  <diff-type>close</diff-type>
  <bar-series>EURUSDh4</bar-series>
  <min-max-clamp>0.015</min-max-clamp>
  <scale-type>zscore</scale-type>
  <number>80</number>
</feature>

will generate 80 attributes of the price differences between the 80 candles immediately before
the forecast/sample time.
Table 9.8 lists all the options available for the bar-diff feature.


Table 9.8: bar-diff feature parameter options.

Option               Description
type                 Must be bar-diff.
diff-type            The type of data used to generate the attributes. Takes
                     one of the following values:
                     close               The close price.
                     high                The price of the high.
                     low                 The price of the low.
                     high-open           The high price minus the open price
                                         of the same candle.
                     high-close          The high price minus the close price
                                         of the same candle.
                     close-low           The close price minus the low price
                                         of the same candle.
                     open-to-close       The close price minus the open price
                                         of the same candle.
                     prev-close-to-open  The open price minus the close of the
                                         previous candle.
selection-list       Comma separated list of candle indexes from the candle at
                     the time of forecast/sample for which to calculate the
                     price differences.
bar-series           The identifier of the bar-series.
scale-type           The type of scaling used to normalise the features. Takes
                     one of the following values:
                     min-max   Scale all values between -1 and 1 using the
                               minimum and maximum values for the feature value.
                     zscore    For each feature value, subtract the mean and
                               divide by the standard deviation. The resulting
                               scaled feature has a mean of zero and a standard
                               deviation of one.
                     div-sd    Divide each feature by the standard deviation.
                     div-max   Divide each feature by the maximum of the
                               absolute values of the maximum and minimum.
                     log10     Take the base-10 logarithm of each feature value.
                     none      Do not use any scaling.
outlier-percentile   Optional. The percentile to use to remove outliers. If
                     outlier-percentile is not supplied, then no outliers are
                     removed. If set to 1, this setting will use the values at
                     the 1% and 99% percentiles as the min and max. All values
                     higher/lower than this will be trimmed to these percentile
                     values.
number               The number of attributes in this feature. Use this if
                     selection-list is not used.
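
The scale-type and outlier-percentile options can be visualised with a small numpy sketch. This is purely illustrative and not how DeepThought computes scaling internally.

import numpy as np

def clamp_outliers(values, percentile):
    # Trim everything outside the given percentile band, as outlier-percentile does.
    lo, hi = np.percentile(values, [percentile, 100 - percentile])
    return np.clip(values, lo, hi)

def scale(values, scale_type):
    v = np.asarray(values, dtype=float)
    if scale_type == "min-max":                       # into [-1, 1]
        return 2.0 * (v - v.min()) / (v.max() - v.min()) - 1.0
    if scale_type == "zscore":                        # zero mean, unit sd
        return (v - v.mean()) / v.std()
    if scale_type == "div-sd":
        return v / v.std()
    if scale_type == "div-max":
        return v / max(abs(v.max()), abs(v.min()))
    if scale_type == "log10":                         # assumes positive values
        return np.log10(v)
    return v                                          # "none"

raw = np.random.normal(0.0, 0.002, 1000)              # synthetic price differences
print(scale(clamp_outliers(raw, 1), "zscore")[:5])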

9.4.4  bar-attribute

This feature has many options. In a future version this feature configuration and the bar-diff
feature configuration will be merged. An example bar-attribute snippet that extracts the
most recent 30 price differences of the average-close price is:
<feature>
<type>bar-attribute</type>
<attribute-type>average-close</attribute-type>
<value-type>diff</value-type>
<number ga-subst="average-close-num">30</number>
<bar-series>EURUSDh4</bar-series>
<scale-type>min-max</scale-type>
</feature>

The average-close is the average close price of all the 1 minute candles contained within an
individual candle in the EURUSDh4 bar-series. Table 9.9 lists all the options available for the
bar-attribute feature.
Table 9.9: bar-attribute feature parameter options.

Option               Description
type                 Must be bar-attribute.
attribute-type       The type of data used to generate the attributes. Takes
                     one of the following values:
                     average-close          The average of the close prices of
                                            the 1 minute candles contained
                                            within the candle.
                     average-hlc            The average of the high, low and
                                            close prices.
                     volume                 The volume traded during the candle.
                                            For Forex this is the number of
                                            times the price changed, a de facto
                                            proxy for volume.
                     minute-high            The number of minutes since the open
                                            that the high price was reached.
                     minute-low             The number of minutes since the open
                                            that the low price was reached.
                     mins-between-high-low  The number of minutes between the
                                            time that the high price was reached
                                            and the time that the low price was
                                            reached.
value-type           The way the feature is constructed. Takes one of the
                     following values:
                     value   The raw value of the feature. Use with care so as
                             not to expose the predictors to values they have
                             not seen in training.
                     diff    The differences between the values at the candle
                             indexes given by selection-list (or over the most
                             recent number candles).
bar-series           The identifier of the bar-series.
selection-list       Comma separated list of candle indexes from the candle at
                     the time of forecast/sample for which to calculate the
                     differences.
scale-type           The type of scaling to use to normalise the features.
                     Takes one of the following values:
                     min-max   Scale all values between -1 and 1 using the
                               minimum and maximum values for the feature value.
                     zscore    For each feature value, subtract the mean and
                               divide by the standard deviation. The resulting
                               scaled feature has a mean of zero and a standard
                               deviation of one.
                     div-sd    Divide each feature by the standard deviation.
                     div-max   Divide each feature by the maximum of the
                               absolute values of the maximum and minimum.
                     log10     Take the base-10 logarithm of each feature value.
                     none      Do not use any scaling.
outlier-percentile   Optional. The percentile to use to remove outliers. If
                     outlier-percentile is not supplied, then no outliers are
                     removed. If set to 1, this setting will use the values at
                     the 1% and 99% percentiles as the min and max. All values
                     higher/lower than this will be trimmed to these percentile
                     values.
number               The number of attributes in this feature.

9.4.5  moving-average

This feature is the ubiquitous moving average. Future versions of DeepThought will enable you
to code your own indicators using Python or similar. A sample configuration snippet:

<feature>
<type>moving-average</type>
<ma-attribute-type>average-close</ma-attribute-type>
<period>10</period>
<selection-list>1,2,3,4,5,7,9,13,16,20,25,31,45,55,70,100</selection-list>
<bar-series>EURUSDh4</bar-series>
<scale-type>zscore</scale-type>
</feature>
<feature>
<type>moving-average</type>
<ma-attribute-type>average-close</ma-attribute-type>
<period>20</period>
<selection-list>1,2,3,4,5,7,9,13,16,20,25,31,45,55,70,100</selection-list>
<bar-series>EURUSDh4</bar-series>
<scale-type>zscore</scale-type>
</feature>

This example is for two features with 16 attributes each. The selection-list option controls
which candle indexes are used to generate the attributes, as shown in table 9.7. The options for
the moving-average feature are in table 9.10.
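
For orientation, a simple moving average over the average-close values can be computed with a few lines of Python. This is only an illustration; the assumption that the attributes are drawn from the moving-average series at the candle indexes in selection-list is ours, not a statement of DeepThought's exact internals.

def sma(values, period):
    # Simple moving average; values[0] is assumed to be the most recent candle.
    return [sum(values[i:i + period]) / period
            for i in range(len(values) - period + 1)]

average_closes = [1.3500 - 0.0004 * i for i in range(120)]   # synthetic data
ma10 = sma(average_closes, 10)
selection_list = [1, 2, 3, 4, 5, 7, 9, 13, 16, 20, 25, 31, 45, 55, 70, 100]
print([round(ma10[i], 5) for i in selection_list])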


Table 9.10: moving-average feature parameter options.

Option               Description
type                 Must be moving-average.
ma-attribute-type    The type of data used to generate the attributes. Takes
                     one of the following values:
                     open                   The open price.
                     high                   The price of the high.
                     low                    The price of the low.
                     close                  The price at the close of the
                                            candle.
                     average-close          The average of the 1 minute candles
                                            that are contained within the
                                            candle.
                     average-hlc            The average of the high, low and
                                            close.
                     volume                 The volume traded during the candle.
                                            For Forex this is the number of
                                            times the price changed, a de facto
                                            proxy for volume.
                     bar-duration           The duration in minutes of the Renko
                                            bar. Not relevant for fixed-duration
                                            candles.
                     mins-between-high-low  The number of minutes between the
                                            time the high was reached and the
                                            time that the low was reached.
                     time-high              The number of minutes since the open
                                            that the high price was reached.
                     time-low               The number of minutes since the open
                                            that the low price was reached.
period               The period of the moving average.
bar-series           The identifier of the bar-series.
scale-type           The type of scaling to use to normalise the features.
                     Takes one of the following values:
                     min-max   Scale all values between -1 and 1 using the
                               minimum and maximum values for the feature value.
                     zscore    For each feature value, subtract the mean and
                               divide by the standard deviation. The resulting
                               scaled feature has a mean of zero and a standard
                               deviation of one.
                     div-sd    Divide each feature by the standard deviation.
                     div-max   Divide each feature by the maximum of the
                               absolute values of the maximum and minimum.
                     log10     Take the base-10 logarithm of each feature value.
                     none      Do not use any scaling.
outlier-percentile   Optional. The percentile to use to remove outliers. If
                     outlier-percentile is not supplied, then no outliers are
                     removed. If set to 1, this setting will use the values at
                     the 1% and 99% percentiles as the min and max. All values
                     higher/lower than this will be trimmed to these percentile
                     values.

9.4.6  csv-feature

The CSV feature enables you to use your own data in CSV format. For backtesting, the data
can be generated from Metatrader via a script; one is provided in the Metatrader Scripts
folder in the DeepThought install directory. When live/paper trading the CSV must also be
generated by the EA, and the example expert advisors demonstrate how to do this. If you are
using another trading platform, you should be able to generate this file from that platform.
A sample configuration snippet is given below.

<feature>
<type>csv-feature</type>
<filename>C:\IBFX-MT4-AU\experts\files\EURUSD_CCI.csv</filename>
<identifier>cci_feature</identifier>
<value-type>value</value-type>
<selection-list>1,2,3,5,8,13,21,34</selection-list>
</feature>

File Format
The file format of the CSV file is:
YYYY.mm.DD,HH:MM,%value%

where %value% is a double/float value. The dates must be in decreasing order from the top of
the file, i.e. newest at the beginning and oldest at the end of the file. When backtesting,
DeepThought will look in the CSV file to find the closest previous value with a date equal to
or earlier than the date that the forecast is being taken. The <selection-list> indexes are
defined from this point. The first few entries of an example CSV file are given below:
2014.02.04,08:00,-16.56445518
2014.02.04,04:00,-7.55706836
2014.02.04,00:00,-46.66780044
2014.02.03,20:00,-29.3612079
2014.02.03,16:00,-24.97550049
2014.02.03,12:00,-46.92997607
2014.02.03,08:00,-64.23733597
2014.02.03,04:00,-97.95026325
2014.02.03,00:00,-97.8667956
2014.02.02,23:00,-108.98997431
2014.01.31,20:00,-123.76219274
2014.01.31,16:00,-147.9225595
2014.01.31,12:00,-171.87121273
2014.01.31,08:00,-111.88052857
...
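
For reference, a file in this format could be produced with a short Python sketch such as the following. The rows reuse the example values above and the file name matches the configuration snippet; this is not the supplied Metatrader script, only an illustration of the required layout.

import csv

# Rows must be ordered newest first, as required by the csv-feature.
rows = [
    ("2014.02.04", "08:00", -16.56445518),
    ("2014.02.04", "04:00", -7.55706836),
    ("2014.02.04", "00:00", -46.66780044),
]

with open("EURUSD_CCI.csv", "w", newline="") as f:
    writer = csv.writer(f)
    for date, time, value in rows:
        writer.writerow([date, time, value])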

Table 9.11 lists the options for the csv-feature.


Table 9.11: csv-feature parameter options.

Option               Description
type                 Must be csv-feature.
filename             The fully qualified location of the file where the CSV
                     data is located, e.g.
                     C:\IBFX-MT4-AU\experts\files\EURUSD_CCI.csv
identifier           A unique identifier for this feature. Used to assign the
                     correct scaling and to identify items in output files.
                     Inbuilt features are able to generate this from their
                     parameters; however, as this is a user-defined feature the
                     identifier must be supplied manually.
value-type           Defines how to process the CSV values. Takes one of the
                     following values:
                     diff    The difference between values whose indexes are
                             specified by the selection-list. For example, if
                             the list is specified as 1,2,5 then the differences
                             used in the model will be the difference between
                             the values at indexes 0 and 1, 1 and 2, and 2 and 5.
                     value   Use the value directly. The first element in the
                             selection-list should be 0.
selection-list       The indexes of the values to take from the CSV file. The
                     indexes are relative to the date of the sample that is
                     being extracted. For example, if the backtester is
                     forecasting on 5th May 2013 at 8 am, index 0 (if value-type
                     is diff) will correspond to the closest matching prior
                     value to 2013.05.05,08:00. If there is a value at this time
                     it will be used, otherwise the first value before this date
                     is used.
scale-type           The type of scaling to use to normalise the features.
                     Takes one of the following values:
                     min-max   Scale all values between -1 and 1 using the
                               minimum and maximum values for the feature value.
                     zscore    For each feature value, subtract the mean and
                               divide by the standard deviation. The resulting
                               scaled feature has a mean of zero and a standard
                               deviation of one.
                     div-sd    Divide each feature by the standard deviation.
                     div-max   Divide each feature by the maximum of the
                               absolute values of the maximum and minimum.
                     log10     Take the base-10 logarithm of each feature value.
                     none      Do not use any scaling.
outlier-percentile   Optional. The percentile to use to remove outliers. If
                     outlier-percentile is not supplied, then no outliers are
                     removed. If set to 1, this setting will use the values at
                     the 1% and 99% percentiles as the min and max. All values
                     higher/lower than this will be trimmed to these percentile
                     values.

9.4.7  python-script

The python-script feature is detailed in section 8.2 on page 25 with example scripts. It
enables you to implement virtually any feature using Python scripting. An example configuration
snippet is given below.

<feature>
<type>python-script</type>
<set-parameter-value-func-name>SetParameterValue</set-parameter-value-func-name>
<get-number-of-attributes-func-name>GetNumberOfAttributes</get-number-of-attributes-func-name>
<get-features-func-name>GetFeatures</get-features-func-name>
<parameter name="ma\_short\_period" type="int">20</parameter>
<parameter name="ma\_long\_period" type="int">50</parameter>
<identifier>python-test-1</identifier>
<scale-type>min-max</scale-type>
</feature>

Table 9.12: python-script feature parameter settings.

Option                               Description
type                                 Must be python-script.
set-parameter-value-func-name        The name of the function that is used to
                                     set parameters used by other functions in
                                     the script. Typically SetParameterValue.
get-number-of-attributes-func-name   The name of the function that returns the
                                     number of attributes set by the function
                                     set in get-features-func-name. Typically
                                     GetNumberOfAttributes.
get-features-func-name               The name of the function that is
                                     responsible for generating the numerical
                                     values of the attributes. Typically
                                     GetFeatures.
identifier                           A unique identifier for this feature.
parameter                            An optional parameter that is passed via
                                     set-parameter-value-func-name. You can
                                     have as many parameters defined as you
                                     need. The values are able to be set using
                                     the genetic algorithm if desired. Two
                                     attributes must be set with this element:
                                     name   Name of the parameter which will be
                                            passed as a string to the function
                                            defined by
                                            set-parameter-value-func-name.
                                     type   Takes one of the following values:
                                            int, string, double. The value of
                                            parameter is passed as this type to
                                            the function defined by
                                            set-parameter-value-func-name.
scale-type                           The type of scaling to use to normalise
                                     the features. Takes one of the following
                                     values:
                                     min-max   Scale all values between -1 and
                                               1 using the minimum and maximum
                                               values for the feature value.
                                     zscore    For each feature value, subtract
                                               the mean and divide by the
                                               standard deviation. The resulting
                                               scaled feature has a mean of zero
                                               and a standard deviation of one.
                                     div-sd    Divide each feature by the
                                               standard deviation.
                                     div-max   Divide each feature by the
                                               maximum of the absolute values of
                                               the maximum and minimum.
                                     log10     Take the base-10 logarithm of
                                               each feature value.
                                     none      Do not use any scaling.
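
A minimal sketch of a script that such a configuration could call is shown below. The authoritative examples are in section 8.2; the function bodies and the exact arguments DeepThought passes (here assumed to be a list of recent close prices, newest first) are assumptions made only for illustration.

# Hypothetical python-script feature: short/long moving-average spread.
params = {}

def SetParameterValue(name, value):
    # Called for each <parameter> element, e.g. ("ma_short_period", 20).
    params[name] = value

def GetNumberOfAttributes():
    # This sketch produces a single attribute.
    return 1

def GetFeatures(closes):
    # 'closes' is assumed to be recent close prices, newest first.
    short_n = params.get("ma_short_period", 20)
    long_n = params.get("ma_long_period", 50)
    ma_short = sum(closes[:short_n]) / short_n
    ma_long = sum(closes[:long_n]) / long_n
    return [ma_short - ma_long]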

9.5  Targets

A target, also termed a label, is the dependent variable that we are trying to predict. Currently
DeepThought supports targeting future price changes, as well as a custom target where you
provide a Python script to compute the target. The target configuration forms part of a model
configuration. Each model configuration must have one and only one target section.

9.5.1  bars-in-future

This is a built-in target where we are predicting the change in price at the close of one or more
candles in the future. A sample configuration snippet is:

<target>
<type>bars-in-future</type>
<identifier>target-1-bar-in-future</identifier>
<bar-series>EURUSDh4</bar-series>
<number>1</number>
<price-type>up-down</price-type>
</target>

This example labels a training set with a target of 1 bar in the future on the EURUSDh4 bar series.
A forecast will attempt to predict this label. The price-type option in this case returns +1 if the
price will move up and -1 if the price will move down. Table 9.13 summarises the options.
Table 9.13: bars-in-future target.

Option        Description
type          Must be bars-in-future.
identifier    A unique identifier for this target.
bar-series    The bar-series that the target is calculated on.
price-type    The type of data used to generate the target. Takes one of the
              following values:
              up-down   For classification, +1 if the price moves up, -1 if the
                        price moves down.
              close     For regression, the change in price of the close at
                        number bars in the future.
              high      For regression, the change in price of the high at
                        number bars in the future.
              low       For regression, the change in price of the low at
                        number bars in the future.
number        The number of bars to look into the future.
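
The up-down labelling can be illustrated with a couple of lines of Python (illustration only, not DeepThought source):

def up_down_label(close_now, close_n_bars_later):
    # +1 when the close moves up over the horizon, -1 when it moves down.
    return 1 if close_n_bars_later > close_now else -1

print(up_down_label(1.3520, 1.3531))   # -> 1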

9.5.2  python-script

You can provide a Python script to compute a target. This is covered in more detail with example
scripts in section 8.3 on page 28. An example configuration snippet is given below.

<target>
<type>python-script</type>
<identifier>target-next-pip-movement</identifier>
<bar-series>EURUSDm1</bar-series>
<set-parameter-value-func-name>TargetSetParameterValue</set-parameter-value-func-name>
<check-target-trigger-func-name>GetIsTargetTrigger</check-target-trigger-func-name>
<get-target-func-name>GetTarget</get-target-func-name>
<parameter name="pip-movement" type="double">20.0</parameter>
</target>

This example labels a training example with a target of +1 or -1 depending on whether the price
will move 20 pips up or 20 pips down. There is no time limit for the price movement, although
this could be coded in the Python script. Table 9.14 summarises the options.
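
A sketch of what such a script might look like is given below. The function names come from the configuration snippet above, but the arguments DeepThought actually passes are documented in section 8.3, so the signatures and trigger logic here are assumptions made only for illustration.

# Hypothetical python-script target: label +/-1 on a 20 pip move.
params = {}

def TargetSetParameterValue(name, value):
    params[name] = value                      # e.g. ("pip-movement", 20.0)

def GetIsTargetTrigger():
    # Assume a training example / forecast is triggered at every candle close.
    return True

def GetTarget(entry_price, current_price):
    # Compare the price now with the price when the example was triggered.
    pips = (current_price - entry_price) / 0.0001
    threshold = params.get("pip-movement", 20.0)
    if pips >= threshold:
        return 1
    if pips <= -threshold:
        return -1
    return None                               # target not yet reached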


Table 9.14: python-script target.

Option                           Description
type                             Must be python-script.
identifier                       A unique identifier for this target.
bar-series                       The bar-series to trigger from. At the close of
                                 each candle in this series, the functions
                                 defined by check-target-trigger-func-name and
                                 get-target-func-name are called.
set-parameter-value-func-name    The name of the function that is used to set
                                 parameters used by other functions in the
                                 script. Typically SetParameterValue.
check-target-trigger-func-name   The name of the function that checks whether
                                 the criteria to trigger a forecast has been
                                 reached. If it has, a training example is
                                 created and a forecast made. The training
                                 example is cached until the criteria to label a
                                 target has been met, as defined by the function
                                 given in get-target-func-name. Typically
                                 GetIsTargetTrigger.
get-target-func-name             The name of the function that checks whether a
                                 target has been reached. This function compares
                                 the state of the example when the trigger was
                                 reached with the current state of the market
                                 and assigns a target value if the target
                                 criteria has been reached. Typically GetTarget.
parameter                        An optional parameter that is passed via
                                 set-parameter-value-func-name. You can have as
                                 many parameters defined as you need. The values
                                 are able to be set using the genetic algorithm
                                 if desired. Two attributes must be set with
                                 this element:
                                 name   Name of the parameter which will be
                                        passed as a string to the function
                                        defined by
                                        set-parameter-value-func-name.
                                 type   Takes one of the following values: int,
                                        string, double. The value of parameter
                                        is passed as this type to the function
                                        defined by
                                        set-parameter-value-func-name.

9.6  Predictors

Your configuration can have as many predictor sections as you desire; the number is only
limited by the computing power of your hardware. The forecasts are combined and a majority
vote decides the signal direction. If there is no net agreement among the predictors, the signal
will be either hold or exit all. A threshold can be set in the signal-generator configuration
section so that signals are only generated if the sum of the forecasts is higher than this threshold.
Several predictor types are implemented, detailed in the following sections.
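
The voting behaviour can be pictured with a small Python sketch; this is a simplified illustration of the behaviour described above, not DeepThought's implementation.

def combine_forecasts(forecasts, weights, threshold=0.0):
    # Weighted sum of the individual predictor forecasts (e.g. +1/-1 values).
    total = sum(w * f for w, f in zip(weights, forecasts))
    if total > threshold:
        return "buy"
    if total < -threshold:
        return "sell"
    return "hold"

print(combine_forecasts([1, -1, 1], [1.0, 1.0, 1.0]))   # -> "buy"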

9.6.1  svm-predictor

The svm-predictor supports both classification and regression forecasts, and several kernel
types are available. We recommend starting with classification and the RBF kernel.
Below is a sample configuration snippet:

<svm-predictor>
<identifier>svm-c-rbf</identifier>
<predictor-weight>1.0</predictor-weight>
<model>h4-features</model>
<continuous-tune>false</continuous-tune>
<continuous-tune-num-param-sets>1</continuous-tune-num-param-sets>
<model-min-accuracy>54.0</model-min-accuracy>
<params> <!-- 56.1% -->
<penalty>512</penalty>
<gamma>0.25</gamma>
<forecast-weight>1.0</forecast-weight>
<svm-type>SVC</svm-type>
<kernel>rbf</kernel>
</params>
<params> <!-- 55.1% -->
<penalty>128</penalty>
<gamma>0.25</gamma>
<forecast-weight>1.0</forecast-weight>
<svm-type>SVC</svm-type>
<kernel>rbf</kernel>
</params>
<params> <!-- 54.7% -->
<penalty>2048</penalty>
<gamma>0.25</gamma>
<forecast-weight>1.0</forecast-weight>
<svm-type>SVC</svm-type>
<kernel>rbf</kernel>
</params>
<num-training-observations>2000</num-training-observations>
<num-training-skip>1</num-training-skip>
</svm-predictor>

This example uses the features generated by the h4-features model. It has three sets of
parameters, so it is actually a mini ensemble of three predictors. The forecast value for signal
generation purposes is the sum of the three predictors, so this example can only output the
following values: -3, -1, 1, 3. The predictor-weight is set at 1.0; if we had other predictors
we could adjust this option to weight the individual predictors.
In the example configuration above, a training set comprises 2000 training examples using a
sliding window of 1 defined by the num-training-skip option. This means that the 2000
training examples are sampled at the close of each candle using the most recent history and
working back in time. If we had set num-training-skip to 5, for example, then training
samples would be created every 5th candle.
Each individual SVM has its own set of parameters. In the above example all the SVMs are
classifiers, as svm-type is set to SVC, with an RBF kernel. This combination requires that the


penalty and gamma (Gaussian width of the RBF kernel) be defined.


Ensembles: Bucket of Models
A bucket of models is a type of ensemble where a number of models are tested and the
best one is selected for forecasting. DeepThought achieves this by providing the parameters
continuous-tune and model-min-accuracy. The continuous-tune parameter is set to True
to enable continuous selection of models. After each candle completes, a forecast is performed
and orders are placed. After order placement the system retrains. It first performs a
cross-validation across penalty and gamma for classification, and additionally epsilon
for regression. The best model is selected provided that its cross-validation accuracy is at
least the accuracy specified in model-min-accuracy. If not, the action specified by
no-model-behaviour is performed. To summarise the steps (a conceptual sketch of the
selection loop follows the list):
1. At strategy spin-up, run cross validation to find the best model. Alternatively load a
previously created model.
2. At the end of the candle, forecast and trade using the current model.
3. Run a cross-validation, including the newly completed candle in the training set.
4. If the best model(s) in the cross-validation set has accuracy of at least the accuracy specified by model-min-accuracy, then replace the model to be used at the next forecast.
5. Wait for the close of the candle, then repeat from step 2.
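
The following is a conceptual sketch of steps 3 and 4, using scikit-learn purely for illustration; DeepThought performs the equivalent cross-validation internally with its own SVM implementation, so nothing here is DeepThought code.

import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

def select_best_model(X, y, model_min_accuracy=0.54):
    best_acc, best_clf = -1.0, None
    for penalty in [2.0 ** e for e in range(5, 12)]:      # exp-2 style grid
        for gamma in [2.0 ** e for e in range(-4, 1)]:
            clf = SVC(C=penalty, gamma=gamma, kernel="rbf")
            acc = cross_val_score(clf, X, y, cv=5).mean()
            if acc > best_acc:
                best_acc, best_clf = acc, clf
    if best_acc >= model_min_accuracy:
        return best_clf.fit(X, y)       # replace the model used at the next forecast
    return None                         # fall back to no-model-behaviour

X = np.random.randn(200, 10)            # synthetic training set
y = np.where(np.random.rand(200) > 0.5, 1, -1)
model = select_best_model(X, y)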
svm-predictor Options
Table 9.15 lists all options for the svm-predictor.
Table 9.16 lists the options for the params sections of the svm-predictor.


Table 9.15: svm-predictor configuration options.

Option                           Description
identifier                       A unique identifier for this predictor. Used in
                                 log and output files.
predictor-weight                 The weight given to this predictor when
                                 multiple predictors are used. Can be negative.
model                            The identifier of the model that the predictor
                                 operates on.
continuous-tune                  Values are True or False. If set to True, a
                                 parameter search (for penalty, gamma) is
                                 conducted after each forecast, before
                                 retraining. The top
                                 continuous-tune-num-param-sets parameter sets
                                 are used. Note that any params sections are
                                 ignored if this option is set, as it produces
                                 params sections which can be continuously
                                 changing.
continuous-tune-num-param-sets   The number of params sections to produce when
                                 the continuous-tune parameter is set to True.
model-min-accuracy               The minimum accuracy for a new model to be
                                 selected with continuous-tune (see above for a
                                 detailed explanation).
no-model-behaviour               The action when a model cannot be found with
                                 accuracy of at least model-min-accuracy. Takes
                                 one of the following values:
                                 dont-trade           Close all orders and do
                                                      not forecast.
                                 use-last-model       Use the last best model(s).
                                 use-default-params   Use the parameters defined
                                                      in the <params> section.
params                           The parameter set of an individual SVM. It is
                                 mandatory to have at least one params section.
num-training-observations       The number of training examples in a training
                                 set. The larger this number, the further back
                                 in time samples are drawn from.
num-training-skip                A sliding window is used to select training
                                 examples. This parameter sets the number of
                                 bars that the sliding window is moved.


Table 9.16: params configuration options for the svm-predictor.

Option     Description
penalty    The penalty (sometimes written as C).
gamma      The Gaussian width of an RBF kernel when the kernel is set to rbf.
epsilon    The epsilon insensitivity, used when the svm-type is SVR.
degree     The polynomial degree when kernel is set to polynomial.
coeff      A coefficient used in the polynomial and sigmoid kernels.
svm-type   The prediction type of the SVM. Takes one of the following values:
           SVC   Classification (two class).
           SVR   Regression (continuous values).
kernel     The kernel to use. All kernels require the penalty and gamma options.
           Takes one of the following values:
           rbf          Gaussian Radial Basis Function. Requires the epsilon
                        option when svm-type is SVR.
                        k(x_i, x_j) = exp(-gamma * ||x_i - x_j||^2)
           linear       Linear kernel.
                        k(x_i, x_j) = x_i . x_j
           polynomial   Polynomial kernel. Requires degree to specify the
                        polynomial degree.
                        k(x_i, x_j) = (x_i . x_j + c)^d, where c = coeff and
                        d = degree
           sigmoid      Sigmoid (hyperbolic tangent) kernel.
                        k(x_i, x_j) = tanh(gamma * x_i . x_j + c), where
                        c = coeff

9.6.2  linear-svm-predictor

The linear-svm-predictor is an SVM supporting only linear models. Use it with caution, as
linear modelling may not be the best way to model financial markets; however, the
linear-svm-predictor may be useful as part of an ensemble containing different predictor types.
More information on linear SVMs is at http://www.csie.ntu.edu.tw/~cjlin/liblinear/. Below is
a sample configuration snippet:

<linear-svm-predictor>
<identifier>linear-svm-1</identifier>
<predictor-weight>1.0</predictor-weight>
<model>15min-features</model>
<params>
<penalty>1.0</penalty>
<solver-type>L2R_L2LOSS_SVC_DUAL</solver-type>
</params>
<params>
<penalty>256.0</penalty>
<epsilon>0.05</epsilon>
<solver-type>L2R_L2LOSS_SVR_DUAL</solver-type>
</params>
<num-training-observations>1000</num-training-observations>
<num-training-skip>1</num-training-skip>
</linear-svm-predictor>

This example uses the features generated by the 15min-features model. It has two sets of
SVM parameters, so it is actually a mini ensemble of two predictors. The first set defines a
classification predictor and the second set defines a regression predictor. The forecast value for
signal generation purposes is the sum of the two predictors. If we had other predictors, we could
adjust predictor-weight to weight the final value. This example uses a training set comprising
1000 training examples using a sliding window of 1 defined by the num-training-skip option.
This means that the 1000 training examples are sampled at the close of each candle using the
most recent history and working back in time. If we had set num-training-skip to 5, for
example, then training samples would be created every 5th candle.
Each individual SVM has its own set of parameters. In this example the first SVM is a classifier,
as solver-type is set to L2R_L2LOSS_SVC_DUAL, and the second forecasts a value (regression), as
the solver-type is L2R_L2LOSS_SVR_DUAL. It is probably not a good idea to mix classification
and regression like this; we show it here only as a configuration example.
Table 9.17 lists all options for the linear-svm-predictor and table 9.18 lists the options for
the params sections of the linear-svm-predictor.


Table 9.17: linear-svm-predictor configuration options.

Option                       Description
identifier                   A unique identifier for this predictor. Used in log
                             and output files.
predictor-weight             The weight given to this predictor when multiple
                             predictors are used. Can be negative.
model                        The identifier of the model that the predictor
                             operates on.
params                       The parameter set of an individual SVM. It is
                             mandatory to have at least one params section.
num-training-observations   The number of training examples in a training set.
                             The larger this number, the further back in time
                             samples are drawn from.
num-training-skip            A sliding window is used to select training
                             examples. This parameter sets the number of bars
                             that the sliding window is moved.

Table 9.18: params configuration options for the linear-svm-predictor.

Option        Description
penalty       The penalty (sometimes written as C).
epsilon       The epsilon insensitivity, used for regression problems. Not used
              for classification.
solver-type   The solver used in the SVM. Takes one of the following values
              (note SVC for classification and SVR for regression):
              L2R_LR                L2-regularized logistic regression (primal)
              L2R_L2LOSS_SVC_DUAL   L2-regularized L2-loss support vector
                                    classification (dual)
              L2R_L2LOSS_SVC        L2-regularized L2-loss support vector
                                    classification (primal)
              L2R_L1LOSS_SVC_DUAL   L2-regularized L1-loss support vector
                                    classification (dual)
              MCSVM_CS              Support vector classification by Crammer
                                    and Singer
              L1R_L2LOSS_SVC        L1-regularized L2-loss support vector
                                    classification
              L1R_LR                L1-regularized logistic regression
              L2R_LR_DUAL           L2-regularized logistic regression (dual)
              L2R_L2LOSS_SVR        L2-regularized L2-loss support vector
                                    regression (primal)
              L2R_L2LOSS_SVR_DUAL   L2-regularized L2-loss support vector
                                    regression (dual)
              L2R_L1LOSS_SVR_DUAL   L2-regularized L1-loss support vector
                                    regression (dual)

9.6.3  gbt-predictor

Gradient boosted trees are a decision-tree-based predictor. An initial decision tree is constructed
and subsequent trees are trained on the errors of the previous trees. They are good at generalising,
and overfitting tends not to be an issue. Below is an example configuration snippet:

<gbt-predictor>
<identifier>gbt-1</identifier>
<predictor-weight>1.0</predictor-weight>
<model>15min-features</model>
<params>
<num-trees>500</num-trees>
<depth>6</depth>
</params>
<params>
<num-trees>800</num-trees>
<depth>5</depth>
</params>
<num-training-observations>1000</num-training-observations>
<num-training-skip>1</num-training-skip>
</gbt-predictor>

The above example has a mini ensemble of two GBT predictors defined by the two sets of
parameters. These param sets apply in the same way as for the SVM predictor described in
section 9.6.1. Table 9.19 details the options for the gbt-predictor configuration section.
Table 9.19: gbt-predictor configuration options.

Option                       Description
identifier                   A unique identifier for this predictor. Used in log
                             and output files.
predictor-weight             The weight given to this predictor when multiple
                             predictors are used. Can be negative.
model                        The identifier of the model that the predictor
                             operates on.
params                       The parameter set of an individual GBT. It is
                             mandatory to have at least one params section.
num-training-observations   The number of training examples in a training set.
                             The larger this number, the further back in time
                             samples are drawn from.
num-training-skip            A sliding window is used to select training
                             examples. This parameter sets the number of bars
                             that the sliding window is moved.

Table 9.20 lists the options for the params section of the gbt-predictor.

Table 9.20: params configuration options for the gbt-predictor.

Option      Description
num-trees   The number of decision trees to use. The higher the number, the
            more accurate the prediction; however, it increases computation
            time with diminishing returns.
depth       The depth of the individual decision trees.

9.6.4  random-forest-predictor

The random forest predictor works by randomly selecting features and training samples from
the training set. Another way to view this: if we consider the training set to be a matrix,
decision trees are constructed on random selections of its rows and columns. This creates a
forest of decision trees. When forecasting, the majority class produced by all the trees is the
final prediction. The configuration is similar to the gbt-predictor configuration. Below is an
example configuration snippet:

<random-forest-predictor>
<identifier>gbt-1</identifier>
<predictor-weight>1.0</predictor-weight>
<model>15min-features</model>
<params>
<num-trees>500</num-trees>
<depth>6</depth>
</params>
<params>
<num-trees>800</num-trees>
<depth>5</depth>
</params>
<num-training-observations>1000</num-training-observations>
<num-training-skip>1</num-training-skip>
</random-forest-predictor>

The above example has a mini ensemble of two random forest predictors defined by the two
sets of parameters. These param sets apply in the same way as for the SVM predictor described
in section 9.6.1. Table 9.21 details the options for the random-forest-predictor
configuration section and table 9.22 lists the options for the params section of the
random-forest-predictor.
Table 9.21: random-forest-predictor configuration options.

Option                       Description
identifier                   A unique identifier for this predictor. Used in log
                             and output files.
predictor-weight             The weight given to this predictor when multiple
                             predictors are used. Can be negative.
model                        The identifier of the model that the predictor
                             operates on.
params                       The parameter set of an individual random forest.
                             It is mandatory to have at least one params
                             section.
num-training-observations   The number of training examples in a training set.
                             The larger this number, the further back in time
                             samples are drawn from.
num-training-skip            A sliding window is used to select training
                             examples. This parameter sets the number of bars
                             that the sliding window is moved.

Table 9.22: params configuration options for the random-forest-predictor.

Option      Description
num-trees   The number of decision trees to use in the forest.
depth       The maximum depth of the individual decision trees.

9.6.5  extremely-randomised-trees-predictor

Extremely randomised trees are a variation of random forests. The parameters are identical
to the random-forest parameters. The only difference is that the configuration section is
defined by extremely-randomised-trees-predictor. See section 9.6.4 above for details on
the configuration.

9.6.6  multi-layer-perceptron-predictor

Also known more popularly as a neural network. This predictor comprises one or more hidden
layers with a variable number of neurons per layer. The topology is illustrated in figure 4.2 on
page 8. Three variations are available for the multi-layer perceptron (MLP). These are:

Regression - forecasting the price move.
Classification - forecasting +1 or -1 for up/down.
Classification - forecasting a value between -1 and 1, where the sign (positive or negative)
indicates direction and the magnitude indicates the probability or certainty.

A sample configuration snippet is:

<multi-layer-perceptron-predictor>
<identifier>nn-1</identifier>
<predictor-weight>1.0</predictor-weight>
<model>h4-features</model>
<params>
<training-algo>rprop</training-algo> <!-- rprop|backpropagation -->
<hidden-layers>20</hidden-layers>
<activation-function>sigmoid</activation-function> <!-- identity|sigmoid|gaussian -->
<max-iterations>1000</max-iterations>
<termination-epsilon>0.01</termination-epsilon>
<forecast-type>classification</forecast-type> <!-- classification|regression. -->
<classification-output>value</classification-output> <!-- binary|value -->
</params>
<num-training-observations>1000</num-training-observations>
<num-training-skip>1</num-training-skip>
</multi-layer-perceptron-predictor>

Table 9.23 details the options for the multi-layer-perceptron-predictor configuration section and table 9.24 lists the options for the params sections.


Table 9.23: Multi-layer Perceptron multi-layer-perceptron-predictor options.

Option                       Description
identifier                   A unique identifier for this predictor. Used in log
                             and output files.
predictor-weight             The weight given to this predictor when multiple
                             predictors are used. Can be negative.
model                        The identifier of the model that the predictor
                             operates on.
params                       The parameter set of an individual multi-layer
                             perceptron. It is mandatory to have at least one
                             params section.
num-training-observations   The number of training examples in a training set.
                             The larger this number, the further back in time
                             samples are drawn from.
num-training-skip            A sliding window is used to select training
                             examples. This parameter sets the number of bars
                             that the sliding window is moved.

Table 9.24: params configuration options for the multi-layer-perceptron-predictor.

Option                  Description
training-algo           DeepThought supports two training algorithms.
                        training-algo must be one of:
                        rprop             Resilient backpropagation.
                        backpropagation   Standard back-propagation.
hidden-layers           Comma delimited list of the number of neurons in the
                        hidden layer(s). Use a single number for one hidden
                        layer.
activation-function     The function applied to the output of a neuron's value.
                        Must be one of:
                        identity   Use the output value as is.
                        sigmoid    The most commonly used function.
                        gaussian   An experimental Gaussian function.
max-iterations          Stop training after this many iterations.
termination-epsilon     Stop training when the error drops below this value.
                        Use 0 to disable.
forecast-type           Must be one of:
                        classification   Up/down classification problem.
                        regression       Regression problem forecasting the
                                         change in price.
classification-output   If forecast-type has been set as classification, two
                        variations are available. These are:
                        binary   Output is +1 or -1 for up/down.
                        value    Output is a value between -1 and 1, where the
                                 sign (positive or negative) indicates direction
                                 and the magnitude indicates the probability or
                                 certainty.

9.6.7  python-predictor

This is a custom predictor where you supply Python code. It is discussed in more detail
in section 8.4 on page 30 with example Python scripts. The predictor need not be machine
learning based; in fact it is possible to use standard mechanical technical analysis with a
Python predictor. A sample configuration snippet is given below.

<python-predictor>
<model>h4-features</model>
<identifier>python-predictor-h4</identifier>
<predictor-weight>1.0</predictor-weight>
<set-parameter-value-func-name>SetParameterValue</set-parameter-value-func-name>
<predict-func>Predict</predict-func>
<train-func>Train</train-func>
<parameter name="ma-long" type="int">20</parameter>
<parameter name="ma-short" type="int">5</parameter>
<num-training-observations>25</num-training-observations>
<num-training-skip>1</num-training-skip>
</python-predictor>

The above example is a configuration for a moving-average cross predictor where two parameters
are supplied to the Python script. Note that the number of training observations is set to only
25; in this example we only need enough observations to compute the moving averages. A sketch
of such a script is given below, and table 9.25 details the options for the python-predictor.
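
This is a hypothetical sketch only; the authoritative examples are in section 8.4, and the argument conventions assumed below (Train receiving recent close prices newest first, Predict returning +1/-1) may differ from the real interface.

# Hypothetical python-predictor: moving-average cross.
params = {}
state = {}

def SetParameterValue(name, value):
    params[name] = value                 # e.g. ("ma-short", 5), ("ma-long", 20)

def Train(closes):
    # No model fitting needed; just remember the recent closes (newest first).
    state["closes"] = list(closes)

def Predict():
    closes = state["closes"]
    short_n = params.get("ma-short", 5)
    long_n = params.get("ma-long", 20)
    ma_short = sum(closes[:short_n]) / short_n
    ma_long = sum(closes[:long_n]) / long_n
    return 1 if ma_short > ma_long else -1     # +1 buy bias, -1 sell bias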

9.7  predictor-ensemble

The predictor-ensemble section defines only a single option. A sample configuration snippet is given below.

<predictor-ensemble>
<retrain-period>Weekly</retrain-period>
</predictor-ensemble>

The retrain-period option controls when all predictors are retrained after forecasting and
signal generation. The values for retrain-period are given in table 9.26.


Table 9.25: python-predictor parameter settings.

Option                          Description
identifier                      A unique identifier for this predictor.
predictor-weight                The weight given to this predictor when multiple
                                predictors are used. Can be negative.
model                           The identifier of the model that the predictor
                                operates on.
set-parameter-value-func-name   The name of the function that is used to set
                                parameters used by other functions in the
                                script. Typically SetParameterValue.
predict-func                    The name of the function that performs the
                                prediction. Typically Predict.
train-func                      The name of the function that does training on
                                historical data. Typically Train.
parameter                       An optional parameter that is passed via
                                set-parameter-value-func-name. You can have as
                                many parameters defined as you need. The values
                                are able to be set using the genetic algorithm
                                if desired. Two attributes must be set with this
                                element:
                                name   Name of the parameter which will be
                                       passed as a string to the function
                                       defined by
                                       set-parameter-value-func-name.
                                type   Takes one of the following values: int,
                                       string, double. The value of parameter is
                                       passed as this type to the function
                                       defined by
                                       set-parameter-value-func-name.
num-training-observations      The number of training examples in a training
                                set. The larger this number, the further back in
                                time samples are drawn from.
num-training-skip               A sliding window is used to select training
                                examples. This parameter sets the number of bars
                                that the sliding window is moved.

Table 9.26: retrain-period options for the predictor-ensemble.

Option     Description
none       Don't retrain after the initial training.
each-bar   Train after each bar of the bar-series defined by the target in each
           model.
daily      Retrain each day at 00:00.
weekly     Retrain weekly after the first forecast on Monday.
monthly    Retrain monthly after the first forecast on the first Monday of the
           month.

9.8  signal-generator

The signal-generator configuration section defines how a signal is created from a predictor
ensemble. A signal in this sense is an action to do something: this could be buy, sell, place a
limit order, cancel unfilled orders, do nothing, or close all trades. You can control the action
of the signals in the Metatrader EA provided. You can also supply optional Python script to
combine forecasts into a signal. More detail on Python scripting in the signal generator is given
in section 8.5 on page 32. A configuration snippet is shown below.

<signal-generator>
<entry-times>
<hour>all</hour>
<day-of-week>all</day-of-week>
</entry-times>
<entry-threshold>0.0</entry-threshold>
<set-parameter-value-func-name>SetParameterValue</set-parameter-value-func-name>
<combine-forecasts-func-name>CombineForecasts</combine-forecasts-func-name>
<parameter name="threshold" type="double">20.0</parameter>
<take-profit>50.0</take-profit>
<stop-loss>0.0</stop-loss>
<break-even>20.0</break-even>
<exit-all-hour>-1</exit-all-hour>
<trade-bar-series>EURUSDm1</trade-bar-series>
<reverse-signals>False</reverse-signals>
</signal-generator>

This example will place trades at any time of the day on any day of the week. There is no
threshold, so if the ensemble returns a positive value it will place a buy and if the ensemble
returns a negative value it will place a sell. Take profit is set to 50 pips, and a stop loss will
be moved to break-even +1 pip when an open position is 20 pips in profit. Table 9.27 lists the
options for the signal-generator section.
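
The break-even behaviour can be illustrated with a small Python sketch (illustration only, not DeepThought's order management code; a pip size of 0.0001 is assumed):

def breakeven_stop(entry_price, current_price, break_even_pips=20.0,
                   pip_size=0.0001, is_long=True):
    # Once profit reaches break_even_pips, move the stop to break-even +1 pip.
    profit_pips = (current_price - entry_price) / pip_size
    if not is_long:
        profit_pips = -profit_pips
    if profit_pips >= break_even_pips:
        return entry_price + (pip_size if is_long else -pip_size)
    return None                                   # stop loss left unchanged

print(breakeven_stop(1.3500, 1.3525))             # -> 1.3501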
Table 9.27: signal-generator configuration options.

Option                          Description
entry-times                     Defines the hours and days that the signal
                                generator will generate signals for. The
                                parameters in this subsection are:
                                hour          Use all for all hours, or a comma
                                              separated list of the allowed
                                              hours. For example, to trade only
                                              at 10am, 12pm and 4pm the entry
                                              would be <hour>10,12,16</hour>.
                                day-of-week   Use all for all days, or a comma
                                              separated list of the allowed
                                              days. Numbers (0 is Sunday) or
                                              three character representations
                                              can be used. For example, to only
                                              trade on Tuesday, Wednesday and
                                              Thursday the entry would be
                                              <day-of-week>Tue,Wed,Thu</day-of-week>
                                              or <day-of-week>2,3,4</day-of-week>.
entry-threshold                 This is the value that the ensemble must exceed
                                to generate a signal.
set-parameter-value-func-name   Optional. Only needed if you are using Python
                                scripting. The name of the function that is
                                used to set parameters used by other functions
                                in the script. Typically SetParameterValue.
combine-forecasts-func-name     Optional. Only needed if you are using Python
                                scripting. The name of the function that is
                                used to combine the forecasts from the
                                predictors into a signal. This function must
                                send buy/sell/close signals as it overrides the
                                rules built into DeepThought. Typically
                                CombineForecasts.
parameter                       Only needed if using Python scripting and your
                                Python function requires parameters. Defines a
                                parameter that is passed via
                                set-parameter-value-func-name. You can have as
                                many parameters defined as you need. The values
                                are able to be set using the genetic algorithm
                                if desired. Two attributes must be set with this
                                element:
                                name   Name of the parameter which will be
                                       passed as a string to the function
                                       defined by
                                       set-parameter-value-func-name.
                                type   Takes one of the following values: int,
                                       string, double. The value of parameter is
                                       passed as this type to the function
                                       defined by
                                       set-parameter-value-func-name.
take-profit                     Optional. Set to 0.0 to disable. The take profit
                                in pips set when an order is sent to the broker.
stop-loss                       Optional. Set to 0.0 to disable. The stop loss
                                in pips set when an order is sent to the broker.
break-even                      Optional. Set to 0.0 to disable. If an open
                                position exceeds this value in pips, a stop loss
                                is set to break even +1 pip.
exit-all-hour                   Optional. Set to -1 to disable. The hour of the
                                day to close all trades. For example, if this
                                was set to 12, then at 12pm each day all trades
                                will be closed.
trade-bar-series                The identifier of the bar-series that the orders
                                are placed on.
reverse-signals                 If set to True, will reverse all signals: sell
                                instead of buy and buy instead of sell. Use with
                                caution and only when you are certain that your
                                ensemble is reliably wrong.

9.9  trader

The trader is responsible for simulating trades and passing signals to the trading platform. A
sample configuration snippet is given below and table 9.28 lists the options for trader.

<trader>
<hold-minutes>0</hold-minutes>
<hold-bars>0</hold-bars>
<max-drawdown>100000</max-drawdown>
<scale-out>False</scale-out>
<max-position>100</max-position>
<direction>both</direction>
<limit-orders offset="0.0">False</limit-orders>
</trader>

Table 9.28: trader configuration options.

Option         Description
hold-minutes   The number of minutes to hold an open position. After a position
               has been open for the number of minutes given here, the position
               will be automatically closed. This auto-close function is
               disabled if the value is set to 0.
hold-bars      The number of bars to hold an open position. After a position
               has been open for the number of bars given here, the position
               will be automatically closed. This auto-close function is
               disabled if the value is set to 0.
max-drawdown   A backtest will be halted if a drawdown of this many pips is
               encountered. Useful for the genetic algorithm to abandon bad
               configurations.
scale-out      If set to True, reduces the number of positions (by two) when a
               reverse signal is encountered. For example, if we are long by 5
               positions and a sell signal is received, the position will be
               reduced to 3. If this option is set to False then the 5
               positions will all be closed and a sell position opened (or a
               sell limit order placed if the limit-orders option has been
               set).
max-position   The maximum number of open positions at any point in time.
direction      The direction to trade in. Takes one of the following values:
               both    Trade in both directions.
               long    Only take long trades.
               short   Only take short trades.
limit-orders   If this is True, limit orders will be placed, offset pips below
               the bid for a buy order and offset pips above the ask for a sell
               order. If limit-orders is False then market orders are sent.

9.10  backtest

The backtest configuration section is where parameters only related to backtesting are set.
This section is ignored when live and paper trading. A sample configuration snippet is given
below and table 9.29 lists the options.

<backtest>
<start-date>2013-01-01</start-date>
<stop-date>2013-12-08</stop-date>
<use-recorded-signals>False</use-recorded-signals>
<display-progress>True</display-progress>
<execute-when-complete>python "C:\DeepThought\python\analyse_backtest_results.py" %CONFIG_LOCATION%</execute-when-complete>
</backtest>

Table 9.29: backtest options.

Option                  Description
start-date              The date the backtest is to start from, in the format
                        yyyy-mm-dd.
stop-date               The date the backtest is to finish, in the format
                        yyyy-mm-dd.
use-recorded-signals    During a backtest (and live/paper trading) forecasts
                        from the predictors are recorded to a CSV file. These
                        forecasts can be used in backtests if you are not
                        changing any predictor options. For example, if you are
                        only experimenting with take-profit and stop-loss then
                        the backtesting will be quicker by several orders of
                        magnitude.
display-progress        When backtesting using recorded signals, the backtest
                        is normally very quick, but can be slowed down if the
                        progress is printed to the console. Set
                        display-progress to False to turn off progress printing
                        and speed the backtest up even further. Nothing is lost
                        as everything is still logged to the log file.
execute-when-complete   A script to execute when the backtest has been
                        completed. You may have as many
                        <execute-when-complete> entries as you require. If the
                        macro %CONFIG_LOCATION% is present, it will be replaced
                        with the full path of the directory containing the
                        configuration file. This enables scripts to parse the
                        various output files. In this example we are using a
                        Python script, but it can be anything.

9.11  genetic-algo

The genetic algorithm functionality for parameter selection is described in detail in chapter 6.
A sample configuration snippet is given below and table 9.30 lists the available options.

<genetic-algo>
<ga-server>tcp://wraith</ga-server>
<ga-server-port>55566</ga-server-port>
<genome-id>-1</genome-id>
<timeout-minutes>360</timeout-minutes>
<population-size>20</population-size>
<objective-function>sharpe</objective-function>
<mutation-probability>10</mutation-probability>
<num-breeders-percent>30</num-breeders-percent>
<min-num-breeders>30</min-num-breeders>
<num-new-random-genomes>2</num-new-random-genomes>
<num-generations>10</num-generations>
<parameter id="stop-loss"
type="integer" low="10" high="200" step="5" />
<parameter id="take-profit" type="integer" low="10" high="200" step="5" />
<parameter id="time-of-day" type="categorical" values="h1,h4,single,none" />
</genetic-algo>


Table 9.30: genetic-algo options.

Option                   Description
ga-server                The name of the machine that the genetic algorithm is
                         being run from. The GA is designed to run across
                         multiple machines, so this tells the remote machines
                         where to send their results.
ga-server-port           The TCP port to use. This can be any number 1-65535;
                         however, it cannot conflict with any existing network
                         services.
genome-id                This is set by the genetic algorithm as a way of
                         identifying genomes being tested with backtest
                         results. Leave this option value as -1.
timeout-minutes          If this is set to a value greater than zero, a
                         generation will time out after this many minutes. This
                         caters for the case where most genomes in a population
                         have been tested and the system is waiting for 1 or 2
                         to complete. The GA server will cancel all running
                         jobs after this timeout and the (incomplete) results
                         are abandoned.
population-size          This is the number of genomes that make up a
                         generation. You will generally want to set this to be
                         around the number of cores you have available in your
                         cluster. The larger this number, the more genomes can
                         be tested simultaneously. Even if you only have access
                         to a single machine we recommend making this value at
                         least 20.
objective-function       The objective that we are optimising. This is one of
                         the following values:
                         pnl        Maximise profit in pips.
                         sharpe     Maximise the Sharpe ratio.
                         sortino    Maximise the Sortino ratio.
                         accuracy   Maximise the accuracy.
mutation-probability     When two genomes are being crossed, this parameter
                         controls the probability of a mutation in the child
                         genomes.
num-breeders-percent     As each generation of genomes is tested the results
                         for all genomes are kept and the top
                         num-breeders-percent are used to breed more genomes.
min-num-breeders         This is the minimum number of breeders required to
                         create a new generation. This setting will override
                         num-breeders-percent in the event that the number of
                         genomes produced by num-breeders-percent is lower than
                         the number defined by min-num-breeders. This can
                         happen in the first couple of generations if the
                         population size is small. Another way to think about
                         these parameters is that the number of breeders is the
                         higher of num-breeders-percent and min-num-breeders.
num-new-random-genomes   This is the number of new randomly created genomes
                         added at each generation. This is used to ensure that
                         new genetic material is added at each generation.
parameter                We can have as many parameter sections as we need.
                         Each parameter section defines an individual parameter
                         to be included in the optimisation. The options for
                         parameter are documented in detail in table 9.31.


Table 9.31: parameter configuration options for the genetic-algo configuration section.

Option        Description
integer       Used when the parameter can be modelled as an integer. The
              options available for the integer type are:
              low    The lowest value that the integer can take.
              high   The highest value that the integer can take.
              step   The value to increment/decrement for different values of
                     this parameter.
categorical   Used when the parameter can only take certain (string) values.
              The options available for the categorical type are:
              values   Comma separated list of values that this parameter can
                       take.
exp-2         Used when the parameter is best suited to an exponential grid
              search. For example, SVM penalty, SVM gamma and SVM epsilon are
              best searched using an exponential grid search. This means that
              rather than use values that are linearly spaced, such as
              5, 10, 15, 20, ..., we use values such as 2^1, 2^2, 2^3, 2^4, ....
              This results in final values of 2, 4, 8, 16, .... Note that
              negative exponents can be used and result in final values less
              than 1, e.g. 2^-5, 2^-4, 2^-3, 2^-2, ... become
              0.03125, 0.0625, 0.125, 0.25, .... The options available for the
              exp-2 type are:
              low    The lowest value that the exponent can take.
              high   The highest value that the exponent can take.
              step   The value to increment/decrement the exponent.
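
The exp-2 expansion can be reproduced with a couple of lines of Python (illustration only):

def exp2_grid(low, high, step):
    # low/high/step apply to the exponent; the final value is 2 ** exponent.
    return [2.0 ** e for e in range(low, high + 1, step)]

print(exp2_grid(-5, -2, 1))   # [0.03125, 0.0625, 0.125, 0.25]
print(exp2_grid(1, 4, 1))     # [2.0, 4.0, 8.0, 16.0]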


Chapter 10

Commandline Tools
The DeepThought commandline tool has more options than just backtesting. It can be used to
generate data for analysis in other tools such as Python, R and Excel. It can also perform
diagnostic functions on your data. If a command does not appear to have worked as expected,
check the DeepThought.log file for errors and/or warnings.
To see a list of options at any time, type

deepthought

at the command prompt, in the DeepThought installation directory. The output is similar to:

DeepThought built on Apr  9 2014 at 23:09:36
Usage:
--backtest <directory>
Backtest the configuration file in
<directory>. Various files are output
to the same location. The configuration
file must be named config.xml or
_config.xml.
--extract-training-set <directory>
Uses the config in <directory> to
generate training files for all <model>
sections in the config. Use
--num-samples to define the number of
samples, --num-skip X to take every Xth
sample, and --extract-to-date to only
extract samples before the supplied
date.
--extract-to-date YYYY-MM-DD
Used with --extract-training-set to
define the date to extract samples
before.
--generate-bars <database>
Generate a bar series from the given
database and save to csv with the given
filename given with the
--output-filename parameter. Requires
the --bar-type, --output-filename
parameters and --duration for
const-time, --price-movement for
const-price (Renko) candles.
--bar-type <string>
Used with --generate-bars. Type of bars
to generate. const-time | const-price-1
(Renko) | const-price-2 (Renko).
--output-filename <filename>
Used with --generate-bars. The name of
the CSV file to create. Created in the
same directory that DeepThought is run
from.
--price-movement <price delta>
Use with --generate-bars. When the
bar-type is specified as const-price
(Renko), this parameter defines the
magnitude of the price movement.
--generate-feature-stats <directory> Run through a backtest without
generating signals and generates
feature statistics.
--genetic-algo <config template>
Run a genetic algo using the supplied
file as the template. Requires a Condor
cluster.
--import-dukascopy-csv <csv filename> Import a CSV file downloaded from
Dukascopy using the tool from
http://www.strategyquant.com/tickdatadownloader/. Also need the --dbname
property.
--dbname <db name>
The full path of a database to load
Dukascopy data into.


--manual-trade-train-and-persist <directory>
Use the configuration in the supplied
directory to train the models for use
in generating a signal.
--manual-trade-generate-signal <directory>
Use the configuration in the supplied
directory to generate the latest
signal.
--num-samples <number of samples>
Used with --extract-training-set to
define the number of samples to
extract. Set to 0 to extract all
available samples.
--num-skip <number of samples to skip>
Used with --extract-training-set to
define the number of samples to skip
per extracted sample.
--print-config
Print example config.
--stats <database>
Print out some interesting statistics
for the data in the given candle
database. Requires the --duration
parameter and optional --delay,
--start-date and --data-file-dir
parameters.
--data-file-dir <directory>
Used with --stats and --check-db.
Optional directory where database files
are located. Default is C:\FX_Database.
--delay <Int>
Used with --stats. Indicates the delay
to use when creating candles. Optional.
--duration <minutes>
Used with --stats. The duration in
minutes of the candles in the datafile.
--multiplier <int>
Used with --stats. Optional price to
pips multiplier. Default is 10000.
--start-date <date>
Used with --stats. Optional start date
for when the stats are generated, in
the format YYYY-MM-DD.
--gbt-param-search-c <Filename>
Perform a parameter grid search for GBT
classification problem on the supplied
file in libSVM format.
--svm-param-search-c <Filename>
Perform a parameter grid search for an
SVM classification problem on the
supplied file in libSVM format.
--svm-param-search-r <Filename>
Perform a parameter grid search for an
SVM regression problem on the supplied
file in libSVM format.
--version
Print version info.

The various options are detailed in the following sections.

10.1 Candle Statistics (--stats)

This function prints some interesting statistics on candles generated from 1 minute candles. This
can be useful to get a feel for the average move of a bar at a certain time of day. The command is:
deepthought --stats EURUSD --data-file-dir C:\FX Database --duration 90
This example will generate statistics on the database EURUSD.db located in C:\FX Database
with a duration of 90 minutes. The output is similar to:
Hour  High-Low  High-Open  Open-Close  Open-Low  DownCandle:H-O  UpCandle:O-L
0     32.1042   14.2428    15.1128     17.8614   7.18116         8.09004
1     27.2416   13.9522    11.7623     13.2895   6.35232         7.90242
3     23.9615   11.6187    9.69082     12.3428   5.92409         6.88904
4     26.1151   12.7399    10.4736     13.3752   6.86057         7.9061
6     40.6504   19.5089    17.7179     21.1415   10.5305         12.0605
7     45.0207   21.5749    20.7954     23.4458   10.494          12.1731
9     42.4698   21.0607    19.0148     21.4091   10.5383         11.5586
10    38.3129   17.2918    16.6603     21.0211   8.9854          11.2163
12    57.4895   23.2958    22.0288     34.1937   12.2934         13.4976
13    54.2737   27.0638    25.4632     27.2098   13.3921         13.8574
15    47.0676   22.5515    21.0145     24.5161   11.274          12.0082
16    34.3027   17.4355    15.2099     16.8672   8.24733         9.64635
18    33.2836   15.9249    14.4872     17.3587   7.99955         9.24595
19    27.5136   13.5833    12.5106     13.9303   6.70084         7.73969
21    24.2859   11.1304    9.15172     13.1555   6.46652         7.9826
22    25.9693   12.0976    10.7947     13.8717   6.12017         8.07086

The meanings of each column are given in table 10.1, where a Down Candle is defined as the
close price being lower than the open price, and an Up Candle is defined as the close price
being higher than the open price.


Table 10.1: Column meanings using the --stats commandline option.

Column          Description
Hour            The hour of the day.
High-Low        Average price difference in pips between the high and low prices.
High-Open       Average price difference in pips between the high and open prices.
Open-Close      Average price difference in pips between the open and close prices.
Open-Low        Average price difference in pips between the open and low prices.
DownCandle:H-O  The average price difference, for Down Candles only, between the high and open
                prices. Useful for optimising the level to place limit orders.
UpCandle:O-L    The average price difference, for Up Candles only, between the open and low
                prices. Useful for optimising the level to place limit orders.
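As a hedged illustration of how these statistics might feed back into a configuration: if the average DownCandle:H-O value around the hours you trade is roughly 10 pips, a limit-order offset of a similar size could be tried in the trader section. The element and attribute names below are the ones shown in the configuration reference in section 10.8; the offset value itself is only an example, not a recommendation:

<trader>
    <!-- offset in pips, assuming the bar-series price-to-pip-multiplier is 10000 -->
    <limit-order offset="10.0">True</limit-order>
</trader>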

Note that the command above generates the statistics using all data in the database; it shows
nothing about how the statistics change over time. To generate statistics using only recent data,
use the --start-date option in the format YYYY-MM-DD. For example:
deepthought --stats EURUSD --data-file-dir C:\FX Database
--duration 90 --start-date 2013-01-01
produces the following output:
Hour  High-Low  High-Open  Open-Close  Open-Low  DownCandle:H-O  UpCandle:O-L
0     35.377    15.5626    16.6329     19.8144   7.80126         9.05009
1     29.8979   15.1253    12.7776     14.7726   6.95526         8.81505
3     26.3702   12.7431    10.4709     13.6271   6.59142         7.7593
4     28.7024   13.9491    11.3823     14.7533   7.50945         8.84543
6     44.1595   21.1497    19.0823     23.0098   11.4971         13.2004
7     48.3887   23.01      22.2336     25.3787   11.2253         13.3212
9     45.716    22.6288    20.4367     23.0872   11.4784         12.4819
10    41.5242   18.72      17.9344     22.8041   9.85978         12.2913
12    61.8966   24.3663    22.9623     37.5303   12.899          14.9961
13    57.4965   28.4277    26.5514     29.0688   14.2292         15.0963
15    50.4498   23.8847    22.2757     26.5651   12.1771         13.0914
16    36.5071   18.3648    16.1003     18.1423   8.78008         10.3921
18    35.4195   16.9111    15.2546     18.5084   8.61007         10.0501
19    30.0237   14.7538    13.6816     15.2699   7.25489         8.50051
21    26.8316   12.1506    10.029      14.681    7.13001         8.89532
22    28.6849   13.1876    11.8746     15.4972   6.72975         9.02975

10.2 Generate Bars (--generate-bars)

The --generate-bars function creates candles in a CSV file for analysis. Standard constant
time and Renko (see section 9.1.1) type 1 and type 2 bars can be generated. The following
command generates 45 minute candles and stores them in EURUSDm45.csv:
deepthought --generate-bars EURUSD --bar-type const-time --duration 45
--output-filename EURUSDm45
To generate Renko bars, use the following example:
deepthought --generate-bars EURUSD --bar-type const-price-1 --price-movement
0.002 --output-filename EURUSD Renko 20

This will generate type 1 Renko bars with a price difference of 0.002 (20 pips). Table 10.2
lists the required parameters for the --generate-bars function.

Table 10.2: --generate-bars parameters.


Parameter          Description
--generate-bars    The argument immediately following is the database. In the above example the
                   database file is eurusd.db in C:\FX Database.
--bar-type         The type of bar to generate. Must be one of:
                   const-time     Standard constant time candles. Must also provide the
                                  --duration parameter.
                   const-price-1  Renko type 1 bars. Must also provide the
                                  --price-movement parameter.
                   const-price-2  Renko type 2 bars. Must also provide the
                                  --price-movement parameter.
--duration         The duration, in minutes, of a constant time candle specified by a
                   --bar-type of const-time.
--price-movement   The price movement of a Renko bar specified when --bar-type is
                   const-price-1 or const-price-2.
--data-file-dir    Optional parameter specifying the location of the database. The default of
                   C:\FX Database is used if this parameter is not supplied.
--output-filename  The name of the CSV file to write the results. Will be created in the
                   same directory as DeepThought.

10.3 Generating a Manual Signal

This is covered in more detail in section 7.1. The commands given below are an example of generating a manual signal from the configuration in C:\DeepThought Configs\EURUSD Strategy 1:
deepthought --manual-trade-train-and-persist C:\DeepThought
Configs\EURUSD Strategy 1
Once the models have been trained, we can now generate the actual signal with:
deepthought --manual-trade-generate-signal C:\DeepThought
Configs\EURUSD Strategy 1
The output will be similar to below:

DeepThought built on Jan  7 2014 at 16:48:35
BUY
Consensus=25
NumberOfPredictors=45

10.4 Generating Feature Statistics (--generate-feature-stats)

During a backtest and live/paper trading, feature statistics are generated for each model in
the configuration. The --generate-feature-stats function generates these statistics without


performing any of the other backtest functions, thus it is a quick way to generate the feature
statistics.
Feature statistics are useful for checking data, and for ideas around what static stop loss and
take profit levels should be. The statistics are minimum, maximum, mean and standard
deviation. They are generated for each training set so are regenerated at the end of each
candle. Although the statistics can change over time, there should be no spikes or large changes.
The statistics are generated using the following command:
deepthought --generate-feature-stats C:\DeepThought Configs\EURUSD GA

10.5 Extracting a Training Set (--extract-training-set)

A training set can be extracted using the following command:


deepthought --extract-training-set ExampleConfigs\EURUSD MA
This will extract a training set for all models defined in the configuration in the directory
ExampleConfigs\EURUSD MA. It will create files in libSVM format and CSV. The CSV file is
useful for examination in Excel or any other package. It also gives you more clarity on what
features are being generated and what the data actually looks like. Note that the extracted
data has had scaling applied, so if you want to see the raw data, set the scale-type of each
feature to none.
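For example, a minimal sketch of a feature definition with scaling disabled (the element names follow the configuration reference in section 10.8; the bar-series identifier is just an example):

<feature>
    <type>bar-diff</type>
    <diff-type>close</diff-type>
    <bar-series>EURUSDh4</bar-series>
    <number>30</number>
    <scale-type>none</scale-type> <!-- no scaling, so raw price differences appear in the extracted CSV -->
</feature>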
We can also optionally add --num-samples to set the number of training examples in the
training set extracted and --num-skip to define how many candles the window moves between
each sample extraction. For example to extract a training set of 1500 samples, sampled at every
second bar, use the following command:
deepthought --extract-training-set ExampleConfigs\EURUSD MA
--num-samples 1500 --num-skip 2

10.6 SVM Grid Search (--svm-param-search-c)

A grid search is a brute-force approach to finding parameters. Support vector machines
require hyper-parameters to be set. These are the Penalty and Gaussian width for classification,
and Penalty, Gaussian width and Epsilon for regression. The steps for a grid search are:
deepthought --extract-training-set ExampleConfigs\EURUSD MA
deepthought --svm-param-search-c h4-features.training.data
The first line extracts training sets for the models defined in config.xml in the directory ExampleConfigs\EURUSD MA. In this example there is only one model and the name of
the file produced is h4-features.training.data in libSVM format. The second command
performs the parameter search.


The results are in the log file. The log file also contains configuration text for each parameter
set. Below is a sample of what is written to the log file for a regression model. This can be cut
and pasted into your configuration file. You can either take all parameter sets that had a
cross-validated accuracy of greater than 50%, the top n parameter sets, or just use all parameter sets.
If some combinations of parameters produce an accuracy of 0%, this means that the problem
could not converge and the parameter combination should be discarded.

<params> <!-- 55.4% -->


<penalty>2</penalty>
<gamma>0.25</gamma>
<epsilon>0.00390625</epsilon>
<forecast-weight>1.0</forecast-weight>
<svm-type>SVR</svm-type>
<kernel>rbf</kernel>
</params>
<params> <!-- 55.1% -->
<penalty>8192</penalty>
<gamma>0.00390625</gamma>
<epsilon>0.00390625</epsilon>
<forecast-weight>1.0</forecast-weight>
<svm-type>SVR</svm-type>
<kernel>rbf</kernel>
</params>
<params> <!-- 55% -->
<penalty>8</penalty>
<gamma>0.0625</gamma>
<epsilon>0.015625</epsilon>
<forecast-weight>1.0</forecast-weight>
<svm-type>SVR</svm-type>
<kernel>rbf</kernel>
</params>
...

For regression use the following commands:


deepthought --extract-training-set ExampleConfigs\EURUSD MA
deepthought --svm-param-search-r h4-features.training.data
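As a hedged sketch of how the pasted output fits together, the chosen <params> sections sit inside an <svm-predictor> section. The element names below are the ones used in the configuration reference in section 10.8; the identifier, model name and training-set sizes are placeholders:

<svm-predictor>
    <identifier>svm-1</identifier>
    <model>h4-features</model>
    <params> <!-- 55.4% -->
        <penalty>2</penalty>
        <gamma>0.25</gamma>
        <epsilon>0.00390625</epsilon>
        <forecast-weight>1.0</forecast-weight>
        <svm-type>SVR</svm-type>
        <kernel>rbf</kernel>
    </params>
    <!-- further parameter sets pasted from the log file go here -->
    <num-training-observations>1000</num-training-observations>
    <num-training-skip>1</num-training-skip>
</svm-predictor>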

10.7 GBT Grid Search (--gbt-param-search-c)

This is similar to the support vector machine parameter grid search above. In this case we are
optimising for the number of trees (num-trees) and tree depth (depth). The steps for a GBT
grid search are:
deepthought --extract-training-set ExampleConfigs\EURUSD Single GBT
deepthought --gbt-param-search-c h4-features.training.data
The first line extracts training sets for the models defined in config.xml in the directory ExampleConfigs\EURUSD Single GBT. In this example there is only one model and the
name of the file produced is h4-features.training.data in libSVM format. The second
command performs the parameter search. The results are in the log file. The log file also
contains configuration text for each parameter set. Below is a sample of what is written to the
log file for a regression model. This can be cut and pasted into your configuration file. You can
either take all parameter sets that had a cross validated accuracy of greater than 50%, the top
n parameter sets or just use all parameter sets. If some combinations of parameters produce an
accuracy of 0%, this means that the problem couldnt converge and the parameter combination


should be discarded.
<params> <!-- 56% -->
<num-trees>850</num-trees>
<depth>4</depth>
</params>
<params> <!-- 55.5% -->
<num-trees>450</num-trees>
<depth>6</depth>
</params>
<params> <!-- 55.3% -->
<num-trees>250</num-trees>
<depth>12</depth>
</params>
<params> <!-- 55.2% -->
<num-trees>1200</num-trees>
<depth>5</depth>
</params>
...
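In the same way as for the SVM case above, the selected parameter sets can be pasted into a <gbt-predictor> section to form a mini ensemble. A minimal sketch, using the element names from the configuration reference in section 10.8 (identifier and model names are placeholders):

<gbt-predictor>
    <identifier>gbt-1</identifier>
    <model>h4-features</model>
    <predictor-weight>1.0</predictor-weight>
    <params> <!-- 56% -->
        <num-trees>850</num-trees>
        <depth>4</depth>
    </params>
    <params> <!-- 55.5% -->
        <num-trees>450</num-trees>
        <depth>6</depth>
    </params>
    <num-training-observations>1000</num-training-observations>
    <num-training-skip>1</num-training-skip>
</gbt-predictor>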


10.8 Printing XML Configuration Documentation (--print-config)

A detailed configuration will be printed to the console with the following command:

deepthought --print-config

This can be saved to a file using output redirection:

deepthought --print-config > config-details.xml

where the configuration will be saved in the file config-details.xml. The output is
shown on the following pages and is useful as a reference guide.


<config>
<!-- Values are case insensitive. -->
<bar-series-collection>
<data-file-dir>C:\FX_Database</data-file-dir> <!-- Specifies where historical database files are kept. -->
</bar-series-collection>
<!-- We define as many bar-series as we need. Normally we'd define one which takes 1 min candles, then others -->
<!-- that feed from this bar-series to generate longer time frames, renko bars, etc. -->
<bar-series>
<identifier>EURUSDm15</identifier> <!-- Identifier for this BarSeries. Used by other sections
to reference this instance. -->
<bar-series-type>const-time</bar-series-type> <!-- const-time* | const-price-method-1 | const-price-method-2
const-time is standard candle, others are Renko. -->
<save-to-history-source>True</save-to-history-source> <!-- If save is called, this flag indicates whether
to save this BarSeries's data or not. -->
<history-source-type>bar-series</history-source-type> <!-- bar-series | sqlite. Where to get history from. -->
<history-source-in>EUR.USD</history-source-in> <!-- Detail on where to find history. If the data is from
another bar-series, this is the identifier of the other series.
If the data is sourced from an sqlite database,
this is the filename of the sqlite database. -->
<price-to-pip-multiplier>1</price-to-pip-multiplier> <!-- Generally used to make pips human readable,
e.g. 5 pips instead of 0.0005. -->
<average-spread>0.00015</average-spread> <!-- Used by the backtester when calculating trade profit. -->
<bar-duration-minutes>15</bar-duration-minutes> <!-- If the BarSeries-type is const-time, this is the duration in
minutes of the bar. -->
<const-bar-price>0.0</const-bar-price> <!-- If the BarSeries-type generates Renko bars, this is the
price movement for a new bar to be generated. -->
<delay-minutes-offset>5</delay-minutes-offset> <!-- Optional. Used to delay the start of const-time candles.
For example if 5 is given with 1 hour candles then each
bar will start and finish as 5 mins past the hour. Useful to
allow the market to absorb news before forecasting. -->
</bar-series>
<!-- we can have as many feature definitions as we need. -->
<model>
<identifier>test-features</identifier> <!-- An identifier for this model. -->
<!-- Each model has one target and an unlimited number of features. --> <!-- Target x bars in the future -->
<target-definition>
<type>bars-in-future</type>
<identifier>target-1-bar</identifier> <!-- Identifier used by other sections (e.g. SVM) that use this target. -->
<bar-series>EURUSDrenko</bar-series> <!-- bar-series that the forecasts are for. -->
<number>1</number> <!-- The number of bars in the future to forecast. -->
<target-value-multiplier>10000.0</target-value-multiplier> <!-- multiply the final target by this value. Can make
a difference for some regression predictors. -->
<price-type>close</price-type> <!-- high | low | close | up-down where up-down is used for classification. -->
</target-definition>
<!-- Day of Week -->
<feature>
<type>day-of-week</type>



<representation>single</representation> <!-- single | binary*
                                             single: means a single number (0-6)
                                             binary: returns a one-hot vector. -->
</feature>
<!-- Hour of day -->
<feature>
<type>hour-of-day</type>
<period>H1</period> <!-- H1 | H4* | single. H1 and H4 return one-hot binary vectors. single return hour of day
as a single number. -->
</feature>
<!-- Moving Average -->
<feature>
<type>moving-average</type>
<ma-attribute-type>close</ma-attribute-type> <!-- The price type to compute the MA on. One of:
volume | num-trades | vwap | bar-duration |
mins-between-high-low | time-high | time-low |
open | high | low | close | average-close |
average-hlc -->
<value-type>diff</value-type> <!-- Takes one of the following values:
value : use the MA value as is.
diff : use the differences between consecutive MA values.
diff-with-price : use the difference between the
MA and the close price. -->
<period>21</period> <!-- The moving average period. -->
<selection-list>1,2,3,4,5,7,9,13,16,20,25</selection-list> <!-- The indexes which the MA
differences are calculated. -->
<number>5</number> <!-- If selection-list is not given, specifies the number of past values to use
as features. -->
<bar-series>EURUSDH4</bar-series> <!-- The bar-series to calculate the moving average on. -->
</feature>
<!-- Bar difference -->
<feature>
<type>bar-diff</type>
<diff-type>close</diff-type> <!-- high | low | close* | up-down (used for Renko, will output +1 or -1
depending on if the bar is up or down) | high-open | close-low |
high-close | open-to-close | prev-close-to-open -->
<bar-series>EURUSDr20</bar-series> <!-- The bar-series that the differences are extracted from. -->
<scale-type>min-max</scale-type> <!-- zscore | none* | min-max | div-sd | div-max | log10 -->
<min-max-clamp>0.015</min-max-clamp> <!-- Optional price clamp - any values greater (or less than the -ve value)
are truncated. -->
<number>30</number> <!-- The number of historical bars to compute the difference between. If set to 0 the
selection-list will be used. -->
<selection-list>1,2,3,4,5,7,9,13,16,20,25</selection-list> <!-- If specified, this is the index of the bars
to compute the difference between. -->
</feature>
<!-- Bar/Candle Attribute -->
<feature>


<type>bar-attribute</type>
<attribute-type>volume</attribute-type> <!-- volume | num-trades | vwap | duration-minutes | mins-between-high-low
minute-low : the number of minutes from the candle open to the low
minute-high : the number of minutes from the candle open to the high
average-close : average of the 1 minute close prices that the candle
is generated from.
average-hlc : average high/low/close. -->
<value-type>value</value-type> <!-- value* (actual unmodified value) | diff (difference between bars) | num-bars
(the smallest number of bars that the supplied volume parameter will fit into). -->
<const-volume>1000</const-volume> <!-- Used when the volume-type is defined as num-bars. -->
<bar-series>emini-5min</bar-series> <!-- The bar-series that the differences are extracted from. -->
<number>30</number> <!-- The number of historical bars used to compute the attributes. -->
<scale-type>min-max</scale-type> <!-- zscore | none* | min-max | div-sd | div-max | log10 -->
</feature>
<!-- CSV Feature -->
<feature>
<type>csv-feature</type>
<filename>C:\IBFX-MT4-AU\experts\files\EURUSD_CCI.csv</filename>
<identifier>cci_feature</identifier>
<value-type>diff</value-type> <!-- Defines how to process the CSV values. Must be one of:
diff - the difference between values whose indexes are specified
by the selection-list. For example if the list is
specified as 1,2,5 then the differences used in the
model will be the difference between the value at
0 and 1, 1 and 2, 2 and 5.
value - use the value directly. The first element in the
selection-list should be 0. -->
<selection-list>1,2,3,5,8,13,21,34</selection-list> <!-- The indexes of the values in the CSV
file to compute the difference between. -->
</feature>
<!-- Fundamental Indicator -->
<feature>
<type>fundamental-indicator</type>
<bar-series>EURUSDh4</bar-series> <!-- Used to calculate the number of candles before and after an event. -->
<title>Change in Non-farm Payrolls</title> <!-- One of the following values:
Change in Non-farm Payrolls
Advance Retail Sales
Consumer Confidence
Consumer Price Index (yoy)
consumer price index ex food & energy (yoy)
Durable Goods Orders
Federal Open Market Committee Rate Decision
Gross Domestic Product (Annualized)
Gross Domestic Product Price Index
ISM Manufacturing
Personal Consumption
U. of Michigan Confidence

Unemployment Rate
Initial Jobless Claims -->

<currency>USD</currency> <!-- USD | EUR -->


<scale-type>zscore</scale-type> <!-- zscore | none* | min-max | div-sd | div-max | log10 -->
</feature>
</model>
<!-- Predictor Ensemble -->
<predictor-ensemble>
<use-recorded-signals>false</use-recorded-signals> <!-- true | false. If signals have not changed but you want to
test new stoploss/take profit, etc, set to true -->
<retrain-period>Weekly</retrain-period> <!-- Defines when the retraining is done. Takes one of:
each-bar : retrain after each bar.
daily
: retrain each day at 00:00.
weekly
: retrain after the first bar on Monday.
monthly : retrain after the first bar on the first Monday
of the month. -->
</predictor-ensemble>
<!-- Extremely Randomised Trees predictor. -->
<extremely-randomised-trees-predictor>
<model>15min-features</model> <!-- The model where we get training data. -->
<identifier>random-forest-1</identifier> <!-- The name of this predictor, printed in log files. -->
<predictor-weight>1.0</predictor-weight> <!-- The weighting to use for this predictor by the
signal generator when assessing all predictors. -->
<params> <!-- Parameters for a mini ensemble. Need at least one set, can have as many as we want. -->
<num-trees>1000</num-trees> <!-- Number of trees. -->
<depth>8</depth> <!-- Depth of an individual tree. -->
</params>
<num-training-observations>1000</num-training-observations> <!-- The number of training instances used to
create a training set. -->
<num-training-skip>1</num-training-skip> <!-- When generating the training set, skip this many samples
between collecting samples. -->
</extremely-randomised-trees-predictor>
<!-- Gradient boosted tree predictor. -->
<gbt-predictor>
<model>15min-features</model> <!-- The model where we get training data. -->
<identifier>gbt-1</identifier> <!-- The name of this predictor, printed in log files. -->
<predictor-weight>1.0</predictor-weight> <!-- The weighting to use for this predictor by the
signal generator when assessing all predictors. -->
<params> <!-- Parameters for a mini ensemble. Need at least one set, can have as many as we want. -->
<num-trees>500</num-trees> <!-- Number of trees. -->
<depth>8</depth> <!-- Depth of an individual tree. -->
</params>
<num-training-observations>1000</num-training-observations> <!-- The number of training vectors used to
create a training set. -->
<num-training-skip>1</num-training-skip> <!-- When generating the training set, skip this many samples
between collecting samples. -->
</gbt-predictor>


<!-- Linear SVM predictor. -->


<linear-svm-predictor>
<model>15min-features</model> <!-- The model where we get training data. -->
<identifier>linear-svm-1</identifier> <!-- The name of this SVM, printed in log files. -->
<predictor-weight>1.0</predictor-weight> <!-- The weighting to use for this predictor by the
signal generator when assessing all predictors. -->
<params> <!-- Definition of a linear SVM instance. We can have as many of these as we want. -->
<penalty>1.0</penalty> <!-- SVM parameter C. -->
<epsilon>0.5</epsilon> <!-- SVM parameter for e-insensitivity. Only required for regression. -->
<solver-type>L2R_L2LOSS_SVC_DUAL</solver-type> <!-- L2R_LR | L2R_L2LOSS_SVC_DUAL | L2R_L2LOSS_SVC |
L2R_L1LOSS_SVC_DUAL | MCSVM_CS | L1R_L2LOSS_SVC |
L1R_LR | L2R_LR_DUAL | L2R_L2LOSS_SVR |
L2R_L2LOSS_SVR_DUAL|L2R_L1LOSS_SVR_DUAL -->
</params>
<num-training-observations>1000</num-training-observations> <!-- The number of training vectors used to
create a training set. -->
<num-training-skip>1</num-training-skip> <!-- When generating the training set, skip this many samples
between collecting samples. -->
</linear-svm-predictor>
<!-- Multi-Layer Perceptron predictor. -->
<multi-layer-perceptron-predictor>
<model>15min-features</model> <!-- The model where we get training data. -->
<identifier>random-forest-1</identifier> <!-- The name of this predictor, printed in log files. -->
<predictor-weight>1.0</predictor-weight> <!-- The weighting to use for this predictor by the
signal generator when assessing all predictors. -->
<params> <!-- Parameters for a mini ensemble. Need at least one set, can have as many as we want. -->
<training-algo>rprop</training-algo>
<hidden-layers>20</hidden-layers> <!-- Comma separated list of the number of neurons in hidden
layers. Example: a single hidden layer with 20 neurons use "20"
Two hidden layers with 40 and 20 neurons, use "40,20". -->
<activation-function>sigmoid</activation-function> <!-- identity|sigmoid|gaussian -->
<max-iterations>1000</max-iterations> <!-- Stop training after this many iterations. -->
<termination-epsilon>0.01</termination-epsilon> <!-- Stop training when the change in error drops below
this value. Use 0.0 to disable. -->
<forecast-type>classification</forecast-type> <!-- classification|regression. Use classification to forecast
up/down, and regression to forecast the actual
price move. -->
<classification-output>binary</classification-output> <!-- binary|value. If the forecast type is classification
the predictor can output either a binary (1,-1)
or a value relating to the confidence,
eg, 0.54 for up, -0.28 for down. -->
</params>
<num-training-observations>1000</num-training-observations> <!-- The number of training vectors used to
create a training set. -->
<num-training-skip>1</num-training-skip> <!-- When generating the training set, skip this many samples
between collecting samples. -->
</multi-layer-perceptron-predictor>


<!-- Random forest predictor. -->


<random-forest-predictor>
<model>15min-features</model> <!-- The model where we get training data. -->
<identifier>random-forest-1</identifier> <!-- The name of this predictor, printed in log files. -->
<predictor-weight>1.0</predictor-weight> <!-- The weighting to use for this predictor by the
signal generator when assessing all predictors. -->
<params> <!-- Parameters for a mini ensemble. Need at least one set, can have as many as we want. -->
<num-trees>1000</num-trees> <!-- Number of trees. -->
<depth>8</depth> <!-- Depth of an individual tree. -->
</params>
<num-training-observations>1000</num-training-observations> <!-- The number of training vectors used to
create a training set. -->
<num-training-skip>1</num-training-skip> <!-- When generating the training set, skip this many samples
between collecting samples. -->
</random-forest-predictor>
<!-- SVM Predictor -->
<svm-predictor>
<model>15min-features</model> <!-- The model where we get training data. -->
<continuous-tune>False</continuous-tune> <!-- True|False. Set to True to perform a parameter search after
each forecast. Note that params sections are ignored if this
is set to True as it will generate params sections using the
best parameters. -->
<continuous-tune-num-param-sets>5</continuous-tune-num-param-sets> <!-- The number of the top parameter sets to use
after a parameter search if continuous-tune
has been set to True. -->
<model-min-accuracy>54.0</model-min-accuracy> <!-- When continuous tune is true, only use parameter sets
that produce a model with accuracy of this value
or higher. If a new model cannot be found, the existing
one is retained. -->
<no-model-behaviour>dont-trade</no-model-behaviour> <!-- Defines what to do when a model with cross-validated accuracy
set by <model-min-accuracy> cannot be found.
Takes one of the following values:
dont-trade - Close all orders and do not forecast
use-last-model - Use the last best model(s)
use-default-params - Use the parameters defined in the
<params> section.-->
<predictor-weight>1.0</predictor-weight> <!-- The weighting to use for this predictor by the
signal generator when assessing all predictors. -->
<identifier>svm-1</identifier> <!-- The name of this SVM, printed in log files. -->
<params> <!-- Definition of an SVM instance. We can have as many of these as we want. -->
<penalty>32.0</penalty> <!-- SVM parameter C. -->
<gamma>0.03125</gamma>
<!-- SVM parameter for Gaussian width. -->
<epsilon>0.5</epsilon>
<!-- SVM parameter for e-insensitivity. Only required for SV regression. -->
<coeff>0.0</coeff>
<!-- SVM parameter used for the polynomial and sigmoid kernels. -->
<svm-type>SVR</svm-type> <!-- SVM type, SVC for classification, SVR for regression. -->
<kernel>rbf</kernel> <!-- SVM kernel. linear | polynomial | rbf | sigmoid -->
</params>

<num-training-observations>1000</num-training-observations> <!-- The number of training vectors used to
create a training set. -->
<num-training-skip>1</num-training-skip> <!-- When generating the training set, skip this many samples
between collecting samples. -->

</svm-predictor>
<signal-generator>
<entry-times> <!-- Defines the days and hours-of-day that orders can be sent. -->
<hour>all</hour> <!-- Comma delimited list of hours. use all for all hours. -->
<day-of-week>all</day-of-week> <!-- Comma delimited list of days where days are
represented by a three character abbreviation,
e.g tue,wed,thu. Use all for all days. -->
</entry-times>
<combine-forecasts-func-name>CombineForecasts</combine-forecasts-func-name> <!-- Optional Python function to
combine the individual forecasts
to generate buy/sell/exit signals. -->
<set-parameter-value-func-name>SetParameterValue</set-parameter-value-func-name> <!-- Optional Python function to
set parameters used by the
CombineAlphas function. -->
<entry-threshold>0</entry-threshold> <!-- The predictor ensemble must return an absolute value
of at least this to trade. -->
<take-profit>60.0</take-profit> <!-- Static take profit in pips. -->
<stop-loss>40.0</stop-loss> <!-- Static stop loss in pips. -->
<break-even>20.0</break-even> <!-- Move to break even + 1 pip when a trade is
this many pips in profit. -->
<exit-all-hour>-1</exit-all-hour> <!-- Set to an hour to exit all positions, or -1 to
leave positions open indefinitely. -->
<trade-bar-series>EURUSDrenko</trade-bar-series> <!-- bar-series to place orders. -->
<reverse-signals>False</reverse-signals> <!-- Buy when we get a sell signal and Sell when
we get a buy signal. -->
</signal-generator>
<trader>
<add-to-existing>true</add-to-existing> <!-- Use pyramiding. -->
<max-position>10</max-position> <!-- Max number of positions. -->
<scale-out>false</scale-out> <!-- When we get a reverse signal, reduce the position if True,
otherwise close and reverse. -->
<limit-orders>True</limit-orders> <!-- True|False. Set to True to use limit orders, False for market orders. -->
<limit-order offset="2.0">True</limit-order> <!-- Place limit orders in the backtester rather than market orders.
This example uses an offset of two pips, with the
bar-series.price-to-pip-multiplier set to 10000. If the
bar-series.price-to-pip-multiplier was set to 1, the offset
would be 0.0002. -->
<hold-minutes>0</hold-minutes> <!-- Close any position open for this many minutes. -->
<max-drawdown>10000</max-drawdown> <!-- Stop backtest if drawdown exceeds this amount (in pips). -->
</trader>
<genetic-algo>
<ga-server>tcp://wraith</ga-server> <!-- Server where the GA is run. -->


<ga-server-port>55566</ga-server-port> <!-- TCP port on the machine where the GA is run. -->
<genome-id>-1</genome-id> <!-- Used internally. ID for the genome under test. -->
<timeout-minutes>360</timeout-minutes> <!-- Stop all backtests after this many minutes and
start the next generation. -->
<population-size>20</population-size> <!-- Number of genomes in the population. -->
<objective-function></objective-function> <!-- The objective that we are optimising. One of:
pnl | sharpe | accuracy. -->
<mutation-probability>10</mutation-probability> <!-- Probability of a mutation in a genome. -->
<num-breeders-percent>30</num-breeders-percent> <!-- Top percentage of genomes in a population
to use as breeders for the next generation. -->
<min-num-breeders>30</min-num-breeders> <!-- The minimum number of genomes to use as breeders. -->
<num-new-random-genomes>2</num-new-random-genomes> <!-- Number of random genomes to create
for each generation. -->
<num-generations>15</num-generations> <!-- Stop after this many generations. -->
<!-- The entries below are examples on how to define parameters for optimisation. -->
<parameter id="stop-loss" type="integer" low="10" high="200" step="5" />
<parameter id="time-of-day" type="categorical" values="h1,h4,single,none" />
<parameter id="SVC-penalty" type="exp-2" low="1" high="15" step="1" />
</genetic-algo>
<backtest>
<start-date>2013-01-01</start-date>
<stop-date>2013-12-08</stop-date>
<use-recorded-signals>False</use-recorded-signals>
<display-progress>False</display-progress> <!-- display the trade progress as the backtest runs. -->
</backtest>
</config>


Chapter 11

Fundamental Indicators
(Experimental)
Currently under development is support for fundamental indicators. These are news items that
influence exchange rates such as non-farm payroll, GDP, etc. This functionality is still under development, but you are free to experiment. It will be moved into mainstream functionality only
after thorough testing and tidying of loose ends (such as automated updating of values).
A Python script has been provided which should create and populate a fundamental database.
The script is run using the following command:
python create_fundamentals_db.py --dir <where to save db>
This will create a database named fundamentals.db in the directory specified by the parameter
--dir. Data is downloaded from dailyfx.com.

11.1 Fundamental Feature

A sample configuration snippet to define a fundamental indicator is given below.


<feature>
<type>fundamental-indicator</type>
<bar-series>EURUSDh4</bar-series>
<title>Advance Retail Sales</title>
<currency>USD</currency>
<scale-type>zscore</scale-type>
</feature>

The title element defines what the indicator is. The currently available indicators are listed
in table 11.1. We must also include a fundamentals-db section that defines the location of the
fundamentals database. Below is an example configuration snippet:
<fundamentals-db>
<data-file-dir>c:\fx_database</data-file-dir>
<db-file>fundamentals.db</db-file>
</fundamentals-db>


Table 11.1: title values for the fundamental-indicator feature.

Fundamental Indicator title                    Currency
Advance Retail Sales                           USD
Consumer Confidence                            USD
Consumer Price Index (yoy)                     USD
Consumer price index ex food & energy (yoy)    USD
Durable Goods Orders                           USD
Federal Open Market Committee Rate Decision    USD
Gross Domestic Product (Annualized)            USD
Gross Domestic Product Price Index             USD
ISM Manufacturing                              USD
Personal Consumption                           USD
U. of Michigan Confidence                      USD
Unemployment Rate                              USD
Initial Jobless Claims                         USD

The attributes generated for each fundamental-indicator feature are:


1. Most recent release consensus forecast value.
2. Most recent release actual value.
3. Most recent release previous value.
4. Next release consensus forecast value (if the release is in the next 7 days).
5. Binary missing value indicator if no value for the next release consensus forecast value.
6. Binary missing value indicator if no value for the previous release.
7. Number of bars since the previous release.
8. Number of bars to the next release.

Chapter 12

Tutorial: Preparing the Commandline

This quick tutorial shows how to customise the Windows commandline window. We prefer a
larger default window size and a better-looking font. Follow these steps to set up your own
nicer-looking commandline window.

12.1 Step 1: Open the commandline

Assuming Windows 7, from the start menu, select All Programs → DeepThought →
DeepThought CommandLine.

Figure 12.1: Opening the DeepThought Commandline

12.2 Step 2: Open the defaults window

When the Windows commandline window appears, click on the top-right icon to drop down a
menu. Select Defaults from this menu.

Figure 12.2: Opening the Defaults Window

12.3 Step 3: Change the font

In the defaults window, first change the font to something nicer. We prefer Consolas. Click
the Font tab to access the font settings.

Figure 12.3: Changing the Font

12.4 Step 4: Change the default window size

You can also change the default window size by clicking the Layout tab.

Figure 12.4: Changing the Window Size/Layout


Now click Ok to save your changes. You will need to exit and re-open the commandline window
to see the changes.

Chapter 13

Tutorial: Backtesting in
DeepThought and MT4
This tutorial details the steps in running a backtest in DeepThought and optimising EA parameters using Metatrader's strategy tester.

13.1 Step 1: Edit the configuration

The first step is to use your favourite text editor to edit the XML configuration. We use the
freeware application Notepad++ available from http://notepad-plus-plus.org/. In this
tutorial we are using the sample configuration in ExampleConfigs\EURUSD Single GBT (named
config.xml). You can copy this directory as a starting point for your own experiments. This
particular configuration contains a single gradient boosted tree.


Figure 13.1: Editing the Configuration

13.2 Step 2: Start the DeepThought backtest

Open the DeepThought commandline from the start menu: select All Programs → DeepThought →
DeepThought CommandLine. Use the --backtest option as shown in figure 13.2.

Figure 13.2: Starting the Backtest
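For example, using the tutorial configuration directory above, the command is of the form (adjust the path if you copied the configuration elsewhere):

deepthought --backtest ExampleConfigs\EURUSD Single GBT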


After the backtest has completed the commandline window will look similar to figure 13.3


Figure 13.3: The Completed Backtest

13.3 Step 3: Copy files to Metatrader

The Metatrader EA for testing signals produced by DeepThought is located in C:\DeepThought\Metatrader EA. Copy this to the Metatrader experts folder. In this example we are using the broker InterbankFX, installed into a custom location C:\IBFX-MT4-AU.
Copy DeepThought Signal Tester.mq4 from C:\DeepThought\Metatrader EA to C:\IBFX-MT4-AU\experts.
Next we copy the recorded signals where Metatrader can read them. It is a limitation of
Metatrader that an EA can only read/write from a single directory. This directory changes
depending on whether the EA is being run on a chart or in the strategy tester. For our purpose
here we will be running the EA in the strategy tester, so we must copy ExampleConfigs\EURUSD Single GBT\recorded.signals.csv to C:\IBFX-MT4-AU\tester\files. Note that we
do not need to copy the DLL as the EA is only reading from recorded.signals.csv and not
generating any new signals.
If Metatrader is running, exit and restart it to load the EA.

13.4 Step 4: Modify the EA

At this stage we could make a copy of DeepThought Signal Tester.mq4 and include any custom
logic we wish, combining the DeepThought signals with any existing system. In this
example we'll use the EA as is and use Metatrader's strategy tester to optimise the take-profit
and stop-loss settings.

13.5 Step 5: Running Metatrader Strategy Tester

If the Strategy Tester is not visible in Metatrader, open it by selecting View → Strategy
Tester. Select the DeepThought signal tester EA and currency. Set the timeframe to M1 for


the most accurate results, as shown in figure 13.4. The EA name and currency may be different
if you have renamed or modified the EA or are testing a different currency pair. It is best just
to run the EA first to see that everything works. Check that the strategy tester Report tab shows
a Total Trades figure roughly the same as the number of signals in the recorded.signals.csv
file. In this example we made roughly 34% profit over the year 2013 with a 14% drawdown,
with starting capital of $10,000 and a lot size of 1.

Figure 13.4: Metatrader tester setup

Figure 13.5: The Completed Backtest

13.6 Step 6: Optimisation with MT Strategy Tester

The final step is to see if we can improve on the raw results above. In your EA you may have
other parameters related to trading logic you've added. In this example we'll only optimise the
take-profit and stop-loss settings.
Click Expert Properties and tick the Genetic Algorithm check box in the Testing tab as
shown in figure 13.6.


Figure 13.6: Enabling the Genetic Optimisation in Metatrader


Click on the Inputs tab and set the stop-loss and take-profit entries as shown in figure
13.7. Here we are optimising both stop-loss and take-profit to take values between 10 and
120 pips, with increments of 10 pips. Note that the EA takes these values as changes in actual
price rather than pips, so 120 pips is entered as 0.0120.

Figure 13.7: Selecting which Parameters to Optimise


Click Ok to save and close the window. Make sure the Optimisation option in the strategy
tester window is checked as shown in figure 13.8, then click Start to begin the optimisation.
This may take a while depending on what options have been set. You can watch the progress
by clicking on the Optimisation Results tab.


Figure 13.8: Enabling Optimisation in the Strategy Tester

13.7 Step 7: Analyse the Results

After the Metatrader optimiser has finished we should have a list of results similar to the results
shown below.

Figure 13.9: List of the Best Results of the Metatrader Optimiser


Here we see that a take-profit of 0.0120 and a stop-loss of 0.0110 (representing 120 and
110 pips respectively) produce the best results in terms of profit. Double click on this line and
the values will be automatically loaded into the tester. Click Start to start the test with these
values.
Below we can see that we have moderately increased the profit from 34% to 37% with around
the same drawdown.

Figure 13.10: Report of the Optimum Settings


Figure 13.11: Graph of a Test With Optimum Settings


Appendix A

Sample Configuration
The configuration below uses an ensemble of 10 SVM predictors. Many more are possible, but
only 10 are shown for brevity. These were selected after performing an SVM parameter grid
search. The features are generated from EURUSD on an H4 timeframe. The forecast is for
the close of the next H4 candle. Historical data is stored in the database eurusd.db located
in C:\FX Database. The features used are hour-of-day, the previous 80 average-close price
differences, and 16 differences of moving average with periods 5, 10, 20, 50, 100. The differences
are spaced out over 100 candles as defined by the selection-list. The scaling is min-max
for all features. A genetic-algo section is present and will be ignored during backtest, live and
paper trading. You can add XML comments, and extra XML sections are ignored.

<!-- Multiple SVM example. -->


<config>
<bar-series>
<identifier>EURUSDm1</identifier>
<bar-series-type>const-time</bar-series-type>
<source type="database">eurusd.db</source>
<price-to-pip-multiplier>10000.0</price-to-pip-multiplier>
<average-spread>0.0</average-spread>
<bar-duration-minutes>1</bar-duration-minutes>
<const-bar-price>0.0</const-bar-price>
</bar-series>
<bar-series>
<identifier>EURUSDh4</identifier>
<bar-series-type>const-time</bar-series-type>
<source type="bar-series">EURUSDm1</source>
<history-source-in>EURUSDm1</history-source-in>
<history-source-type>bar-series</history-source-type>
<price-to-pip-multiplier>10000.0</price-to-pip-multiplier>
<average-spread>0.0</average-spread>
<bar-duration-minutes>240</bar-duration-minutes>
<delay-minutes-offset>0</delay-minutes-offset>
</bar-series>
<bar-series-collection>
<data-file-dir>C:\FX_Database</data-file-dir>
</bar-series-collection>
<model>
<identifier>h4-features</identifier>
<target>
<type>bars-in-future</type>
<identifier>target-1-bar-in-future</identifier>
<bar-series>EURUSDh4</bar-series>
<number>1</number>
<price-type>up-down</price-type>
</target>
<feature>
<type>hour-of-day</type>
<period ga-subst="time-of-day">h4</period>
</feature>
<feature>




<type>bar-attribute</type>
<attribute-type>average-close</attribute-type>
<value-type>diff</value-type>
<bar-series>EURUSDh4</bar-series>
<number>80</number>
<scale-type>min-max</scale-type>
<outlier-percentile>1</outlier-percentile>
</feature>
<feature>
<type>moving-average</type>
<ma-attribute-type>average-close</ma-attribute-type>
<period>5</period>
<selection-list>0,1,2,3,4,5,7,9,13,16,20,25,31,45,55,70,100</selection-list>
<value-type>diff-with-price</value-type>
<bar-series>EURUSDh4</bar-series>
<scale-type>min-max</scale-type>
<outlier-percentile>1</outlier-percentile>
</feature>
<feature>
<type>moving-average</type>
<ma-attribute-type>average-close</ma-attribute-type>
<period>10</period>
<selection-list>0,1,2,3,4,5,7,9,13,16,20,25,31,45,55,70,100</selection-list>
<value-type>diff-with-price</value-type>
<bar-series>EURUSDh4</bar-series>
<scale-type>min-max</scale-type>
<outlier-percentile>1</outlier-percentile>
</feature>
<feature>
<type>moving-average</type>
<ma-attribute-type>average-close</ma-attribute-type>
<period>21</period>
<selection-list>0,1,2,3,4,5,7,9,13,16,20,25,31,45,55,70,100</selection-list>
<value-type>diff-with-price</value-type>
<bar-series>EURUSDh4</bar-series>
<scale-type>min-max</scale-type>
<outlier-percentile>1</outlier-percentile>
</feature>
<feature>
<type>moving-average</type>
<ma-attribute-type>average-close</ma-attribute-type>
<period>50</period>
<selection-list>0,1,2,3,4,5,7,9,13,16,20,25,31,45,55,70,100</selection-list>
<value-type>diff-with-price</value-type>
<bar-series>EURUSDh4</bar-series>
<scale-type>min-max</scale-type>
<outlier-percentile>1</outlier-percentile>
</feature>
</model>
<fundamentals-db>
<data-file-dir>c:\fx_database</data-file-dir>
<db-file>fundamentals.db</db-file>
</fundamentals-db>
<svm-predictor>
<identifier>svm-c-rbf</identifier>
<model>h4-features</model>
<continuous-tune>false</continuous-tune>
<params> <!-- 53.1% -->
<penalty>128</penalty>
<gamma>0.125</gamma>
<forecast-weight ga-subst="w1">1.0</forecast-weight>
<svm-type>SVC</svm-type>
<kernel>rbf</kernel>
</params>
<params> <!-- 52.8% -->
<penalty>2048</penalty>
<gamma>0.125</gamma>
<forecast-weight ga-subst="w2">1.0</forecast-weight>
<svm-type>SVC</svm-type>
<kernel>rbf</kernel>
</params>
<params> <!-- 52.7% -->
<penalty>32768</penalty>


<gamma>0.000488281</gamma>
<forecast-weight ga-subst="w3">1.0</forecast-weight>
<svm-type>SVC</svm-type>
<kernel>rbf</kernel>
</params>
<params> <!-- 52.5% -->
<penalty>128</penalty>
<gamma>0.0078125</gamma>
<forecast-weight ga-subst="w4">1.0</forecast-weight>
<svm-type>SVC</svm-type>
<kernel>rbf</kernel>
</params>
<params> <!-- 52.4% -->
<penalty>32768</penalty>
<gamma>0.125</gamma>
<forecast-weight ga-subst="w5">1.0</forecast-weight>
<svm-type>SVC</svm-type>
<kernel>rbf</kernel>
</params>
<params> <!-- 51.7% -->
<penalty>2048</penalty>
<gamma>0.0078125</gamma>
<forecast-weight ga-subst="w6">1.0</forecast-weight>
<svm-type>SVC</svm-type>
<kernel>rbf</kernel>
</params>
<params> <!-- 51.3% -->
<penalty>0.5</penalty>
<gamma>0.0078125</gamma>
<forecast-weight ga-subst="w7">1.0</forecast-weight>
<svm-type>SVC</svm-type>
<kernel>rbf</kernel>
</params>
<params> <!-- 51.3% -->
<penalty>32768</penalty>
<gamma>0.0078125</gamma>
<forecast-weight ga-subst="w8">1.0</forecast-weight>
<svm-type>SVC</svm-type>
<kernel>rbf</kernel>
</params>
<params> <!-- 50.7% -->
<penalty>0.03125</penalty>
<gamma>3.05176e-005</gamma>
<forecast-weight ga-subst="w9">1.0</forecast-weight>
<svm-type>SVC</svm-type>
<kernel>rbf</kernel>
</params>
<params> <!-- 50.7% -->
<penalty>0.5</penalty>
<gamma>3.05176e-005</gamma>
<forecast-weight ga-subst="w10">1.0</forecast-weight>
<svm-type>SVC</svm-type>
<kernel>rbf</kernel>
</params>
<num-training-observations>800</num-training-observations>
<num-training-skip>1</num-training-skip>
</svm-predictor>
<predictor-ensemble>
<retrain-period>weekly</retrain-period> <!-- each-bar | daily | weekly | monthly -->
</predictor-ensemble>
<signal-generator>
<entry-times>
<hour>all</hour>
<day-of-week>all</day-of-week>
</entry-times>
<target-trigger>h4-features</target-trigger>
<entry-threshold>0.0</entry-threshold>
<take-profit>0.0</take-profit>
<stop-loss>0.0</stop-loss>
<break-even>0.0</break-even>
<trade-bar-series>EURUSDm1</trade-bar-series>
<reverse-signals>False</reverse-signals>
</signal-generator>


<trader>
<hold-minutes>0</hold-minutes>
<hold-bars>0</hold-bars>
<max-drawdown>100000</max-drawdown>
<close-at-weekend>False</close-at-weekend>
<scale-out>False</scale-out>
<max-position>100</max-position>
<direction>both</direction>
<limit-orders offset="0.0">False</limit-orders>
</trader>
<backtest>
<start-date>2013-01-01</start-date>
<stop-date>2014-03-30</stop-date>
<use-recorded-signals>False</use-recorded-signals>
<display-progress>True</display-progress>
<execute-when-complete>python "C:\DeepThought\python\analyse_backtest_results.py" %CONFIG_LOCATION%</execute-when-complete>
</backtest>
<genetic-algo>
<ga-server>tcp://localhost</ga-server>
<ga-server-port>55566</ga-server-port>
<genome-id>-1</genome-id>
<timeout-minutes>360</timeout-minutes>
<population-size>20</population-size>
<mutation-probability>10</mutation-probability>
<num-breeders-percent>30</num-breeders-percent>
<min-num-breeders>30</min-num-breeders>
<num-new-random-genomes>2</num-new-random-genomes>
<num-generations>10</num-generations>
<parameter id="w1" type="categorical" values="-1.0,0.0,1.0" />
<parameter id="w2" type="categorical" values="-1.0,0.0,1.0" />
<parameter id="w3" type="categorical" values="-1.0,0.0,1.0" />
<parameter id="w4" type="categorical" values="-1.0,0.0,1.0" />
<parameter id="w5" type="categorical" values="-1.0,0.0,1.0" />
<parameter id="w6" type="categorical" values="-1.0,0.0,1.0" />
<parameter id="w7" type="categorical" values="-1.0,0.0,1.0" />
<parameter id="w8" type="categorical" values="-1.0,0.0,1.0" />
<parameter id="w9" type="categorical" values="-1.0,0.0,1.0" />
<parameter id="w10" type="categorical" values="-1.0,0.0,1.0" />
</genetic-algo>
</config>

Appendix B

Condor Setup and Operation


The purpose of this chapter is to enable you to install and configure Condor. Condor is capable
of much more than is covered here; the details below are only for the purpose of running the
genetic algorithm in DeepThought.

B.1 Installation

The following screen shots in figures B.1 to B.9 show the installation steps to install Condor on
a single computer.

Figure B.1: Condor Setup 1


Figure B.2: Condor Setup 2

Figure B.3: Condor Setup 3


Figure B.4: Condor Setup 4

Figure B.5: Condor Setup 5


Figure B.6: Condor Setup 6

Figure B.7: Condor Setup 7


Figure B.8: Condor Setup 8

Figure B.9: Condor Setup 9

B.1.1 Adding a Condor User

C:\>condor_store_cred add -u myusername@slartibartfast


Account: myusername@slartibartfast
Enter password:
Operation succeeded.

B.2 Useful Commands

B.2.1 condor_status

The condor_status command shows the status of all nodes (cores). The example below shows
the status when no jobs are running:
C:\>condor_status
Name                OpSys    Arch    State      Activity  LoadAv  Mem   ActvtyTime

slot1@Slartibartfa  WINDOWS  X86_64  Unclaimed  Idle      0.000   1015  0+00:00:23
slot2@Slartibartfa  WINDOWS  X86_64  Unclaimed  Idle      0.130   1015  0+00:00:05
slot3@Slartibartfa  WINDOWS  X86_64  Unclaimed  Idle      0.000   1015  0+00:00:25
slot4@Slartibartfa  WINDOWS  X86_64  Unclaimed  Idle      0.000   1015  0+00:00:26
slot5@Slartibartfa  WINDOWS  X86_64  Unclaimed  Idle      0.000   1015  0+00:00:27
slot6@Slartibartfa  WINDOWS  X86_64  Unclaimed  Idle      0.000   1015  0+00:00:28
slot7@Slartibartfa  WINDOWS  X86_64  Unclaimed  Idle      0.000   1015  0+00:00:29
slot8@Slartibartfa  WINDOWS  X86_64  Unclaimed  Idle      0.000   1015  0+00:00:22

                Total  Owner  Claimed  Unclaimed  Matched  Preempting  Backfill
X86_64/WINDOWS      8      0        0          8        0           0         0
         Total      8      0        0          8        0           0         0

C:\>

B.2.2 condor_q

The condor_q command lists the jobs that are in the Condor queue. The example below shows
the queue when no jobs are running:
C:\>condor_q
-- Submitter: Slartibartfast : <192.168.1.110:1072> : Slartibartfast
 ID    OWNER       SUBMITTED   RUN_TIME   ST PRI SIZE CMD
0 jobs; 0 completed, 0 removed, 0 idle, 0 running, 0 held, 0 suspended
C:\>

The example below shows jobs in three states:


C:\>condor_q
-- Submitter: Slartibartfast : <192.168.1.110:1072> : Slartibartfast
 ID    OWNER       SUBMITTED   RUN_TIME   ST PRI SIZE CMD
 5.0   myusername  1/2 23:30   0+00:00:02 H  0   3.2  DeepThought.exe
 5.1   myusername  1/2 23:30   0+00:00:03 H  0   9.8  DeepThought.exe
 5.2   myusername  1/2 23:30   0+00:00:02 H  0   3.2  DeepThought.exe
 5.3   myusername  1/2 23:30   0+00:00:03 H  0   3.2  DeepThought.exe
 5.4   myusername  1/2 23:30   0+00:00:03 H  0   7.3  DeepThought.exe
 5.5   myusername  1/2 23:30   0+00:00:03 H  0   3.2  DeepThought.exe
 5.6   myusername  1/2 23:30   0+00:00:02 H  0   3.2  DeepThought.exe
 5.7   myusername  1/2 23:30   0+00:00:02 H  0   3.2  DeepThought.exe
 5.8   myusername  1/2 23:30   0+00:00:05 R  0   3.2  DeepThought.exe
 5.9   myusername  1/2 23:30   0+00:00:05 R  0   3.2  DeepThought.exe
 5.10  myusername  1/2 23:30   0+00:00:04 R  0   3.2  DeepThought.exe
 5.11  myusername  1/2 23:30   0+00:00:04 R  0   3.2  DeepThought.exe
 5.12  myusername  1/2 23:30   0+00:00:04 R  0   3.2  DeepThought.exe
 5.13  myusername  1/2 23:30   0+00:00:04 R  0   3.2  DeepThought.exe
 5.14  myusername  1/2 23:30   0+00:00:04 R  0   3.2  DeepThought.exe
 5.15  myusername  1/2 23:30   0+00:00:04 R  0   3.2  DeepThought.exe
 5.16  myusername  1/2 23:30   0+00:00:00 I  0   3.2  DeepThought.exe
 5.17  myusername  1/2 23:30   0+00:00:00 I  0   3.2  DeepThought.exe
 5.18  myusername  1/2 23:30   0+00:00:00 I  0   3.2  DeepThought.exe
 5.19  myusername  1/2 23:30   0+00:00:00 I  0   3.2  DeepThought.exe
20 jobs; 0 completed, 0 removed, 4 idle, 8 running, 8 held, 0 suspended
C:\>

The job state is shown in the column headed ST. The possible states are listed in Table B.1.

H   Held      Something went wrong with the job and it could not complete
              properly. Jobs in this state should normally disappear after a
              minute or two.
I   Idle      The job is waiting to be run. In the above example we are running
              the genetic algorithm with a population size of 20, therefore 20
              jobs are created per generation. However, we are only running
              Condor on a single machine with 8 cores, so when the algorithm is
              first run, 8 jobs should be in the R state and 12 in the I state,
              waiting for a slot to free.
R   Running   The job is running normally.
X   Stuck     Some jobs do not clear after being in the H state due to some
              quirk in Condor. You can manually remove them by adding the
              -forcex option to the condor_rm command as detailed below.

Table B.1: The Job States in Condor.
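When jobs sit in the H state, Condor records a hold reason that can help diagnose the problem. The standard condor_q options -hold (list held jobs with their hold reasons) and -analyze (explain why a particular job is not running) are useful here; the job number below is illustrative:

C:\>condor_q -hold
C:\>condor_q -analyze 5.0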

B.2.3 condor_rm

Sometimes you may wish to stop all jobs on the cluster, for example because the wrong configuration was submitted or to clear jobs in the held state (they sometimes appear to get stuck). Use the condor_rm command to do this. For a single user, type the following:

C:\>condor_rm myusername
All jobs of user "myusername" have been marked for removal

To remove jobs for all users, type the following:

C:\>condor_rm -all
All jobs have been marked for removal

When jobs are in the X state, they are generally stuck. Add the -forcex option to the condor_rm commands above.
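condor_rm can also target a single cluster or a single job rather than everything belonging to a user. These are standard Condor options rather than anything specific to DeepThought; the cluster and job numbers below are illustrative, and the confirmation messages are similar to those shown above:

C:\>condor_rm 5
C:\>condor_rm 5.3
C:\>condor_rm -forcex 5.3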

Index

Attribute, 5
Backtest, 73
    Options, 73
Backtesting, 11
Bar Series, 35
    Options, 40
Candle
    Duration, 36
Candles, 35
Cluster
    Condor, 14
Commandline
    Defaults, 94
Condor
    Installation, 108
    Useful Commands, 113
Condor Cluster, 8
Configuration, 11, 35
    Example, 104
    XML Reference, 84
Data
    Importing from Dukascopy, 4
    Importing from histdata.com, 4
    Importing from Metatrader, 4
Database, 36
    location, 40
Ensemble, 8, 36, 68
Ensembles
    Bucket of Models, 59
Extremely Randomised Trees, 7, 66
Feature, 5
    Bar Attribute, 47
    Binarised, 10, 41
    Categorical, 5, 9
    Configuration, 42
    Continuous, 5, 9, 41
    CSV Input, 51
    CSV Input Options, 52
    Day of Week, 44
    Fundamental Indicator, 93
    Hour of Day, 43
    Moving Average Options, 50
    Normalisation, 9
    Price Difference, 45, 46
    Selection List, 45
Feature Vector, 5
Genetic Algorithm, 14, 36
    Configuration, 74
    Options, 75
    Parameter Types, 16
    Parameters, 76
Gradient Boosted Trees, 7, 64
    Options, 64
    Parameter Grid Search, 82
Label, 5, 55
Linear Support Vector Machine, 36
    Options, 63
Live Trading, 20
Manual Trading, 20, 80
Metatrader, 3
    EA Options, 22
    Importing Data From, 4
Model, 5, 41
Multi-layer Perceptron, 8, 66
Multi-layer Perceptron Options, 67
Multicore Processing, 8
Neural Network, 8, 66
Neural Network Options, 67
Normalisation, 9
Output Files, 13
Paper Trading, 13
Pip, 35
    Multiplier, 35
Predictor, 5, 8, 36, 58
    Ensemble, 68
    GBT Options, 64
    GBT Parameter Options, 64
    Label, 55
    Linear SVM Options, 63
    Linear SVM Parameter Options, 63
    Random Forest Options, 65
    Random Forest Parameter Options, 65
    SVM Options, 60
    SVM Parameter Options, 61
Python
    DeepThought Interface, 33
    Feature, 25, 53
    Moving Average Cross, 30
    numpy, 28
    Pandas, 27
    Predictor, 30, 68
    Signal Generator, 32, 70
    Target, 28, 56
Random Forest, 7, 65
    Options, 65
Recording Signals, 12
Renko Bars, 38
Signal Generator, 32, 70
    Options, 70
Support Vector Machine, 6, 8, 36, 58
    Bucket of Models, 59
    Hyper Parameters, 8
    Kernel Functions, 61
    Linear, 62
    Options, 60
    Parameter Grid Search, 81
Target, 55
    Bars in the Future Options, 55
    Future Price Change, 55
    Python Script, 56
    Python Script Options, 57
Trader, 72
    Options, 72
Training, 5
