
Project Proposal

Can the weather kill goals?



The effects of weather on goal outcome for football
matches played within the German Bundesliga











Alastair Macnair, x13129325, Alastair.MacNair@student.ncirl.ie


Higher Diploma in Data Analytics

4th June 2014





Table of Contents



1. Objectives
2. Background
3. Literature Review
4. Research Question
5. Requirements Elicitation and Analysis
6. Special Resources required
7. Project Plan
8. Consultation
9. Declaration
Appendix A Examples of the Football Data Sets
Appendix B Examples of the Weather Data Sets
Appendix C Map of Current Bundesliga 1 & 2 Stadium locations
Appendix D Project Plan Gantt chart
Appendix E Map showing Principal Regions of Germany
Appendix F Project Proposal Revisions
References


1. Objectives
The study will seek to assess whether certain weather factors such as temperature, cloud cover,
precipitation, wind and humidity have any measurable effect on the goal outcome of football
matches within the Bundesliga 1 & 2 football leagues held within Germany when considered
across twenty-one seasons of historic play. The theory is that weather conditions, in particular
lower temperatures, may have a detrimental impact on goals scored, although warmer
temperatures will also be considered. By linking daily historic weather data for specific weather
stations with stadiums and the dates and results of matches played, it will be determined whether
weather plays any role in goal outcome when considered over a significant time period.

Secondary objectives will consider whether any difference exists between the first and second
league in relation to weather effects on goal outcome, and also whether any particular stadium
affects goal outcome due to its geographical location or size for the teams that play there. The
results will also be compared against a particular betting instrument, the over/under goals
scored bet (PKR, 2014), to see if any meaningful predictions can be made regarding total
match goals scored. Any possible lean towards an uneven spread (more goals under than over
due to weather factors) would be of particular interest to football teams, coaches, trainers and
in particular those companies that provide such betting instruments.

Summary of all objectives
Objective #1 Determine if there is any link between goals scored and weather effects
within the Bundesliga 1&2 football Leagues.
Objective #2 Determine if there is any difference between the Bundesliga 1 & Bundesliga
2 due to the effects of weather, location or smaller stadiums.
Objective #3 Determine if just single or multiple weather parameters predominantly affect
goal outcome.
Objective #4 Investigate if stadium location and regional local weather affects games
played there and match outcome.
Objective #5 Compare the outcomes of matches to under/over goal difference betting
instruments to determine if the spread of match results could have been better
predicted using the results of the analysis.
Objective #6 Attempt to use the data to predict goal outcome for a number of future
matches using weather predictions and selected betting instruments.
Objective #7 Determine if goal difference between teams is greater in colder weather and
if sustained cold weather affects a team's performance over time.
Objective #8 Use analysis software including but not limited to Excel, Python, R and SQL
to gain knowledge in their use for analysing large data sets.


2. Background
A recent study by the European Commission (2012) on the contribution of sport to the
economy placed its value at 294 billion euros. Additionally, the betting market for sports is
estimated to be around 733 billion globally (BBC, 2014), with 70% of that income coming
from football matches. Betting on football matches was popularised in the early 1920s with
the creation of the football pools (2014) in the UK, the oldest gaming company in the world,
which allowed fans to predict matches and win money if those predictions proved to be
correct. With some individual bets now reaching figures of over 200,000 (BBC, 2014), it is
important for gaming companies to understand the level of risk they are exposed to, as
mistakes could be costly.

Additionally, trainers and teams are always looking to gain competitive advantage to ensure
success, and the use of statistical information and data analytics is becoming increasingly
important within football as more and more managers and teams use data analysis to become
smarter and more efficient (Lewis, 2014). While nearly all analysis focusses on the players,
there has been much less analysis of external factors. A number of studies indicate that
weather factors, predominantly temperature, may be a factor in the outcome of European
football matches (Hamilton, 2014). While extreme hot or cold temperatures are known to
directly affect both performance and health (Hong, 2014), the overall contribution that weather
makes where extreme temperature is not a factor, in particular to the goal outcome of football
matches, is still not clearly understood or established.

Germany has a moderate and temperate climate (see Appendix B, Example Weather Year 2(e), for a
typical weather year) with temperatures ranging on average from just below zero degrees Celsius
in winter to around the mid-twenties during the summer (ECA, 2014). The use of a moderate
temperate climate seeks to reduce as much as possible any effects of very extreme temperatures.
However, there are colder areas such as Munich which can see temperatures drop to around -10°C,
which can affect performance (Hong, 2014), although at around 0°C there should be very little
drop in performance for persons engaged in moderate exercise even if wearing t-shirts. Germany
is large enough to have distinct regions of specific weather patterns (Encyclopaedia Britannica,
2014) with variable frequency of temperature, humidity and precipitation experienced in
different regions and throughout the year. The Bundesliga stadiums are distributed around
Germany widely enough to see if regional weather plays a role in match outcome (Appendices
C and E).

The study considers two primary data sets which were identified as being suitable for analysis
and for meeting the study's objectives. The first is the Bundesliga football league results, for
which reliable historic data exists for the league's entire history since 1963. Within this, a
selection of data will be considered for the period 1993 until 2013, which represents twenty-one
seasons of play. Usable football score data has been identified from an online provider
(Football-UK, 2014) which provides one csv file (refer to Appendix A) for each season played,
detailing every game played within the season, the date played, half time and full time scores
and where it was played, as well as a range of other match information. There are around 306
games played per season per league, so the football data set will comprise around 306 (games)
x 21 (seasons) x 2 (leagues) = 12,852 football matches in total. Each of the csv files is
relatively small at around 100 KB in size.

The second data set is daily weather data for Germany obtained from the European Climate
Assessment & Database (ECA, 2014). This site provides data on numerous weather stations
positioned around Europe which can be matched geographically, using name, latitude and
longitude co-ordinates, to each stadium being considered to within a few miles. The data is
available as a number of individual text files, one for each unique weather station and weather
variable (refer to Appendix B). The blended data was selected for use, which combines weather
data from different sources, although checking shows no difference for the weather stations
being used. The files contain comma delimited text in their raw format, and the uncompressed
size of the files for each weather variable ranges from 200MB to 4GB, with approximately 400 to
5,000 individual weather stations in each zipped folder for that particular weather variable.
Each file provides data for approximately 67 years, equating to 25,591 lines of raw data for
each weather station and single weather variable. There are 18 stadiums in the Bundesliga 1 and
the same number in the Bundesliga 2. However, over the 21 year period 38 teams have played in
the top tier, with a similar number expected in the second. There will be instances of cross
over where teams share stadiums, and the closeness of stadia may allow for a common weather
station to be used. This will be subject to a more detailed assessment during the initial stages
of the project. This means that there could be approximately 30 distinct weather files for each
weather parameter. The total approximate size of the raw data will therefore be 25,591 x 30 =
767,730 lines of data for each weather variable. It is intended to consider five variables
(temperature, precipitation, cloud cover, wind and humidity), which will equate to
approximately 3.8 million lines of raw data (767,730 x 5 = 3,838,650) prior to selecting the
relevant lines that equate to the 12,852 actual football matches that were played.
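
The matching of weather stations to stadiums by latitude and longitude could be approached as in the following minimal Python sketch, which picks the station with the smallest great-circle (haversine) distance to a stadium. The station identifiers, the stadium record and all co-ordinates are invented placeholders for illustration and are not taken from the ECA files; the function names are likewise hypothetical.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points in decimal degrees."""
    radius_km = 6371.0  # mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


def nearest_station(stadium, stations):
    """Return (station_id, distance_km) for the station closest to the stadium."""
    def distance(station):
        return haversine_km(stadium["lat"], stadium["lon"], station["lat"], station["lon"])

    best = min(stations, key=distance)
    return best["id"], distance(best)


# Placeholder data: identifiers and co-ordinates are assumptions for illustration only.
stations = [
    {"id": "station_a", "lat": 48.43, "lon": 10.94},
    {"id": "station_b", "lat": 52.47, "lon": 13.40},
]
stadium = {"name": "example stadium", "lat": 48.32, "lon": 10.89}

station_id, distance_km = nearest_station(stadium, stations)
print(station_id, round(distance_km, 1))
```

Recording the distance alongside the chosen station would make it possible to check the "within a few miles" assumption for every stadium and to flag any stadium for which no sufficiently close station exists.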

There are some important limitations to the data being considered, in particular the weather
data. The data being used provides daily averages which may not equate to the conditions
experienced during the time the match was played. For example rain may have fallen before,
after or during the match. The study is seeking to determine if any relationship exists between
the historic weather data and goal outcome and so some caution is advised as links could be
established where none really exist. However, the primary objective is to consider any overall
trend over the course of a playing season i.e. changes in seasons and over months rather than
specific matches.


3. Literature Review
The literature review will at this stage examine primarily factors relating to the statistical
analysis of sports and the effects of weather on sports but should also extend to consider
statistical analysis in general, prediction analysis, climate and weather, sports performance and
stadium design. Literature has been sourced from Google Scholar, CiteSeerX,
Google Books and the Directory of Open Access Journals along with articles, websites and
other sources.

Statistical Analysis in sports
Sports performance analysis is the process by which the various persons involved within a sport,
such as coaches, analysts or physiologists, come together to break down a game's performance
from observed data and then identify those factors which contributed towards either a good or
bad performance (McGarry, O'Donoghue and Sampaio, 2013). Statistical analysis has shown a lot
of commonly accepted anecdotal evidence within football to be incorrect, such as the belief
that corner kicks increase the chances of scoring (Anderson and Sally, 2013). The authors
propose that understanding issues like this provides competitive advantage through knowledge,
justifying the time and expense of undertaking such analysis in the first place.



The effect of cold weather in Sport
The effects that weather and environmental factors have on sport is an area where, according to
Thornes (1977), considerable improvements could potentially be made to sports management,
performance and economic performance. There is evidence to suggest that some sports are more
adversely affected than others, with endurance sports, in particular cycling, being affected by
the weather (Pezzoli, Cristofori, Moncalero, Giacometto and Boscolo, 2013). The study also found
that most sports were affected by three primary characteristics, namely temperature, humidity
and wind. Rain was also a factor in a number of cases for some but not all sports. Riley and
Williams (2003) indicate that colder weather reduces limb temperatures, which would
detrimentally affect motor performance as well as strength and power. In fact, muscle power was
found to be reduced by 5% for every 1°C drop in muscle temperature below normal.

The effect of temperature on ball properties is also a possible environmental factor: with
temperatures approaching zero degrees Celsius a goalkeeper has 7% more time to react to a
penalty than at higher temperatures, when the ball moves quicker (Wiart, Kelley, James and
Allen, 2012). The flight of the ball is also affected, with colder conditions causing the ball
to drop and move slower overall with less power than at warmer temperatures. However, as Riley
and Williams (2003) point out, in colder weather the goalkeeper is most susceptible to reduced
limb temperature and dexterity unless they keep highly active.


4. Research Question
The problem being considered is that there is a lack of information regarding the effects that
weather factors like temperature, precipitation, humidity and wind may have on goals scored
in football matches. The primary research question being considered is: -

Does the weather affect the goal outcome in football matches within the Bundesliga 1 & 2?

From this the null hypothesis H0 and the alternative hypothesis H1 that will be tested are
established: -

H0: There is no relationship between goal outcome in football matches and daily corresponding
average values of temperature, wind, precipitation and humidity.
H1: There is a relationship between goal outcome in football matches and daily corresponding
average values of temperature, wind, precipitation and humidity.

The alternative hypothesis is non-directional and therefore a two-tailed test will be applied
where appropriate with a significance level (alpha) of 5%.


Figure 1: Graphical representation of a two tailed test with rejection regions.
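
As an illustration of the two-tailed test described above, the following Python sketch compares the mean total goals of two groups of matches (for example colder and warmer match days) with a two-sample z-test at the 5% level. It is a sketch under stated assumptions, not the project's confirmed method: the goal figures are invented, the samples are far smaller than the study's would be (a z-test presumes large samples), and the function name is hypothetical.

```python
import math
from statistics import NormalDist, mean, variance


def two_tailed_z_test(sample_a, sample_b, alpha=0.05):
    """Two-sample z-test for a difference in means.

    Returns (z, critical_z, p_value, reject_h0).
    """
    standard_error = math.sqrt(
        variance(sample_a) / len(sample_a) + variance(sample_b) / len(sample_b)
    )
    z = (mean(sample_a) - mean(sample_b)) / standard_error
    # Half of alpha sits in each tail, so the cut-off is the (1 - alpha/2) quantile.
    critical_z = NormalDist().inv_cdf(1 - alpha / 2)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, critical_z, p_value, abs(z) > critical_z


# Invented total goals per match, purely for illustration.
cold_goals = [2, 1, 3, 0, 2, 4, 1, 2, 3, 2]
warm_goals = [3, 2, 4, 1, 3, 5, 2, 3, 4, 3]

z, critical_z, p_value, reject_h0 = two_tailed_z_test(cold_goals, warm_goals)
print(round(z, 3), round(critical_z, 3), round(p_value, 4), reject_h0)
```

H0 is rejected only when the test statistic falls in either rejection region, that is when its absolute value exceeds the critical value, which is equivalent to the p-value being below alpha.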

Within the context of the broader research question there are further questions that will be
considered: -
(i) Is the Bundesliga 2 League's goal outcome affected more by weather factors than
the Bundesliga 1 League's?
(ii) Can goal outcome at any particular stadium be attributed to any possible regional
weather effects?
(iii) Does a single weather variable affect match outcome or are multiple factors
required?
(iv) Do smaller stadiums have a greater effect on goal outcome due to greater exposure
to the weather?
These are components of the primary research question and will be investigated. Appropriate
hypothesis testing will need to be established for these questions. Further questions will be
developed for the project.



Predictions
Additionally there is the possibility of the results of the analysis being used to undertake
match outcome prediction for goals scored using next day weather forecasting. Rather than
predicting the actual total goals for a match with any accuracy, it is more likely that
prediction of average goals scored under the general weather conditions experienced over a time
period would be possible. A betting tool such as the Under/Over (x) goals instrument will be
used, based on the average number of goals per game and league across the period being
considered. For example, if the average goals scored was 2.7 then Under/Over 2.5 goals would be
used as the instrument to see if the results can reliably identify a significant push or pull
above or below this level, which could indicate that predictions can be made. As the predictions
are dependent on weather, the time period will typically be 1 to 3 days in line with weather
forecasting but could increase to 10 days.
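
The Under/Over comparison could be explored along the lines of the following Python sketch, which computes the share of matches finishing over a 2.5 goal line separately for colder and milder match days. The match records and the 5 degree threshold separating the two groups are assumptions chosen for illustration only.

```python
def over_share(total_goals, line=2.5):
    """Fraction of matches whose total goals exceed the betting line."""
    return sum(1 for goals in total_goals if goals > line) / len(total_goals)


# Invented records: (mean daily temperature in degrees Celsius, total goals in the match).
matches = [
    (-2.0, 1), (1.5, 2), (3.0, 3), (8.0, 2),
    (12.5, 4), (15.0, 3), (18.0, 5), (22.0, 2),
]

cold_threshold_c = 5.0  # assumption: boundary between "cold" and "mild" match days
cold = [goals for temp, goals in matches if temp < cold_threshold_c]
mild = [goals for temp, goals in matches if temp >= cold_threshold_c]

cold_over = over_share(cold)
mild_over = over_share(mild)
print(round(cold_over, 3), round(mild_over, 3))
```

A persistent gap between the two shares over many seasons would be the kind of push or pull around the line that the study is looking for; on real data the gap would still need to be tested for significance before drawing any conclusion.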

The research will be limited to only stadium locations within Germany, the weather data
identified and goals scored for a match. No other in match data or statistics will be used such
as corners or passes. Individual players will not be considered, nor will any variables other
than those indicated and referenced.


5. Requirements Elicitation and Analysis
Requirements elicitation is a preliminary stage in which the requirements of the process are
specified and defined, which then leads to the correct solution being designed and implemented.
Requirements elicitation is primarily a process for understanding a particular problem, which
typically comes from a business need. Its objective is to identify all of the requirements, or
as many of them as is feasibly possible (Kasirun, 2005). At this stage the requirements are a
preliminary step towards a more detailed project specification later on during the second
semester, when the dissertation will be initiated, undertaken and completed.

Elicitation techniques are the systems and tools used to bring forth the requirements and help
develop and find understanding. For this part of the process the tools used are Brainstorming
and Document Analysis as outlined in IIBA (2009).


The brainstorming process was utilised primarily at this stage to help stimulate ideas on the
project. This did not take the format of a scheduled session but instead was an ongoing process
where ideas were jotted down in a notebook as and when they came to mind. Critiquing or analysis
of the ideas was deliberately avoided, as this is contrary to the purpose of brainstorming, which
is to develop new ideas.

Before determining the functional and non-functional project requirements it is useful to first
restate the problem being considered, which was explored in the previous section: - The
problem being considered is that there is a lack of information regarding the effects that
weather factors like temperature, precipitation, humidity, cloud cover and wind may have on
goals scored in football matches. From this we can then look to determine the project
requirements.

Project Scope
The project is a Big Data Analysis study which will use a relational database, most likely SQL
based, in conjunction with R Studio to analyse a large data set to find trends, patterns and
links and to make predictions, supported by graphs and tables to present results.

General Description
The database will be created and designed to facilitate the querying and manipulation of a large
amount of data to allow for the effects of weather such as temperature, humidity, precipitation
and wind on total goals scored in football matches to be analysed to determine if a relationship
exists. The aim is that the analysis will provide insight into the possible effects of weather on
sports like football.

The database must be designed in such a way that all the entities and their relationships are
robust and well understood and that the data has been normalised prior to database creation.
The ability to handle very large queries and joins will be required, as joining tables with
thousands of rows has a multiplying effect within SQL databases which can place significant
demands on the processing ability of computers. If the database cannot function properly then
either the number of data points will have to be restricted or the amount of analysis limited,
which would not provide sufficient information for a robust analysis and could damage the study
as a whole. The core function of the project is to compare the two primary datasets, which must
be central to any design approach implemented.
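
One possible shape for such a database is sketched below in Python, using the standard library's sqlite3 module as a stand-in for whichever SQL system is eventually chosen. All table names, column names and rows are hypothetical. The sketch shows the central operation, joining each match to the weather recorded on the same date at the station assigned to its stadium, with the composite primary key on the weather table keeping the join from multiplying rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # in-memory database, for illustration only
conn.executescript("""
CREATE TABLE stadium (
    stadium_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL,
    station_id INTEGER NOT NULL          -- weather station assigned to this stadium
);
CREATE TABLE football_match (
    match_id   INTEGER PRIMARY KEY,
    match_date TEXT NOT NULL,            -- ISO 8601 date, e.g. 2013-07-19
    league     TEXT NOT NULL,
    stadium_id INTEGER NOT NULL REFERENCES stadium(stadium_id),
    home_goals INTEGER NOT NULL,
    away_goals INTEGER NOT NULL
);
CREATE TABLE weather (
    station_id  INTEGER NOT NULL,
    obs_date    TEXT NOT NULL,           -- ISO 8601 date
    mean_temp_c REAL,
    PRIMARY KEY (station_id, obs_date)   -- one row per station per day
);
""")

# Invented rows standing in for the cleaned football and weather data.
conn.execute("INSERT INTO stadium VALUES (1, 'Example Stadium', 100)")
conn.executemany(
    "INSERT INTO football_match VALUES (?, ?, ?, ?, ?, ?)",
    [(1, "2013-01-20", "B1", 1, 1, 0), (2, "2013-08-10", "B1", 1, 3, 2)],
)
conn.executemany(
    "INSERT INTO weather VALUES (?, ?, ?)",
    [(100, "2013-01-20", -1.5), (100, "2013-01-21", 0.4), (100, "2013-08-10", 21.3)],
)

rows = conn.execute("""
SELECT m.match_date, m.home_goals + m.away_goals AS total_goals, w.mean_temp_c
FROM football_match AS m
JOIN stadium AS s ON s.stadium_id = m.stadium_id
JOIN weather AS w ON w.station_id = s.station_id AND w.obs_date = m.match_date
ORDER BY m.match_date
""").fetchall()
print(rows)
```

Weather rows for days without a match (such as the second weather row here) simply do not appear in the result, so they need not be removed from the raw files beforehand.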




System Interfaces
The database will be a self-contained system; however, it may interface with a PC or a server
located on Amazon Web Services or Windows Azure (to be decided subject to further research). It
will also potentially need to receive input data from other programs such as Microsoft Excel, R
or Python and be required to export back to Excel and R Studio for ongoing graphing and
analysis.

Preliminary list of Functional Requirements
The purpose of the project is to utilise a database to either reject or fail to reject the null
hypothesis as set out, within the specified project timeline, and to produce a dissertation report.
1. The weather data cleaning preparation tool (R or Python) must be able to discard the dates
and associated data that are not relevant to reduce the weather file size.
2. The weather data cleaning preparation tool (R or Python) must be able to read, re-organise
and output the data files into a readable and standardised format for entry into the SQL
database.
3. The data preparation must ensure that dates from both files are in a standardised ISO format
that are compatible with each other.
4. The weather stations should have specific identity codes matched to each stadium.
5. The SQL database system must be able to export results data out to other programs for
analysis, graphing and visualisation.
6. The SQL database being used for analysis must be able to hold several thousand entries.
7. The SQL database must be able to filter and select different columns and rows of
information for analysis and comparison.
8. The data outputted should be produced in a form that it is capable of being analysed by
using a variety of statistical tools (it is assumed that all of these will be utilised at this stage
to some extent subject to verification during the next stage) including: -
a. z-test (hypothesis testing)
b. power analysis (due to the large sample size)
c. Analysis of variance (ANOVA) to compare each season of play and other sub
groups of means.
d. Mean (there will be multiple means considered)
e. Calculation of Standard deviation(s)
f. Calculation of Variation(s) and Covariance.
g. Time series analysis (for possible prediction analysis)
h. Cluster analysis
i. Correlation Analysis (Calculation of r)
j. Simple linear Regression, multiple & logistic regression tools.
9. The SQL database should be designed so that comparison against weather variables can be
made against the following football variables:
a. The entire range of matches played by date of match.
b. Each season of play (by Individual selection.)
c. By Stadium location.
d. By Team.
e. By a pre-determined or local region (Refer to Appendix E.)
10. The SQL database should be designed so that comparison against football variables can be
made against one, two, three or all of the following weather statistics:
a. Temperature
b. Humidity
c. Precipitation
d. Wind
e. Cloud Cover
11. The database team table should provide the number of years each team has played in each
league as not every team will have played for the entire time period being analysed.
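
Two of the statistical outputs listed under requirement 8 (means of sub groups and the correlation coefficient r), applied to one of the groupings in requirement 9 (season of play), could be produced as in the following Python sketch. The records are invented and the helper name is hypothetical; in the project itself these figures would come from the database export and might equally be calculated in R.

```python
import math
from collections import defaultdict


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance_sum = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance_sum / math.sqrt(spread_x * spread_y)


# Invented records: (season, mean daily temperature in degrees Celsius, total goals).
records = [
    ("1993/94", 2.0, 1),
    ("1993/94", 10.0, 3),
    ("1994/95", 5.0, 2),
    ("1994/95", 15.0, 4),
]

# Mean total goals per season.
goals_by_season = defaultdict(list)
for season, _temp, goals in records:
    goals_by_season[season].append(goals)
season_means = {season: sum(g) / len(g) for season, g in goals_by_season.items()}

# Correlation between temperature and total goals across all matches.
r = pearson_r([temp for _s, temp, _g in records], [goals for _s, _t, goals in records])
print(season_means, round(r, 4))
```

A coefficient computed this way only measures the strength of a linear association; whether it differs significantly from zero would still need to be tested against the null hypothesis.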

Preliminary list of Non Functional requirements
Non-functional requirements are outlined below. They include:
1. The methodology section should enable another person to reproduce the research project
in its entirety and from the same data obtain the same/similar results.
2. The project and research objectives should be able to be understood by non-experts.
3. The data being used should be verified as authentic and reliable.
4. The author must invest a minimum of three hours a week on the project based on the project
plan.
5. The author must attend all lectures and tutorials within semester 2.
6. The database should be able to achieve a reasonable level of performance in its required
operation.
7. The project must be stored electronically on three different media sources at all times and
at least be updated once a week.
8. The project must be completed by the specified date.


6. Special Resources required
The proposed project will require a number of programs to undertake the required analysis and
then production of results: -
1/ Microsoft Excel - Required to read and open primary football data files and do basic checks
and tables, graphical outputs.
2/ Microsoft Word - To generate written reports.
3/ R - R will be the primary program used to prepare, analyse, graph and tabulate the data.
It will be used to clean up all the football files, removing unwanted columns and binding all
years of play into one file. Weather data will also be cleaned up, removing unwanted lines and
error checking for NULL values.
4/ SQL - The data lends itself towards a relational database such as SQL where the weather
data can be combined with the football data based on temperature, precipitation, humidity,
wind, geographic location or team, for example.
6/ Map Reduce/Hadoop & Python - The use of a distributed computer system could offer
potential benefits for speed of computation as the data set may be too large to handle efficiently
on a single user PC. This will be investigated as to its necessity as the project develops.
7/ Pea Zip - A program that can easily un-compress a variety of large file formats, to be used
for the weather data.
8/ Microsoft PowerPoint - To create the project presentation.
9/ Adobe Photoshop - May be required to assist with image manipulation for the project and
presentation.
10/ Browser add-on for Mozilla, Download it all! - To quickly extract and download all 42
football csv files.

At this stage there may be additional programs that may be useful but have not yet been
identified as being a requirement. This will be a part of the project plan to determine what
technologies should be used.





7. Project Plan
The project plan is provided in Appendix D and shows the general expected timeline for project
delivery in the second semester. The first half of the project is planned for research, preparing
all the data, building databases and becoming familiar with them as well as the initial parts of
the thesis. The second part focuses on the analysis, findings and writing the analysis which are
key parts of the project process. The plan has been updated based on confirmation of the
submission date in early January and additional deadlines for management reports and the
presentation.

8. Consultation
The project proposal was discussed with NCI Lecturer Padraig De Burca. The discussion took
place on 26th May 2014 and took the form of an informal discussion after scheduled classes.
Padraig provided valuable feedback relating to the potential use of SQL to build a database
of all the normalised match and weather variables, which can then be queried in multiple ways
with the results being output to other programs like Excel to generate graphs. The significant
benefit of using SQL would be firstly the speed with which stadiums, teams, results and even
certain weather conditions can be isolated for comparative analysis. Secondly, it would limit
the amount of preparation the weather files need, as there would be no need to eliminate all the
dates where games were not played; the data file would simply be clipped at the start date to
eliminate the largest unneeded section prior to 1993. This would create potentially redundant
data within the database and may affect the time taken to undertake joins, but could be quicker
than trying to eliminate certain dates in the raw weather files as there are potentially 70-100
individual weather files.
As a result of the consultation several possible new ways to view the data were considered.
Firstly it opens the possibility of considering the past few days of weather prior to any match,
which had not been thought of, and secondly it allows the comparison of
sequential matches played by the same team in different locations to see if the effects of any
general ongoing weather such as sustained cold has a compounding effect. Padraig also noted
that SQL has some graphing capabilities which will be investigated as to their potential use.
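
The idea of considering the weather in the days leading up to a match could be sketched in Python as follows. The daily temperatures are invented, and both the three day window and the function name are assumptions made for illustration.

```python
from datetime import date, timedelta


def mean_temp_before(daily_temps, match_day, days=3):
    """Mean of the available daily temperatures for the `days` days before match_day.

    The match day itself is excluded. Returns None if no earlier readings exist.
    """
    window = []
    for offset in range(1, days + 1):
        day = match_day - timedelta(days=offset)
        if day in daily_temps:
            window.append(daily_temps[day])
    return sum(window) / len(window) if window else None


# Invented daily mean temperatures in degrees Celsius for one weather station.
daily_temps = {
    date(2013, 1, 17): -4.0,
    date(2013, 1, 18): -2.0,
    date(2013, 1, 19): 0.0,
    date(2013, 1, 20): 1.0,
}

lead_in = mean_temp_before(daily_temps, date(2013, 1, 20))
print(lead_in)
```

Because the weather files would be kept whole from 1993 onwards rather than trimmed to match days, the readings needed for such a window remain available in the database.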

9. Declaration
By submitting this proposal through the NCI Moodle system, I declare that unless otherwise specified,
all content in this proposal is my own work and has not been copied from other sources.


Appendix A Examples of the Football Data Sets

Data Set 1 Football results for the Bundesliga 1 & 2. Excerpt below shows Bundesliga 2
results for July 2013.




Football-Data (2014) provides a full season of play for either Bundesliga 1 or 2 as a csv file
available for download. Each csv file contains the results for one entire season of play. There
are 306 matches in total for each season, which equates to 18 teams each playing the other 17
home and away (18 x 17 = 306). There are 52 columns of data per file for most files, containing
the date, full time results, half time results, where the game was played and a variety of
betting information. For earlier years not all of this information was recorded. Twenty-one
seasons of historic football data for both leagues equates to 306 (games per season) x 21
(seasons) x 2 (leagues) = 12,852 lines of data for the football matches, which in its raw form
exists in 42 corresponding csv files. Total goals is not a parameter, but any program or
database such as SQL could calculate this from the home and away goals scored columns.
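
A minimal Python sketch of deriving total goals and an ISO formatted date from one season file is given below. The column names (Date, HomeTeam, AwayTeam, FTHG, FTAG), the dd/mm/yy date format and the sample rows are assumptions made for illustration and would need to be checked against the actual files, particularly for the earlier seasons.

```python
import csv
import io
from datetime import datetime

# Invented sample in an assumed layout; team names are placeholders.
sample_csv = """Div,Date,HomeTeam,AwayTeam,FTHG,FTAG
D2,19/07/13,Team A,Team B,1,1
D2,20/07/13,Team C,Team D,3,0
"""


def load_matches(handle):
    """Read match rows, convert dates to ISO format and add total goals."""
    matches = []
    for row in csv.DictReader(handle):
        matches.append({
            "date": datetime.strptime(row["Date"], "%d/%m/%y").date().isoformat(),
            "home": row["HomeTeam"],
            "away": row["AwayTeam"],
            "total_goals": int(row["FTHG"]) + int(row["FTAG"]),
        })
    return matches


matches = load_matches(io.StringIO(sample_csv))

# Size check from the text: 18 teams each hosting the other 17 gives 306 matches a season.
matches_per_season = 18 * 17
total_matches = matches_per_season * 21 * 2
print(matches, matches_per_season, total_matches)
```

Passing an open file in place of the `io.StringIO` object would apply the same function to each of the 42 season files, after which the results could be appended into one table.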







Appendix B Examples of the Weather Data Sets

Data Set 2(a) Historic weather data example for Germany for Daily Mean Temperature at
station 494 (Augsburg, Germany)



The European Climate Assessment & Database Project (2014) provides data for weather
stations across Europe. The above data sample is taken from station number 494 (Augsburg)
for mean daily temperature. The text files are comma delimited and provide (from left to right)
station number, source identifier, date (yyyy/mm/dd), temperature and quality code. This file
contains 67 years, 3 months and 29 days of data which equates to around 25,591 lines of data
for each of the locations. The year the station began monitoring varies but typically covers a
significant time period in all cases. The temperature is provided in units of 0.1 degrees
Celsius in its current html format and requires a decimal point to be inserted to read
correctly. For example the first line of data above, for the 28th March, records a daily mean
temperature of 6.5 degrees Celsius with no known errors or missing data. Below freezing levels
are identified with a minus symbol (none shown in example above).
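
The preparation of a weather file described above could look like the following Python sketch: rows before the 1993 start date are discarded, the date is rewritten in ISO format and the value is scaled from tenths of a degree. The sample rows, the station and source numbers, the use of -9999 as a missing value marker and the treatment of quality code 0 as the only valid reading are all assumptions for illustration; the real files may also carry header lines that would need to be skipped first.

```python
import csv
from datetime import date

# Invented rows in the column order described above:
# station number, source identifier, date, value in 0.1 degrees Celsius, quality code.
sample = """
1,10,1992/03/31,41,0
1,10,1993/03/28,65,0
1,10,1993/03/29,-9999,9
1,10,1993/03/30,-12,0
"""


def clean_weather(lines, start=date(1993, 1, 1), scale=0.1, missing=-9999):
    """Keep valid readings on or after `start` as (station, ISO date, scaled value)."""
    cleaned = []
    for station, _source, raw_date, raw_value, quality in csv.reader(lines):
        digits = raw_date.strip().replace("/", "")  # accepts yyyy/mm/dd or yyyymmdd
        day = date(int(digits[:4]), int(digits[4:6]), int(digits[6:8]))
        value = int(raw_value)
        if day < start or value == missing or int(quality) != 0:
            continue  # too early, missing, or flagged as not valid
        cleaned.append((int(station), day.isoformat(), round(value * scale, 1)))
    return cleaned


cleaned = clean_weather(sample.strip().splitlines())
print(cleaned)
```

The same function could serve the precipitation and wind files, which are also stored in tenths of a unit, while a scale of 1 would suit values such as humidity in percent.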








Data Set 2(b) Historic weather data example for Germany for Daily Humidity levels at
station 494 (Augsburg, Germany)



The above ECA (2014) data sample is taken from station number 494 (Augsburg) for daily
humidity. The text files are comma delimited and provide (from left to right) station number,
source identifier, date (yyyy/mm/dd), humidity in percent and quality code. This file also
contains 67 years, 3 months and 29 days of data which equates to around 25,591 lines of data
for each of the locations. The year the station began monitoring varies but typically covers a
significant time period in all cases.








Data Set 2(c) Historic weather data example for Germany for Daily precipitation levels at
station 494 (Augsburg, Germany)




The above ECA (2014) data sample is taken from station number 494 (Augsburg) for daily
precipitation. The text files are comma delimited and provide (from left to right) station
number, source identifier, date (yyyy/mm/dd), precipitation in units of 0.1 mm and quality code.
This file also contains 67 years, 3 months and 29 days of data, which equates to around 25,591
lines of data for each of the locations. The year in which each station began monitoring varies,
but the records typically cover a significant time period in all cases.





Data Set 2(d) Historic weather data example for Germany for Daily mean wind speed at
station 494 (Augsburg, Germany)




The above ECA (2014) data sample is taken from station number 494 (Augsburg) for daily
average wind speed. The text files are comma delimited and provide (from left to right) station
number, source identifier, date (yyyy/mm/dd), average wind speed in units of 0.1 m/s and quality
code. This file also contains 67 years, 3 months and 29 days of data, which equates to around
25,591 lines of data for each of the locations. In this data set all records prior to 1960 are
null. The actual wind speed is the recorded figure divided by 10; for example, the first value
for April shown above would be 1.5 m/s.
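A minimal sketch of how the null records and the scaling might be handled together is given below. How a null record is actually encoded in the files is not stated above, so the sentinel value -9999 is an assumption, as are the sample readings.

```python
MISSING = -9999  # assumption: sentinel standing in for a null record


def wind_speed_ms(raw_value):
    """Convert a raw reading in 0.1 m/s to m/s; return None for a null record."""
    if raw_value is None or raw_value == MISSING:
        return None
    return raw_value / 10


# Hypothetical readings: one null record followed by two valid ones.
raw_readings = [MISSING, 15, 32]
speeds = [wind_speed_ms(value) for value in raw_readings]

# Keep only the readings that can be used in the analysis.
usable = [speed for speed in speeds if speed is not None]
```

Returning None rather than a number keeps null records from being mistaken for calm days when averages are calculated later.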

Data Set 2(e) Historic weather data for Germany for Cloud Cover
The cloud cover data files (not shown) are based on the okta scale, which measures cloud cover
from 0 to 8 according to the proportion of sky covered. Zero represents a totally clear sky,
while 8 represents a totally overcast sky.
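For analysis it may be convenient to express the okta value as a fraction of sky covered. The helper below is a hypothetical sketch of that conversion and assumes that only the values 0 to 8 described above appear in the data.

```python
def okta_to_fraction(oktas):
    """Convert a cloud cover reading in oktas (0 to 8) to a fraction of sky covered."""
    if not 0 <= oktas <= 8:
        raise ValueError(f"okta value out of range: {oktas}")
    return oktas / 8


clear_sky = okta_to_fraction(0)
half_covered = okta_to_fraction(4)
overcast = okta_to_fraction(8)
```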







Example Weather Year 2(e) - Typical Weather Year for Mean Daily Temperature at weather station 494 (Augsburg)




Appendix C Map of Current Bundesliga 1 & 2 Stadium locations.



Image Source: Total Football Forums,
http://www.totalfootballforums.com/forums/topic/76502-german-football-fans/



Appendix D Project Plan Gantt chart



Notes
1/ Dates shown are the Monday on which each week commences.

Weeks
September: Wk_01 8/9/14, Wk_02 15/9/14, Wk_03 22/9/14, Wk_04 29/9/14
October: Wk_05 6/10/14, Wk_06 13/10/14, Wk_07 20/10/14, Wk_08 27/10/14
November: Wk_09 3/11/14, Wk_10 10/11/14, Wk_11 17/11/14, Wk_12 24/11/14
December: Wk_13 1/12/14, Wk_14 8/12/14, Wk_15 15/12/14, Wk_16 22/12/14, Wk_17 29/12/14
January: Wk_18 5/1/15, Wk_19 12/1/15, Wk_20 19/1/15, Wk_21 26/1/15

Key: Thesis writing, Supporting Processes, Key Landmarks

Task
Revised Proposal (28/09/14)
Statistical research (ongoing)
Technology & Tools research
Data Cleaning & Preparation
Normalisation & ERD
Requirements Specification
SQL/R Set up and programming
System Testing
Introduction
Literature Review
Methodology
Data Analysis (pre-testing)
Data Analysis & Programming
Discussion
Graphing and Visualisation
Refinement
Conclusion
Final Checking
Printing and Binding (x3 copies)
Submission (06/01/15)
Management Reports
Write Presentation
Practice Presentation
Presentations


Appendix E Map showing Principal Regions of Germany




Note: The regions are a starting point for further study; it is accepted that these regional
boundaries do not necessarily correspond to regional weather patterns.
Image Source: 24point0,
http://www.24point0.com/ppt-shop/media/catalog/product/r/e/regions-map-of-germany-ppt-slides.jpg




Appendix F Project Proposal Revisions
2/ Background
An extra season of play has been added and an additional weather factor (cloud cover) has
been included, both of which increase the raw data size. (Minor change)

4/ Research Question
A few extra sub research questions have been added. The predictions section now notes that
predictions depend on weather forecasting, which is realistically limited to a few days ahead.

6/ Special Resources
This area has been updated to better reflect the technology actually being used, and the
specific purpose of each, based on time spent investigating each technology and undertaking
small scale tests.

7/ Project Plan
Updated to reflect known dates and revised to better break down sub components.

Appendix B
Cloud cover information added (without example picture) to note inclusion of this weather
data set in the project.

Appendix D
Project plan updated to reflect additional information such as key dates as outlined in section
seven.

Overall changes are considered minor with changes not exceeding 2-3% of the originally
submitted proposal.


References

Anderson, C., and Sally, D. (2013) The Numbers Game: Why everything you know about
football is wrong. Penguin Books.

BBC (2014) Football Betting The global industry worth Billion. [Online]. BBC. Available
at: http://www.bbc.com/sport/0/football/24354124 [Accessed 29th May 2014]

Encyclopaedia Britannica (2014) Germany - Climate [Online]. Encyclopaedia Britannica.
Available from:
http://www.britannica.com/EBchecked/topic/231186/Germany/57996/Climate [Accessed
28th May 2014].

European Commission (2012) Study on the Contribution of Sport to Economic Growth and
Employment in the EU. [Online]. European Commission. Available from:
http://ec.europa.eu/sport/library/studies/study-contribution-spors-economic-growth-final-rpt.pdf
[Accessed 1st June 2014].

Football-Data (2014) Data-Files: Germany [Online]. Football-Data. Available from:
http://football-data.co.uk/germanym.php [Accessed 21st May 2014]

Football Pools (2014) The Pioneers of Football Pools [Online]. Football Pools. Available
from: http://www.footballpools.com/cust?action=GoHelp&help_page=about_us [Accessed
1st June 2014]

Hamilton, H. (2014) Does the cold really kill Goals? Howard Hamilton Blog, 1st May.
Available from:
http://www.soccermetrics.net/league-competitions/temperature-vs-goals-study-premier-league
[Accessed 24th May 2014]

Hong, Y. (ed.) (2014) Routledge Handbook of Ergonomics in Sport and Exercise. New York:
Routledge.



IIBA (2009) A Guide to the Business Analysis Body of Knowledge (BABOK Guide).
International Institute of Business Analysis: Toronto, Canada.


Kasirun, Z.M. (2005) A survey on the requirements elicitation practices among courseware
developers, Malaysian Journal of Computer Science, Vol. 18 No. 1, June 2005, pp. 70-77.

Lewis, T. (2014) How computer analysts took over at Britain's top football clubs [Online].
The Guardian, 9th March. Available from:
http://www.theguardian.com/football/2014/mar/09/premier-league-football-clubs-computer-analysts-managers-data-winning
[Accessed 28th May 2014].

McGarry, T., O'Donoghue, P., and Sampaio, J. (eds.) (2013) Routledge Handbook of Sports
Performance Analysis. New York: Routledge.

Pezzoli, A., Cristoforu, E., Moncalero, M., Giacometto, F., and Boscolo, A. (2013)
Climatological Analysis, Weather Forecast and Sport Performance: Which are the
Connections? Journal Climatol Weather Forecasting 1: e105

PKR. (2014) Under / Over Betting [Online]. PKR. Available from:
http://bet.pkr.com/en/get-started/bet-types/under-over/ [Accessed 28th May 2014].

Reilly, T., and Williams, A.M. (eds.) (2003) Science and Soccer. 2nd Edition. London: Routledge.

Thornes, J. E. (1977) The Effect of Weather on Sport. Weather, 32: 258-268.

Weather Online (2014) Climate Germany [Online]. Weather Online. Available from:
http://www.weatheronline.co.uk/reports/climate/Germany.htm [Accessed 28th May 2014].

Wiart, N., Kelley, J., James, D., and Allen, T. (2011) Proceedings of the Institution of
Mechanical Engineers, Part P: Journal of Sports Engineering and Technology 2011 225: 189
