
COMP529/336: COURSEWORK ASSIGNMENT #2

(STREAM ANALYTICS)

Dr. Bakhtiar Amen

Coursework Date: 25/11/2019

Due Date: 16/12/2019

COMP529 1
INTRODUCTION

This assessed coursework assignment is worth 20% of your overall grade for the COMP529/COMP336 module.
Failure on this assignment can be compensated through higher marks in other assessments on the module. The
assignment aims to test your understanding of streaming analytics, with a focus on your ability to use Storm to
solve Big Data Analytic problems. More specifically, it aims to partially assess the following learning outcome for
COMP529/336: “understanding of the middleware that can be used to enable algorithms to scale up to
analysis of large data streams in real-time”.

ASSESSMENT

The report will be assessed according to the following criteria:

Criterion                                                                      Percentage
Clarity of presentation (including succinctness) of main report                20%
Quality of Java code (including assessment of how easy it is to understand)    40%
Quality of analysis performed                                                  40%

SUBMISSION

Please submit your coursework online using the COMP529/COMP336 page on VITAL by 3pm on Monday 16th
December. Standard lateness penalties will apply to any work handed in after this time. The report and Java
program must be written by yourself, using your own words (see the University guidance on academic integrity
for additional information).

TASK

The UK general election is planned to be held on 12th December 2019. For this election, five parties are
running their own campaigns to win the most parliamentary seats. For every parliamentary constituency of the
United Kingdom, one Member of Parliament (MP) will be elected to join the House of Commons from one of:
Labour, Conservative, Liberal Democrats, Green or Brexit Party.

To predict the outcome of the UK’s 2019 general election, you have been asked to monitor Twitter through one of
the hashtags (e.g., #generalelection or #GE2019). This will allow you to identify the key supporters, and to
predict which party is most likely to win a majority in the election on 12th December. The code for a spout
that extracts a “streaming” feed from Twitter is here:
https://github.com/davidkiss/storm-twitter-word-count

Your task is therefore as follows:

1) Set up a Storm cluster;
2) Write a Java program for a Storm topology job that includes a:
a. Spout that produces a stream of tweets;
b. Bolt that identifies tweets that contain some keywords related to each party (e.g., green
party, conservative, labour, brexit);
c. Bolt that collects information about the likely outcome of the general election.
3) Use the Storm topology to predict who will win the UK general election on 12th December.
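
As an illustration only, the logic behind the keyword-matching bolt and the bolt that collects outcome information might be sketched in plain Java as below. This is a minimal sketch under stated assumptions, not a model answer: it uses no Storm classes, the class name PartyMentionTally and its methods are hypothetical, the Liberal Democrat keywords are our own guess (the brief gives only a few example keywords), and counting mentions is a crude proxy for voting intention. In a real topology, the matching step and the counting step would each sit inside a bolt's execute method.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper class: not part of Storm or of the linked spout repository.
class PartyMentionTally {

    // Assumed keyword lists. The brief only gives "green party, conservative,
    // labour, brexit" as examples; the Liberal Democrat terms are a guess.
    private static final Map<String, List<String>> KEYWORDS = new LinkedHashMap<>();
    static {
        KEYWORDS.put("Labour", Arrays.asList("labour"));
        KEYWORDS.put("Conservative", Arrays.asList("conservative"));
        KEYWORDS.put("Liberal Democrats", Arrays.asList("liberal democrat", "lib dem"));
        KEYWORDS.put("Green", Arrays.asList("green party"));
        KEYWORDS.put("Brexit", Arrays.asList("brexit"));
    }

    // Running number of tweets that mentioned each party.
    private final Map<String, Integer> counts = new LinkedHashMap<>();

    // Matching step: returns every party whose keywords occur in the tweet.
    // Plain substring matching, so "labour" also matches "labourer".
    static List<String> partiesMentioned(String tweetText) {
        String text = tweetText.toLowerCase(Locale.ROOT);
        List<String> parties = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : KEYWORDS.entrySet()) {
            for (String keyword : entry.getValue()) {
                if (text.contains(keyword)) {
                    parties.add(entry.getKey());
                    break; // count each party at most once per tweet
                }
            }
        }
        return parties;
    }

    // Counting step: adds one mention for each party found in the tweet.
    void record(String tweetText) {
        for (String party : partiesMentioned(tweetText)) {
            counts.merge(party, 1, Integer::sum);
        }
    }

    int count(String party) {
        return counts.getOrDefault(party, 0);
    }

    // Party with the most mentions so far; empty if nothing has been recorded.
    // Ties go to whichever party was recorded first.
    Optional<String> leader() {
        String best = null;
        int bestCount = 0;
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            if (entry.getValue() > bestCount) {
                best = entry.getKey();
                bestCount = entry.getValue();
            }
        }
        return Optional.ofNullable(best);
    }

    public static void main(String[] args) {
        PartyMentionTally tally = new PartyMentionTally();
        // Made-up sample tweets, for demonstration only.
        tally.record("Voting Labour tomorrow #GE2019");
        tally.record("Conservative and Labour leaders clash in debate");
        tally.record("The Green Party manifesto is out");
        System.out.println(tally.counts);
        System.out.println(tally.leader());
    }
}
```

Keeping the matching and counting logic in a plain class like this makes it easy to test without a running cluster; how the steps are split across bolts, and how mentions are turned into a prediction, remain design decisions for your own topology.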

Your output report

The output from this coursework is a brief report. It is suggested that the report has sections that describe:

1) Middleware configuration: How you configured the Storm middleware (including a description of your
Storm cluster and your rationale for this choice).
2) Data Analytic Design: How you designed the Storm topology (including your rationale for your design).
3) Results: The results obtained (excluding any discussion).
4) Discussion of Results.
5) Conclusions and Recommendations (including discussion of how you would perform the task if it were
to be undertaken at much larger scale).

Format of your report

1) The output from this coursework is a brief report of no more than two A4 pages (see note 1), excluding
any appendices; use 12-point, justified text and submit in PDF or DOCX format only.
2) Make sure to save your file under your surname + module code (e.g., Abcd_COMP336).
3) You should include a listing of the Java program for your Storm topology in an appendix (no longer
than 1 page).

Note 1: While the requirement is to produce no more than 2 pages, it is anticipated that the challenge will be to fit
everything into those 2 pages: it is unlikely that a report of much less than 2 pages will result in a high mark.

