TopStory

COMS6998: Cloud Computing & Big Data

Fall 2015

Aaron Zakem
Khetthai Laksanakorn
Dhruv Kuchhal
Venciya George

Team
Aaron Zakem
MSc. in Computer Science
Machine Learning Track

Khetthai Laksanakorn
B.Sc. in Computer Engineering

Dhruv Kuchhal

Venciya George

MSc. in Electrical Engineering

MSc. in Computer Science

Machine Learning Track

Project Summary
Collect current news stories and extract subjects.
Monitor Tweet stream and store tweets and hashtags.
Determine trending news subjects and tweet subjects by taking the 5 most frequently
occurring subjects and hashtags.
Display trending subjects and example stories and tweets on web page.

News Stories
Current news stories are collected from the New York Times, the Guardian, and a
variety of blogs and news sources provided by the Alchemy DataNews API.
Subjects are extracted from the New York Times, Guardian, and Alchemy DataNews
articles using the Alchemy Concept Tagging API. For each article, the 3 concepts
with the highest relevance scores are stored as the subjects of the article.
The top five most frequently occurring subjects are stored as the trending subjects.
A separate subject count is performed for only the New York Times and Guardian
articles, and the trending subjects are stored for this subset of articles alone.
The article aggregator is deployed on Elastic Beanstalk and refreshes the article
collection and top subjects every 4 hours.
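
The subject selection and trending count described above can be sketched as follows. This is a minimal illustration in Python rather than the project's Node.js code, and it makes no calls to the Alchemy APIs. The article and concept dictionaries (keys `source`, `concepts`, `text`, `relevance`) and the function names are assumptions made for the example, not the project's actual data model.

```python
from collections import Counter

def article_subjects(concepts, per_article=3):
    """Return the names of the concepts with the highest relevance scores.

    `concepts` is a list of dicts with hypothetical keys "text" and "relevance".
    """
    ranked = sorted(concepts, key=lambda c: c["relevance"], reverse=True)
    return [c["text"] for c in ranked[:per_article]]

def trending_subjects(articles, top_n=5, sources=None):
    """Return the `top_n` most frequently occurring subjects.

    If `sources` is given, only articles from those sources are counted,
    which mirrors the separate count for a subset of publishers.
    """
    counts = Counter()
    for article in articles:
        if sources is not None and article["source"] not in sources:
            continue
        counts.update(article_subjects(article["concepts"]))
    return [subject for subject, _ in counts.most_common(top_n)]
```

Calling `trending_subjects(articles)` gives the overall trending subjects, while `trending_subjects(articles, sources={"nyt", "guardian"})` gives the count restricted to the two newspapers (the source labels are placeholders).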

Tweets
The Tweet stream is monitored, and tweet text, hashtags, and other identifying information
are stored in two databases.
One database stores unfiltered tweets sampled from the tweet stream; the other database
stores tweets filtered by a string with the keywords news, report, world, politics, economy,
business, sports, international.
Up to 20,000 of the most recent tweets from the past 24 hours are pulled from each database
when the webpage is loaded. The hashtags are taken as subjects. The top subjects in
both the filtered tweet stream and unfiltered stream are determined by frequency of
occurrence.
Many tweets in the unfiltered stream relate to celebrity news and various celebrity-related
contests (e.g., MTV Stars). The goal of the database for the filtered stream is to see
whether more significant subjects occur at a higher rate.
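
The keyword filter and the hashtag frequency count can be illustrated with a short Python sketch. It is only an approximation of the approach described above: the real filtering is done on the Twitter stream, whereas here a simple case-insensitive substring match is assumed, hashtags are pulled out of the text with a regular expression, and they are lower-cased before counting. All function names are hypothetical.

```python
import re
from collections import Counter

# Keywords listed for the filtered stream.
KEYWORDS = ("news", "report", "world", "politics", "economy",
            "business", "sports", "international")

def matches_filter(text, keywords=KEYWORDS):
    """Return True if the tweet text contains any keyword (assumed substring match)."""
    lowered = text.lower()
    return any(keyword in lowered for keyword in keywords)

def extract_hashtags(text):
    """Return the hashtags in a tweet, lower-cased and without the leading '#'."""
    return [tag.lower() for tag in re.findall(r"#(\w+)", text)]

def top_hashtags(tweets, top_n=5):
    """Return the `top_n` most frequently occurring hashtags across the tweets."""
    counts = Counter()
    for text in tweets:
        counts.update(extract_hashtags(text))
    return [tag for tag, _ in counts.most_common(top_n)]
```

Running `top_hashtags` once on all tweets and once on `[t for t in tweets if matches_filter(t)]` gives the two rankings that are compared between the unfiltered and filtered streams.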

Architecture

Architecture and APIs

The story aggregator is a Node.js application deployed on Elastic Beanstalk. It makes use of the
NYTimes, Guardian, Alchemy DataNews, Alchemy Concept Tagging, and MySQL APIs. Data for
the aggregated articles is stored in a MySQL RDS instance on AWS.
The tweet aggregator consists of two Node.js applications that make use of the Twitter and
MySQL APIs. Data for the aggregated tweets is stored in two separate MySQL RDS instances
on AWS.
The story server, which functions as the back-end server, is a Node.js application deployed on
Elastic Beanstalk and makes use of the MySQL and Express APIs. The front-end server is also
a Node.js application deployed on Elastic Beanstalk that makes use of the MySQL and Express
APIs.
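
As a rough illustration of how a server might read recent tweets from one of the tweet databases, the sketch below stores tweets and retrieves up to 20,000 from the past 24 hours. It uses Python's built-in `sqlite3` as a stand-in for the MySQL RDS instances, and the table layout (`tweets` with `text` and `created_at` columns) and function names are assumptions for the example, not the project's schema.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

def create_store():
    """Create an in-memory database with a hypothetical tweets table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tweets ("
        "id INTEGER PRIMARY KEY, text TEXT NOT NULL, created_at TEXT NOT NULL)"
    )
    return conn

def save_tweet(conn, text, created_at):
    """Store one tweet; timestamps are kept as ISO 8601 strings in UTC."""
    conn.execute(
        "INSERT INTO tweets (text, created_at) VALUES (?, ?)",
        (text, created_at.isoformat()),
    )

def recent_tweets(conn, now, limit=20000, window=timedelta(hours=24)):
    """Return up to `limit` tweet texts from the last `window`, newest first."""
    cutoff = (now - window).isoformat()
    rows = conn.execute(
        "SELECT text FROM tweets WHERE created_at >= ? "
        "ORDER BY created_at DESC LIMIT ?",
        (cutoff, limit),
    ).fetchall()
    return [text for (text,) in rows]
```

The same `WHERE ... ORDER BY ... LIMIT` shape carries over to MySQL, although a production table would more likely use a native timestamp column with an index on it.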

Results
