Mastering Social Media Mining with Python

Ebook592 pages4 hours

Mastering Social Media Mining with Python

Name: Mastering Social Media Mining with Python
Brand: Packt Publishing
Rating: 5.0 (1 reviews)

By Marco Bonzanini

Rating: 5 out of 5 stars

5/5

()

Read preview

About this ebook

About This Book

Make sense of highly unstructured social media data with the help of the insightful use cases provided in this guide
Use this easy-to-follow, step-by-step guide to apply analytics to complicated and messy social data
This is your one-stop solution to fetching, storing, analyzing, and visualizing social media data

Who This Book Is For

This book is for Python developers who want to engage with the use of public APIs to collect data from social media platforms and perform statistical analysis in order to produce useful insights from data. The book assumes just a basic understanding of the Python Standard Library and provides practical examples to guide you toward the creation of your data analysis project based on social data.

Skip carousel

LanguageEnglish

PublisherPackt Publishing

Release dateJul 29, 2016

ISBN9781783552023

Author

Marco Bonzanini

Related authors

Skip carousel

Related to Mastering Social Media Mining with Python

Related ebooks

Skip carousel

Learning Data Mining with Python - Second Edition
Ebook
Learning Data Mining with Python - Second Edition
byRobert Layton
Rating: 0 out of 5 stars
0 ratings
Learning Data Mining with Python
Ebook
Learning Data Mining with Python
byRobert Layton
Rating: 0 out of 5 stars
0 ratings
Python Web Scraping - Second Edition
Ebook
Python Web Scraping - Second Edition
byKatharine Jarmul
Rating: 5 out of 5 stars
5/5
Mastering Python for Data Science
Ebook
Mastering Python for Data Science
bySamir Madhavan
Rating: 3 out of 5 stars
3/5
Python Data Science Essentials
Ebook
Python Data Science Essentials
byBoschetti Alberto
Rating: 0 out of 5 stars
0 ratings
Python Data Analysis - Second Edition
Ebook
Python Data Analysis - Second Edition
byArmando Fandango
Rating: 0 out of 5 stars
0 ratings
Getting Started with Beautiful Soup
Ebook
Getting Started with Beautiful Soup
byVineeth G. Nair
Rating: 3 out of 5 stars
3/5
Learning Social Media Analytics with R
Ebook
Learning Social Media Analytics with R
byDipanjan Sarkar
Rating: 0 out of 5 stars
0 ratings
Hands-On Web Scraping with Python: Perform advanced scraping operations using various Python libraries and tools such as Selenium, Regex, and others
Ebook
Hands-On Web Scraping with Python: Perform advanced scraping operations using various Python libraries and tools such as Selenium, Regex, and others
byAnish Chapagain
Rating: 0 out of 5 stars
0 ratings
Interactive Applications Using Matplotlib
Ebook
Interactive Applications Using Matplotlib
byBenjamin V. Root
Rating: 0 out of 5 stars
0 ratings
Hands-On Data Analysis with Pandas: Efficiently perform data collection, wrangling, analysis, and visualization using Python
Ebook
Hands-On Data Analysis with Pandas: Efficiently perform data collection, wrangling, analysis, and visualization using Python
byStefanie Molin
Rating: 0 out of 5 stars
0 ratings
Getting Started with Python Data Analysis
Ebook
Getting Started with Python Data Analysis
byVo.T.H Phuong
Rating: 0 out of 5 stars
0 ratings
Web Scraping with Python
Ebook
Web Scraping with Python
byRichard Lawson
Rating: 4 out of 5 stars
4/5
Python Data Analysis
Ebook
Python Data Analysis
byIvan Idris
Rating: 4 out of 5 stars
4/5
Python for Secret Agents
Ebook
Python for Secret Agents
bySteven F. Lott
Rating: 0 out of 5 stars
0 ratings
Python Data Science Essentials - Second Edition
Ebook
Python Data Science Essentials - Second Edition
byBoschetti Alberto
Rating: 4 out of 5 stars
4/5
Advanced Machine Learning with Python
Ebook
Advanced Machine Learning with Python
byJohn Hearty
Rating: 0 out of 5 stars
0 ratings
Hands-On Data Science for Marketing: Improve your marketing strategies with machine learning using Python and R
Ebook
Hands-On Data Science for Marketing: Improve your marketing strategies with machine learning using Python and R
byYoon Hyup Hwang
Rating: 5 out of 5 stars
5/5
Learning pandas - Second Edition
Ebook
Learning pandas - Second Edition
byHeydt Michael
Rating: 4 out of 5 stars
4/5
Mastering Python Data Analysis
Ebook
Mastering Python Data Analysis
byMagnus Vilhelm Persson
Rating: 0 out of 5 stars
0 ratings
Parallel Programming with Python
Ebook
Parallel Programming with Python
byJan Palach
Rating: 0 out of 5 stars
0 ratings
Designing Machine Learning Systems with Python
Ebook
Designing Machine Learning Systems with Python
byDavid Julian
Rating: 0 out of 5 stars
0 ratings
Regression Analysis with Python
Ebook
Regression Analysis with Python
byBoschetti Alberto
Rating: 0 out of 5 stars
0 ratings
Practical Data Science Cookbook - Second Edition
Ebook
Practical Data Science Cookbook - Second Edition
byTony Ojeda
Rating: 0 out of 5 stars
0 ratings
Practical Data Analysis
Ebook
Practical Data Analysis
byHector Cuesta
Rating: 4 out of 5 stars
4/5
Data Analysis with Python: Introducing NumPy, Pandas, Matplotlib, and Essential Elements of Python Programming (English Edition)
Ebook
Data Analysis with Python: Introducing NumPy, Pandas, Matplotlib, and Essential Elements of Python Programming (English Edition)
byRituraj Dixit
Rating: 0 out of 5 stars
0 ratings
Python Deep Learning
Ebook
Python Deep Learning
byValentino Zocca
Rating: 5 out of 5 stars
5/5
Artificial Intelligence with Python - Second Edition: Your complete guide to building intelligent apps using Python 3.x, 2nd Edition
Ebook
Artificial Intelligence with Python - Second Edition: Your complete guide to building intelligent apps using Python 3.x, 2nd Edition
byAlberto Artasanchez
Rating: 0 out of 5 stars
0 ratings
Mastering Data Mining with Python – Find patterns hidden in your data
Ebook
Mastering Data Mining with Python – Find patterns hidden in your data
byMegan Squire
Rating: 0 out of 5 stars
0 ratings
Python Unlocked
Ebook
Python Unlocked
byTigeraniya Arun
Rating: 0 out of 5 stars
0 ratings

Computers For You

Skip carousel

Slenderman: Online Obsession, Mental Illness, and the Violent Crime of Two Midwestern Girls
Ebook
Slenderman: Online Obsession, Mental Illness, and the Violent Crime of Two Midwestern Girls
byKathleen Hale
Rating: 4 out of 5 stars
4/5
Elon Musk
Ebook
Elon Musk
byWalter Isaacson
Rating: 4 out of 5 stars
4/5
SQL QuickStart Guide: The Simplified Beginner's Guide to Managing, Analyzing, and Manipulating Data With SQL
Ebook
SQL QuickStart Guide: The Simplified Beginner's Guide to Managing, Analyzing, and Manipulating Data With SQL
byWalter Shields
Rating: 4 out of 5 stars
4/5
The Hacker Crackdown: Law and Disorder on the Electronic Frontier
Ebook
The Hacker Crackdown: Law and Disorder on the Electronic Frontier
byBruce Sterling
Rating: 4 out of 5 stars
4/5
101 Awesome Builds: Minecraft® Secrets from the World's Greatest Crafters
Ebook
101 Awesome Builds: Minecraft® Secrets from the World's Greatest Crafters
byTriumph Books
Rating: 4 out of 5 stars
4/5
Excel Essentials: A Step-by-Step Guide with Pictures for Absolute Beginners to Master the Basics and Start Using Excel with Confidence
Ebook
Excel Essentials: A Step-by-Step Guide with Pictures for Absolute Beginners to Master the Basics and Start Using Excel with Confidence
byNigel Tillery
Rating: 0 out of 5 stars
0 ratings
Data Science from Scratch: The #1 Data Science Guide for Everything A Data Scientist Needs to Know: Python, Linear Algebra, Statistics, Coding, Applications, Neural Networks, and Decision Trees
Ebook
Data Science from Scratch: The #1 Data Science Guide for Everything A Data Scientist Needs to Know: Python, Linear Algebra, Statistics, Coding, Applications, Neural Networks, and Decision Trees
bySteven Cooper
Rating: 4 out of 5 stars
4/5
Standard Deviations: Flawed Assumptions, Tortured Data, and Other Ways to Lie with Statistics
Ebook
Standard Deviations: Flawed Assumptions, Tortured Data, and Other Ways to Lie with Statistics
byGary Smith
Rating: 4 out of 5 stars
4/5
The Invisible Rainbow: A History of Electricity and Life
Ebook
The Invisible Rainbow: A History of Electricity and Life
byArthur Firstenberg
Rating: 4 out of 5 stars
4/5
The Simulation Hypothesis: An MIT Computer Scientist Shows Why AI, Quantum Physics and Eastern Mystics All Agree We Are In a Video Game
Ebook
The Simulation Hypothesis: An MIT Computer Scientist Shows Why AI, Quantum Physics and Eastern Mystics All Agree We Are In a Video Game
byRizwan Virk
Rating: 5 out of 5 stars
5/5
Mastering ChatGPT: 21 Prompts Templates for Effortless Writing
Ebook
Mastering ChatGPT: 21 Prompts Templates for Effortless Writing
byCea West
Rating: 5 out of 5 stars
5/5
Everybody Lies: Big Data, New Data, and What the Internet Can Tell Us About Who We Really Are
Ebook
Everybody Lies: Big Data, New Data, and What the Internet Can Tell Us About Who We Really Are
bySeth Stephens-Davidowitz
Rating: 4 out of 5 stars
4/5
Machine Learning for Beginners: An Introduction for Beginners, Why Machine Learning Matters Today and How Machine Learning Networks, Algorithms, Concepts and Neural Networks Really Work
Ebook
Machine Learning for Beginners: An Introduction for Beginners, Why Machine Learning Matters Today and How Machine Learning Networks, Algorithms, Concepts and Neural Networks Really Work
bySteven Cooper
Rating: 4 out of 5 stars
4/5
CompTIA IT Fundamentals (ITF+) Study Guide: Exam FC0-U61
Ebook
CompTIA IT Fundamentals (ITF+) Study Guide: Exam FC0-U61
byQuentin Docter
Rating: 0 out of 5 stars
0 ratings
Python for Beginners. A Smarter Way to Learn Python in 5 Days and Remember it Longer. With Easy Step by Step Guidance and Hands on Examples. (Python Crash Course-Programming for Beginners)
Ebook
Python for Beginners. A Smarter Way to Learn Python in 5 Days and Remember it Longer. With Easy Step by Step Guidance and Hands on Examples. (Python Crash Course-Programming for Beginners)
byArthur T. Brooks
Rating: 0 out of 5 stars
0 ratings
The ChatGPT Millionaire Handbook: Make Money Online With the Power of AI Technology
Ebook
The ChatGPT Millionaire Handbook: Make Money Online With the Power of AI Technology
byTJ Books
Rating: 0 out of 5 stars
0 ratings
Procreate for Beginners: Introduction to Procreate for Drawing and Illustrating on the iPad
Ebook
Procreate for Beginners: Introduction to Procreate for Drawing and Illustrating on the iPad
byAaron Smith
Rating: 0 out of 5 stars
0 ratings
Alan Turing: The Enigma: The Book That Inspired the Film The Imitation Game - Updated Edition
Ebook
Alan Turing: The Enigma: The Book That Inspired the Film The Imitation Game - Updated Edition
byAndrew Hodges
Rating: 4 out of 5 stars
4/5
CompTIA Security+ Practice Questions
Ebook
CompTIA Security+ Practice Questions
byIP Specialist
Rating: 2 out of 5 stars
2/5
Childhood Unplugged: Practical Advice to Get Kids Off Screens and Find Balance
Ebook
Childhood Unplugged: Practical Advice to Get Kids Off Screens and Find Balance
byKatherine Johnson Martinko
Rating: 0 out of 5 stars
0 ratings
Grokking Algorithms: An illustrated guide for programmers and other curious people
Ebook
Grokking Algorithms: An illustrated guide for programmers and other curious people
byAditya Bhargava
Rating: 4 out of 5 stars
4/5
Creating Online Courses with ChatGPT | A Step-by-Step Guide with Prompt Templates
Ebook
Creating Online Courses with ChatGPT | A Step-by-Step Guide with Prompt Templates
byCea West
Rating: 4 out of 5 stars
4/5
Dark Aeon: Transhumanism and the War Against Humanity
Ebook
Dark Aeon: Transhumanism and the War Against Humanity
byJoe Allen
Rating: 5 out of 5 stars
5/5
How to Write a Book: An 11-Step Process to Build Habits, Stop Procrastinating, Fuel Self-Motivation, Quiet Your Inner Critic, Bust Through Writer's Block, & Let Your Creative Juices Flow (Short Read)
Ebook
How to Write a Book: An 11-Step Process to Build Habits, Stop Procrastinating, Fuel Self-Motivation, Quiet Your Inner Critic, Bust Through Writer's Block, & Let Your Creative Juices Flow (Short Read)
byDavid Kadavy
Rating: 5 out of 5 stars
5/5
AP Computer Science Principles Premium, 2024: 6 Practice Tests + Comprehensive Review + Online Practice
Ebook
AP Computer Science Principles Premium, 2024: 6 Practice Tests + Comprehensive Review + Online Practice
bySeth Reichelson
Rating: 0 out of 5 stars
0 ratings
ChatGPT Ultimate User Guide - How to Make Money Online Faster and More Precise Using AI Technology
Ebook
ChatGPT Ultimate User Guide - How to Make Money Online Faster and More Precise Using AI Technology
byMaximus Wilson
Rating: 0 out of 5 stars
0 ratings
The Professional Voiceover Handbook: Voiceover training, #1
Ebook
The Professional Voiceover Handbook: Voiceover training, #1
byPeter Baker
Rating: 5 out of 5 stars
5/5
Going Text: Mastering the Command Line
Ebook
Going Text: Mastering the Command Line
byBrian Schell
Rating: 4 out of 5 stars
4/5
People Skills for Analytical Thinkers
Ebook
People Skills for Analytical Thinkers
byGilbert Eijkelenboom
Rating: 5 out of 5 stars
5/5
How to Create Cpn Numbers the Right way: A Step by Step Guide to Creating cpn Numbers Legally
Ebook
How to Create Cpn Numbers the Right way: A Step by Step Guide to Creating cpn Numbers Legally
byAlex Parkinson
Rating: 4 out of 5 stars
4/5

Related podcast episodes

Skip carousel

Harnessing Python for Research: Scientific Applications of Python with Michael Kennedy: Still scrabbling with Excel? Consider Python language uses, says programmer and podcaster Michael Kennedy. A general programming language that is easy to use in multiple environments, Python programming is limitless and has numerous open source...
Podcast episode
Harnessing Python for Research: Scientific Applications of Python with Michael Kennedy: Still scrabbling with Excel? Consider Python language uses, says programmer and podcaster Michael Kennedy. A general programming language that is easy to use in multiple environments, Python programming is limitless and has numerous open source...
byFinding Genius Podcast
0 ratings
0% found this document useful
Unraveling Python's Syntax to Its Core With Brett Cannon
Podcast episode
Unraveling Python's Syntax to Its Core With Brett Cannon
byThe Real Python Podcast
100%
100% found this document useful
Episode 19 (Python for Data Science - Python Files - Scripts and Modules)
Podcast episode
Episode 19 (Python for Data Science - Python Files - Scripts and Modules)
byHow to Data (Joshiverse- Journey of a Budding Data Scientist)
0 ratings
0% found this document useful
Getting Technical about the Data Center Revolution with Jonathan Friedmann, CEO of Speedata
Podcast episode
Getting Technical about the Data Center Revolution with Jonathan Friedmann, CEO of Speedata
byMaking Data Simple
0 ratings
0% found this document useful
MLA 018 Descript: (Optional episode) just showcasing a cool application using machine learning Dept uses Descript for some of their podcasting. I'm using it like a maniac, I think they're surprised at how into it I am. Check out the transcript & see how it...
Podcast episode
MLA 018 Descript: (Optional episode) just showcasing a cool application using machine learning Dept uses Descript for some of their podcasting. I'm using it like a maniac, I think they're surprised at how into it I am. Check out the transcript & see how it...
byMachine Learning Guide
0 ratings
0% found this document useful
Learning Python Through Errors
Podcast episode
Learning Python Through Errors
byThe Real Python Podcast
0 ratings
0% found this document useful
Anaconda + Pyston and more: with Peter Wang, CEO of Anaconda
Podcast episode
Anaconda + Pyston and more: with Peter Wang, CEO of Anaconda
byPractical AI: Machine Learning, Data Science
0 ratings
0% found this document useful
Improving the Learning Experience on Real Python
Podcast episode
Improving the Learning Experience on Real Python
byThe Real Python Podcast
0 ratings
0% found this document useful
Build Better Machine Learning Models With Confidence By Adding Validation With Deepchecks: A cross-over episode from The Machine Learning Podcast with the team from Deepchecks, exploring the challenges of testing and validating machine learning applications and their work to make it easier.
Podcast episode
Build Better Machine Learning Models With Confidence By Adding Validation With Deepchecks: A cross-over episode from The Machine Learning Podcast with the team from Deepchecks, exploring the challenges of testing and validating machine learning applications and their work to make it easier.
byThe Python Podcast.__init__
0 ratings
0% found this document useful
Advantages of Completing Small Python Projects
Podcast episode
Advantages of Completing Small Python Projects
byThe Real Python Podcast
0 ratings
0% found this document useful
Building a Platform Game With Arcade and Covering Python News Monthly
Podcast episode
Building a Platform Game With Arcade and Covering Python News Monthly
byThe Real Python Podcast
0 ratings
0% found this document useful
Measuring Your Python Learning Progress
Podcast episode
Measuring Your Python Learning Progress
byThe Real Python Podcast
100%
100% found this document useful
#059 - 10 Python clean code tips drawn from code reviews
Podcast episode
#059 - 10 Python clean code tips drawn from code reviews
byPybites Podcast
0 ratings
0% found this document useful
#51 Francois Chollet - Intelligence and Generalisation
Podcast episode
#51 Francois Chollet - Intelligence and Generalisation
byMachine Learning Street Talk (MLST)
0 ratings
0% found this document useful
Data Visualization with Manuel Lima: Gabi Ferrara and Jon Foust are back today and joined by fellow Googler Manuel Lima.
Podcast episode
Data Visualization with Manuel Lima: Gabi Ferrara and Jon Foust are back today and joined by fellow Googler Manuel Lima.
byGoogle Cloud Platform Podcast
0 ratings
0% found this document useful
[DataFramed Careers Series #2] What Makes a Great Data Science Portfolio
Podcast episode
[DataFramed Careers Series #2] What Makes a Great Data Science Portfolio
byDataFramed
0 ratings
0% found this document useful
040: Graph Databases: Traditional relational databases like MySQL or Postgres are really good at providing many solutions to the problem of persisting state. But these types of database are really horrible at querying highly connected models in an efficient way. Graph datab...
Podcast episode
040: Graph Databases: Traditional relational databases like MySQL or Postgres are really good at providing many solutions to the problem of persisting state. But these types of database are really horrible at querying highly connected models in an efficient way. Graph datab...
byPHPRoundtable Podcast
0 ratings
0% found this document useful
One Shot and Metric Learning - Quadruplet Loss (Machine Learning Dojo)
Podcast episode
One Shot and Metric Learning - Quadruplet Loss (Machine Learning Dojo)
byMachine Learning Street Talk (MLST)
0 ratings
0% found this document useful
Exploring deep reinforcement learning: with Thomas Simonini of Hugging Face
Podcast episode
Exploring deep reinforcement learning: with Thomas Simonini of Hugging Face
byPractical AI: Machine Learning, Data Science
0 ratings
0% found this document useful
An Exploration Of Financial Exchange Risk Management Strategies: An interview with Paul Stafford about his experiences using computational methods to build risk management strategies for financial exchange markets.
Podcast episode
An Exploration Of Financial Exchange Risk Management Strategies: An interview with Paul Stafford about his experiences using computational methods to build risk management strategies for financial exchange markets.
byThe Python Podcast.__init__
0 ratings
0% found this document useful
#35 Data Science in Finance
Podcast episode
#35 Data Science in Finance
byDataFramed
0 ratings
0% found this document useful
78: Mindset of a Rockstar Data Analyst w/ Trevor Tapscott: Our focus for this inspiring episode of AOF is mindset, especially if you want to be a standout data analyst! I have brought one of my first ever followers and day ones! Trevor Tapscott is a VP and Analytics Consultant at Wells Fargo and has been in...
Podcast episode
78: Mindset of a Rockstar Data Analyst w/ Trevor Tapscott: Our focus for this inspiring episode of AOF is mindset, especially if you want to be a standout data analyst! I have brought one of my first ever followers and day ones! Trevor Tapscott is a VP and Analytics Consultant at Wells Fargo and has been in...
byAnalytics on Fire
0 ratings
0% found this document useful
38: Top 3 BI Skills for 2020 w/ Jen Underwood: Do you know what skill sets you're going to need in the next three years to stay relevant in the BI (business intelligence) industry? Don’t wait for the BI bubble to pop! When I met Jen Underwood, she was one of Microsoft’s first Power BI product...
Podcast episode
38: Top 3 BI Skills for 2020 w/ Jen Underwood: Do you know what skill sets you're going to need in the next three years to stay relevant in the BI (business intelligence) industry? Don’t wait for the BI bubble to pop! When I met Jen Underwood, she was one of Microsoft’s first Power BI product...
byAnalytics on Fire
0 ratings
0% found this document useful
146: Automation Tools for Web App and API Development and Maintenance - Michael Kennedy: Michael Kennedy joins the show this week to share some of the tools he uses during development and maintenance. We talk about tools used for semi-automated exploratory testing. We also talk about some of the other tools and techniques he uses to keep Talk Python Training, Talk Python, and Python Bytes all up and running smoothly.
Podcast episode
146: Automation Tools for Web App and API Development and Maintenance - Michael Kennedy: Michael Kennedy joins the show this week to share some of the tools he uses during development and maintenance. We talk about tools used for semi-automated exploratory testing. We also talk about some of the other tools and techniques he uses to keep Talk Python Training, Talk Python, and Python Bytes all up and running smoothly.
byTest and Code
0 ratings
0% found this document useful
Graph Analytic Systems with Zachary Hanif - TWiML Talk #188: In this, the final episode of our Strata Data Conference series, we’re joined by Zachary Hanif, Director of Machine Learning at Capital One’s Center for Machine Learning. Zach led a session at Strata called “Network effects: Working with modern...
Podcast episode
Graph Analytic Systems with Zachary Hanif - TWiML Talk #188: In this, the final episode of our Strata Data Conference series, we’re joined by Zachary Hanif, Director of Machine Learning at Capital One’s Center for Machine Learning. Zach led a session at Strata called “Network effects: Working with modern...
byThe TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)
0 ratings
0% found this document useful
Power Up Your Java Using Python With JPype - Episode 286: An interview with Karl Nelson about using the JPype library for bridging the Java and Python ecosystems for scientific computing
Podcast episode
Power Up Your Java Using Python With JPype - Episode 286: An interview with Karl Nelson about using the JPype library for bridging the Java and Python ecosystems for scientific computing
byThe Python Podcast.__init__
0 ratings
0% found this document useful
Going Beyond the Basic Stuff With Python and Al Sweigart
Podcast episode
Going Beyond the Basic Stuff With Python and Al Sweigart
byThe Real Python Podcast
0 ratings
0% found this document useful
The Undocumented Web: scraping, private APIs, proxies and “alternative solutions”: What is the undocumented web? Scott and Wes dive into it, discussing APIs, faking, scraping, automation, proxies as well as tips and tricks for best practices. Kyle Prinsloo’s Freelancing & Beyond — Sponsor Kyle Prinsloo teaches you everything...
Podcast episode
The Undocumented Web: scraping, private APIs, proxies and “alternative solutions”: What is the undocumented web? Scott and Wes dive into it, discussing APIs, faking, scraping, automation, proxies as well as tips and tricks for best practices. Kyle Prinsloo’s Freelancing & Beyond — Sponsor Kyle Prinsloo teaches you everything...
bySyntax - Tasty Web Development Treats
0 ratings
0% found this document useful
Learning Python Through Illustrated Stories
Podcast episode
Learning Python Through Illustrated Stories
byThe Real Python Podcast
0 ratings
0% found this document useful
Computational Thinking & Learning Python During an AI Revolution
Podcast episode
Computational Thinking & Learning Python During an AI Revolution
byThe Real Python Podcast
0 ratings
0% found this document useful

Skip carousel

Scikit-Learn: The Ultimate Python Library
APC
Article
Scikit-Learn: The Ultimate Python Library
Jul 15, 2019
4 min read
DJANGO Create A Database-driven Website
Linux Format
Article
DJANGO Create A Database-driven Website
Jun 4, 2019
The Django web framework was named after the famous guitarist Django Reinhardt and was first created by web developers at a small newspaper in Kansas. The main goals of Django is to enable fast development of complex websites with database needs. It
7 min read
Want A Job In Data Science? You Might Have To Take A Standardized Test When Applying
Chicago Tribune
Article
Want A Job In Data Science? You Might Have To Take A Standardized Test When Applying
Jul 10, 2018
3 min read
Manipulate Data Like A Pro With Pandas
Linux Format
Article
Manipulate Data Like A Pro With Pandas
Jul 27, 2021
7 min read
How Image Recognition Works
APC
Article
How Image Recognition Works
Nov 4, 2019
4 min read
Build A Static Analysis Development Pipeline
Linux Format
Article
Build A Static Analysis Development Pipeline
Jul 27, 2021
9 min read
Top Five AI-ML Books For Business Leaders
Techfastly
Article
Top Five AI-ML Books For Business Leaders
Aug 2, 2021
5 min read
FLASK Web Frameworks
Linux Format
Article
FLASK Web Frameworks
Jun 4, 2019
The main focus of Python has always been to get you cracking on with your coding – the language was never made for web programming. However, this has just made it more interesting to extend the language for the web, or to create an interface to web-b
9 min read
IT For A New World
Business Today
Article
IT For A New World
Jun 10, 2021
6 min read
Machine Learning – With Zero Programming
APC
Article
Machine Learning – With Zero Programming
Aug 12, 2019
6 min read
Lint Hub
Linux Format
Article
Lint Hub
Jul 27, 2021
Pylint – a comprehensive linter that focuses on standards compliance and error detection. It’s likely built into your favourite IDE. Pycodestyle (formerly known as PEP8) – focuses on validating code formatting PEPs, has some overlap with Pylint. Pyfl
1 min read
Common Errors
Linux Format
Article
Common Errors
Aug 27, 2019
If you receive a ‘Script not found’ error, this probably means that you don’t have the mod scripts installed in your Minecraft directory. Check that you’ve replaced .minecraft with the one from McPiFoMo; this should include mcpipy, which will be full
1 min read
Understanding ELT & ETL
Techfastly
Article
Understanding ELT & ETL
Apr 1, 2021
8 min read
Family History In The AI Era
Family Tree UK
Article
Family History In The AI Era
Apr 12, 2024
7 min read
Why We Need To Fear The Risk Of AI Model Collapse
Evening Standard
Article
Why We Need To Fear The Risk Of AI Model Collapse
Dec 17, 2023
4 min read
Note-taking Applications For Family History
Family Tree UK
Article
Note-taking Applications For Family History
Mar 10, 2023
7 min read
Can Twitter Fit Inside the Library of Congress?
The Atlantic
Article
Can Twitter Fit Inside the Library of Congress?
Aug 4, 2016
5 min read
Terminal Velocity
Linux Format
Article
Terminal Velocity
Jun 4, 2019
9 min read
ChatGPT Changed Everything. Now Its Follow-Up Is Here.
The Atlantic
Article
ChatGPT Changed Everything. Now Its Follow-Up Is Here.
Mar 14, 2023
6 min read
Finding Your People
Fast Company
Article
Finding Your People
Mar 19, 2024
1 min read
Does Facebook Even Know How to Control Facebook?
The Atlantic
Article
Does Facebook Even Know How to Control Facebook?
Oct 31, 2017
11 min read
Revealed: The Authors Whose Pirated Books Are Powering Generative AI
The Atlantic
Article
Revealed: The Authors Whose Pirated Books Are Powering Generative AI
Aug 19, 2023
8 min read
The Risks Of The Generative AI Gold Rush
APC
Article
The Risks Of The Generative AI Gold Rush
May 22, 2023
8 min read
Building AI Safely Is Getting Harder and Harder
The Atlantic
Article
Building AI Safely Is Getting Harder and Harder
Dec 22, 2023
4 min read
2 The Use of Python in AI and ML
Techfastly
Article
2 The Use of Python in AI and ML
Nov 30, 2020
3 min read
The Risks Of The Generative AI Gold Rush
PC Pro Magazine
Article
The Risks Of The Generative AI Gold Rush
Apr 6, 2023
8 min read
Inform And Enhance Your Business With Open Data
PC Pro Magazine
Article
Inform And Enhance Your Business With Open Data
Jun 10, 2021
7 min read
This PC Does Not Exist
Maximum PC
Article
This PC Does Not Exist
May 23, 2023
7 min read
AI Search Is a Disaster
The Atlantic
Article
AI Search Is a Disaster
Feb 16, 2023
Microsoft and Google believe chatbots will change search forever. So far, there’s no reason to believe the hype.
4 min read
You Won’t Believe How Well This Algorithm Spots Clickbait
Futurity
Article
You Won’t Believe How Well This Algorithm Spots Clickbait
Aug 29, 2019
3 min read

Related categories

Skip carousel

Reviews for Mastering Social Media Mining with Python

Rating: 5 out of 5 stars

5/5

1 rating0 reviews

Book preview

Mastering Social Media Mining with Python - Marco Bonzanini

Mastering Social Media Mining with Python

Credits

About the Author

About the Reviewer

www.PacktPub.com

eBooks, discount offers, and more

Why subscribe?

Preface

What this book covers

What you need for this book

Who this book is for

Conventions

Reader feedback

Customer support

Downloading the example code

Downloading the color images of this book

Errata

Piracy

Questions

1. Social Media, Social Data, and Python

Getting started

Social media - challenges and opportunities

Opportunities

Challenges

Social media mining techniques

Python tools for data science

Python development environment setup

pip and virtualenv

Conda, Anaconda, and Miniconda

Efficient data analysis

Machine learning

Natural language processing

Social network analysis

Data visualization

Processing data in Python

Building complex data pipelines

Summary

2. #MiningTwitter – Hashtags, Topics, and Time Series

Getting started

The Twitter API

Rate limits

Search versus Stream

Collecting data from Twitter

Getting tweets from the timeline

The structure of a tweet

Using the Streaming API

Analyzing tweets - entity analysis

Analyzing tweets - text analysis

Analyzing tweets - time series analysis

Summary

3. Users, Followers, and Communities on Twitter

Users, friends, and followers

Back to the Twitter API

The structure of a user profile

Downloading your friends' and followers' profiles

Analysing your network

Measuring influence and engagement

Mining your followers

Mining the conversation

Plotting tweets on a map

From tweets to GeoJSON

Easy maps with Folium

Summary

4. Posts, Pages, and User Interactions on Facebook

The Facebook Graph API

Registering your app

Authentication and security

Accessing the Facebook Graph API with Python

Mining your posts

The structure of a post

Time frequency analysis

Mining Facebook Pages

Getting posts from a Page

Facebook Reactions and the Graph API 2.6

Measuring engagement

Visualizing posts as a word cloud

Summary

5. Topic Analysis on Google+

Getting started with the Google+ API

Searching on Google+

Embedding the search results in a web GUI

Decorators in Python

Flask routes and templates

Notes and activities from a Google+ page

Text analysis and TF-IDF on notes

Capturing phrases with n-grams

Summary

6. Questions and Answers on Stack Exchange

Questions and answers

Getting started with the Stack Exchange API

Searching for tagged questions

Searching for a user

Working with Stack Exchange data dumps

Text classification for question tags

Supervised learning and text classification

Classification algorithms

Naive Bayes

k-Nearest Neighbor

Support Vector Machines

Evaluation

Performing text classification on Stack Exchange data

Embedding the classifier in a real-time application

Summary

7. Blogs, RSS, Wikipedia, and Natural Language Processing

Blogs and NLP

Getting data from blogs and websites

Using the WordPress.com API

Using the Blogger API

Parsing RSS and Atom feeds

Getting data from Wikipedia

A few words about web scraping

NLP Basics

Text preprocessing

Sentence boundary detection

Word tokenization

Part-of-speech tagging

Word normalization

Case normalization

Stemming

Lemmatization

Stop word removal

Synonym mapping

Information extraction

Summary

8. Mining All the Data!

Many social APIs

Mining videos on YouTube

Mining open source software on GitHub

Mining local businesses on Yelp

Building a custom Python client

HTTP made simple

Summary

9. Linked Data and the Semantic Web

A Web of Data

Semantic Web vocabulary

Microformats

Linked Data and Open Data

Resource Description Framework

JSON-LD

Schema.org

Mining relations from DBpedia

Mining geo coordinates

Extracting geodata from Wikipedia

Plotting geodata on Google Maps

Summary

Mastering Social Media Mining with Python

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

First published: July 2016

Production reference: 1260716

Published by Packt Publishing Ltd.

Livery Place

35 Livery Street

Birmingham

B3 2PB, UK.

ISBN 978-1-78355-201-6

www.packtpub.com

Credits

About the Author

Marco Bonzanini is a data scientist based in London, United Kingdom. He holds a PhD in information retrieval from Queen Mary University of London. He specializes in text analytics and search applications, and over the years, he has enjoyed working on a variety of information management and data science problems.

He maintains a personal blog at http://marcobonzanini.com, where he discusses different technical topics, mainly around Python, text analytics, and data science.

When not working on Python projects, he likes to engage with the community at PyData conferences and meet-ups, and he also enjoys brewing homemade beer.

This book is the outcome of a long journey that goes beyond the mere content preparation. Many people have contributed in different ways to shape the final result. Firstly, I would like to thank the team at Packt Publishing, particularly Sonali Vernekar and Siddhesh Salvi, for giving me the opportunity to work on this book and for being so helpful throughout the whole process. I would also like to thank Dr. Weiai Wayne Xu for reviewing the content of this book and suggesting many improvements. Many colleagues and friends, through casual conversations, deep discussions, and previous projects, strengthened the quality of the material presented in this book. Special mentions go to Dr. Miguel Martinez-Alvarez, Marco Campana, and Stefano Campana. I'm also happy to be part of the PyData London community, a group of smart people who regularly meet to talk about Python and data science, offering a stimulating environment. Last but not least, a distinct special mention goes to Daniela, who has encouraged me during the whole journey, sharing her thoughts, suggesting improvements, and providing a relaxing environment to go back to after work.

About the Reviewer

Weiai Wayne Xu is an assistant professor in the department of communication at University of Massachusetts – Amherst and is affiliated with the University’s Computational Social Science Institute. Previously, Xu worked as a network science scholar at the Network Science Institute of Northeastern University in Boston. His research on online communities, word-of-mouth, and social capital have appeared in various peer-reviewed journals. Xu also assisted four national grant projects in the area of strategic communication and public opinion. Aside from his professional appointment, he is a co-founder of a data lab called CuriosityBits Collective (http://www.curiositybits.org/).

www.PacktPub.com

eBooks, discount offers, and more

Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at customercare@packtpub.com for more details.

At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters and receive exclusive discounts and offers on Packt books and eBooks.

https://www2.packtpub.com/books/subscription/packtlib

Do you need instant solutions to your IT questions? PacktLib is Packt's online digital book library. Here, you can search, access, and read Packt's entire library of books.

Why subscribe?

Fully searchable across every book published by Packt

Copy and paste, print, and bookmark content

On demand and accessible via a web browser

Preface

In the past few years, the popularity of social media has grown dramatically, with more and more users sharing all kinds of information through different platforms. Companies use social media platforms to promote their brands, professionals maintain a public profile online and use social media for networking, and regular users discuss about any topic. More users also means more data waiting to be mined.

You, the reader of this book, are likely to be a developer, engineer, analyst, researcher, or student who wants to apply data mining techniques to social media data. As a data mining practitioner (or practitioner-to-be), there is no lack of opportunities and challenges from this point of view.

Mastering Social Media Mining with Python will give you the basic tools you need to take advantage of this wealth of data. This book will start a journey through the main tools for data analysis in Python, providing the information you need to get started with applications such as NLP, machine learning, social network analysis, and data visualization. A step-by-step guide through the most popular social media platforms, including Twitter, Facebook, Google+, Stack Overflow, Blogger, YouTube and more, will allow you to understand how to access data from these networks, and how to perform different types of analysis in order to extract useful insight from the raw data.

There are three main aspects being touched in the book, as listed in the following list:

Social media APIs: Each platform provides access to their data in different ways. Understanding how to interact with them can answer the questions: how do we get the data? and also what kind of data can we get? This is important because, without access to the data, there would be no data analysis to carry out. Each chapter focuses on different social media platforms and provides details on how to interact with the relevant API.

Data mining techniques: Just getting the data out of an API doesn't provide much value to us. The next step is answering the question: what can we do with the data? Each chapter provides the concepts you need to appreciate the kind of analysis that you can carry out with the data, and why it provides value. In terms of theory, the choice is to simply scratch the surface of what is needed, without digging too much into details that belong to academic textbooks. The purpose is to provide practical examples that can get you easily started.

Python tools for data science: Once we understand what we can do with the data, the last question is: how do we do it? Python has established itself as one of the main languages for data science. Its easy-to-understand syntax and semantics, together with its rich ecosystem for scientific computing, provide a gentle learning curve for beginners and all the sharp tools required by experts at the same time. The book introduces the main Python libraries used in the world of scientific computing, such as NumPy, pandas, NetworkX, scikit-learn, NLTK, and many more. Practical examples will take the form of short scripts that you can use (and possibly extend) to perform different and interesting types of analysis over the social media data that you have accessed.

If exploring the area where these three main topics meet is something of interest, this book is for you.

What this book covers

Chapter 1, Social Media, Social Data, and Python, introduces the main concepts of data mining applied to social media using Python. By walking the reader through a brief overview on machine learning, NLP, social network analysis, and data visualization, this chapter discusses the main Python tools for data science and provides some help to set up the Python environment.

Chapter 2, #MiningTwitter – Hashtags, Topics, and Time Series, opens the practical discussion on data mining using the Twitter data. After setting up a Twitter app to interact with the Twitter API, the chapter explains how to get data through the streaming API and how to perform some frequentist analysis on hashtags and text. The chapter also discusses some time series analysis to understand the distribution of tweets over time.

Chapter 3, Users, Followers, and Communities on Twitter, continues the discussion on Twitter mining, focusing the attention on users and interactions between users. This chapter shows how to mine the connections and conversations between the users. Interesting applications explained in the chapter include user clustering (segmentation) and how to measure influence and user engagement.

Chapter 4, Posts, Pages, and User Interactions on Facebook, focuses on Facebook and the Facebook Graph API. After understanding how to interact with the Graph API, including aspects of security and privacy, examples of how to mine posts from a user's profile and Facebook pages are provided. The concepts of time series analysis and user engagement are applied to user interactions such as comments, Likes, and Reactions.

Chapter 5, Topic Analysis on Google+, covers the social network by Google. After understanding how to access the Google centralized platform, examples of how to search content and users on Google+ are discussed. This chapter also shows how to embed data coming from the Google API into a custom web application that is built using the Python microframework, Flask.

Chapter 6, Questions and Answers on Stack Exchange, explains the topic of question answering and uses the Stack Exchange network as paramount example. The reader has the opportunity to learn how to search for users and content on the different sites of this network, most notably Stack Overflow. By using their data dumps for online processing, this chapter introduces supervised machine learning methods applied to text classification and shows how to embed machine learning model into a real-time application.

Chapter 7, Blogs, RSS, Wikipedia, and Natural Language Processing, teaches text analytics. The Web is full of opportunities in terms of text mining, and this chapter shows how to interact with several data sources such as the WordPress.com API, Blogger API, RSS feeds, and Wikipedia API. Using textual data, the basic notions of NLP briefly mentioned throughout the book are formalized and expanded. The reader is then walked through the process of information extraction with custom examples on how to extract references of entities from free text.

Chapter 8, Mining All the Data!, reminds us of the many opportunities, in terms of data mining, that are available out there beyond the most common social networks. Examples of how to mine data from YouTube, GitHub, and Yelp are provided, along with a discussion on how to build your own API client, in case a particular platform doesn't provide one.

Chapter 9, Linked Data and the Semantic Web, provides an overview on the Semantic Web and related technologies. This chapter discusses the topics of Linked Data, microformats, and RDF, and offers examples on how to mine semantic information from DBpedia and Wikipedia.

What you need for this book

The code examples provided in this book assume that you are running a recent version of Python on either Linux, macOS, or Windows. The code has been tested on Python 3.4.* and Python 3.5.*. Older versions (Python 3.3.* or Python 2.*) are not explicitly supported.

Chapter 1, Social Media, Social Data, and Python, provides some instructions to set up a local development environment and introduces a brief list of tools that are going to be used throughout the book. We're going to take advantage of some of the essential Python libraries for scientific computing (for example, NumPy, pandas, and matplotlib), machine learning (for example, scikit-learn), NLP (for example, NLTK), and social network analysis (for example, NetworkX).

Who this book is for

This book is for intermediate Python developers who want to engage with the use of public APIs to collect data from social media platforms and perform statistical analysis in order to produce useful insights from the data. The book assumes a basic understanding of Python standard library and provides practical examples to guide you towards the creation of your data analysis project based on social data.

Conventions

In this book, you will find a number of text styles that distinguish between different kinds of information. Here are some examples of these styles and an explanation of their meaning.

Code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles are shown as follows: Moreover, the genre attribute is here presented as a list, with a variable number of values.

A block of code is set as follows:

from timeit import timeit

import numpy as np

if __name__ == '__main__':

setup_sum = 'data = list(range(10000))'

setup_np = 'import numpy as np;'

setup_np += 'data_np = np.array(list(range(10000)))'

When we wish to draw your attention to a particular part of a code block, the relevant lines or items are set in bold:

Type your question, or type exit to quit. > What's up with Gandalf and Frodo lately? They haven't been in the Shire for a while...

Question: What's up with Gandalf and Frodo lately? They haven't been in the Shire for a while...

Predicted labels: plot-explanation, the-lord-of-the-rings

Any command-line input or output is written as follows:

$ pip install --upgrade [package name]

New terms and important words are shown in bold. Words that you see on the screen, for example, in menus or dialog boxes, appear in the text like this: On the Keys and Access Tokens configuration page, the developer can find the API key and secret, as well as the access token and access token secret.

Note

Warnings or important notes appear in a box like this.

Tip

Tips and tricks appear like this.

Reader feedback

Feedback from our readers is always welcome. Let us know what you think about this book-what you liked or disliked. Reader feedback is important for us as it helps us develop titles that you will really get the most out of. To send us general feedback, simply e-mail feedback@packtpub.com, and mention the book's title in the subject of your message. If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, see our author guide at www.packtpub.com/authors.

Customer support

Now that you are the proud owner of a Packt book, we have a number of things to help you to get the most from your purchase.

Downloading the example code

You can download the example code files for this book from your account at http://www.packtpub.com. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you.

You can download the code files by following these steps:

Log in or register to our website using your e-mail address and password.

Hover the mouse pointer on the SUPPORT tab at the top.

Click on Code Downloads & Errata.

Enter the name of the book in the Search box.

Select the book for which you're looking to download the code files.

Choose from the drop-down menu where you purchased this book from.

Click on Code Download.

Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of:

WinRAR / 7-Zip for Windows

Zipeg / iZip / UnRarX for Mac

7-Zip / PeaZip for Linux

The code bundle for the book is also hosted on GitHub at https://github.com/bonzanini/Book-SocialMediaMiningPython. We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!

Downloading the color images of this book

We also provide you with a PDF file that has color images of the screenshots/diagrams used in this book. The color images will help you better understand the changes in the output. You can download this file from https://www.packtpub.com/sites/default/files/downloads/MasteringSocialMediaMiningWithPython_ColorImages.pdf.

Errata

Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you find a mistake in one of our books-maybe a mistake in the text or the code-we would be grateful if you could report this to us. By doing so, you can save other readers from frustration and help us improve subsequent versions of this book. If you find any errata, please report them by visiting http://www.packtpub.com/submit-errata, selecting your book, clicking on the Errata Submission Form link, and entering the details of your errata. Once your errata are verified, your submission will be accepted and the errata will be uploaded to our website or added to any list of existing errata under the Errata section of that title.

To view the previously submitted errata, go to https://www.packtpub.com/books/content/support and enter the name of the book in the search field. The required information will appear under the Errata section.

Piracy

Piracy of copyrighted material on the Internet is an ongoing problem across all media. At Packt, we take the protection of our copyright and licenses very seriously. If you come across any illegal copies of our works in any form on the Internet, please provide us with the location address or website name immediately so that we can pursue a remedy.

Please contact us at copyright@packtpub.com with a link to the suspected pirated material.

We appreciate your help in protecting our authors and our ability to bring you valuable content.

Questions

If you have a problem with any aspect of this book, you can contact us at questions@packtpub.com, and we will do our best to address the problem.

Chapter 1. Social Media, Social Data, and Python

This book is about applying data mining techniques to social media using Python. The three highlighted keywords in the previous sentence help us define the intended audience of this book: any developer, engineer, analyst, researcher, or student who is interested in exploring the area where the three topics meet.

In this chapter, we will cover the following topics:

Social media and social data

The overall process of data mining from social media

Setting up the Python development environment

Python tools for data science

Processing data in Python

Getting started

In the second quarter of 2015, Facebook reported nearly 1.5 billion monthly active users. In 2013, Twitter had reported a volume of 500+ million tweets per day. On a smaller scale, but certainly of interest for the readers of this book, in 2015, Stack Overflow announced that more than 10 million programming questions had been asked on their platform since the website has opened.

These numbers are just the tip of the iceberg when describing how the popularity of social media has grown exponentially with more users sharing more and more information through different platforms. This wealth of data provides unique opportunities for data mining practitioners. The purpose of this book is to guide the reader through the use of social media APIs to collect data that can be analyzed with Python tools in order to produce interesting insights on how users interact on social media.

This chapter lays the ground for an initial discussion on challenges and opportunities in social media mining and introduces some Python tools that will be used in the following chapters.

Social media - challenges and opportunities

In traditional media, users are typically just consumers. Information flows in one direction: from the publisher to the users. Social media breaks this model, allowing every user to be a consumer and publisher at the same time. Many academic publications have been written on this topic with the purpose of defining what the term social media really means (for example, Users of the world, unite! The challenges and opportunities of Social Media, Andreas M. Kaplan and Michael Haenlein, 2010). The aspects that are most commonly shared across different social media platforms are as follows:

Internet-based applications

User-generated content

Networking

Social media are Internet-based applications. It is clear that the advances in Internet and mobile technologies have promoted the expansion of social media. Through your mobile, you can, in fact, immediately connect to a social media platform, publish your content, or catch up with the latest news.

Social media platforms are driven by user-generated content. As opposed to the traditional media model, every user is a potential publisher. More importantly, any user can interact with every other user by sharing content, commenting, or expressing positive appraisal via the like button (sometimes referred to as upvote, or thumbs up).

Social media is about networking. As described, social media is about the users interacting with other users. Being connected is the central concept for most social media platform, and the content you consume via your news feed or timeline is driven by your connections.

With these main features being central across several platforms, social media is used for a variety of purposes:

Staying in touch with friends and family (for example, via Facebook)

Microblogging and catching up with the latest news (for example, via Twitter)

Staying in touch with your professional network (for example, via LinkedIn)

Sharing multimedia content (for example, via Instagram, YouTube, Vimeo, and Flickr)

Finding answers to your questions (for example, via Stack Overflow, Stack Exchange, and Quora)

Finding and organizing items of interest (for example, via Pinterest)

This book aims to answer one central question: how to extract useful knowledge from the data coming from the social media? Taking one step back, we need to define what is knowledge and what is useful.

Traditional definitions of knowledge come from information science. The concept of knowledge is usually pictured as part of a pyramid, sometimes referred to as knowledge hierarchy, which has data as its foundation, information as the middle layer, and knowledge at the top. This knowledge hierarchy is represented in the following diagram:

Figure 1.1: From raw data to semantic knowledge

Climbing the pyramid means refining knowledge from raw data. The journey from raw data to distilled knowledge goes through the integration of context and meaning. As we climb up the pyramid, the technology we build gains a deeper understanding of the original data, and more importantly, of the users who generate such data. In other words, it becomes more useful.

In this context, useful knowledge means actionable knowledge, that is, knowledge that enables a decision maker to implement a business strategy. As a reader of this book, you'll understand the key principles to extract value from social data. Understanding how users interact through social media platforms is one of the key aspects in this journey.

The following sections lay down some of the challenges and opportunities of mining data from social media platforms.

Opportunities

The key opportunity of developing data mining systems is to extract useful insights from data. The aim of the process is to answer interesting (and sometimes difficult) questions using data mining techniques to enrich our knowledge about a particular domain. For example, an online retail store can apply data mining to understand how their customers shop. Through this analysis, they are able to recommend products to their customers, depending on their shopping habits (for example, users who buy item A, also buy item B). This, in

Enjoying the preview?

Page 1 of 1

Mastering Social Media Mining with Python

About this ebook

Marco Bonzanini

Related authors

Related to Mastering Social Media Mining with Python

Related ebooks

Computers For You

Related podcast episodes

Related articles

Related categories

Reviews for Mastering Social Media Mining with Python

What did you think?

Book preview

Mastering Social Media Mining with Python - Marco Bonzanini

Table of Contents

Mastering Social Media Mining with Python

Mastering Social Media Mining with Python

About the Author

About the Reviewer

eBooks, discount offers, and more

Why subscribe?

Preface

What this book covers

What you need for this book

Who this book is for

Conventions

Note

Tip

Reader feedback

Customer support

Downloading the example code

Downloading the color images of this book

Errata

Piracy

Questions

Chapter 1. Social Media, Social Data, and Python

Getting started

Social media - challenges and opportunities

Opportunities