
SUICIDAL BEHAVIOUR: TEXTUAL MARKERS IN

SOCIAL MEDIA

INTRODUCTION
The paper focuses on the role social media plays in pushing people towards suicide among all age
groups. Social media has transformed the world during the past decade. Social media
platforms such as chat rooms, blogging sites, video sites (e.g. YouTube), social networking sites
(e.g. Facebook, Twitter, Google Plus), as well as email and text messaging have completely
altered traditional methods of communication. At the end of 2011, Facebook had nearly 700
million users; by February 2019, the site had risen to 2.4 billion active users. Facebook has
reported that an average of 300 million photos and other pieces of content are shared per day.
Social media has become a constant in the everyday lives of people.
Suicide is a serious public health problem. The World Health Organisation (WHO) estimates that
close to one million people die by suicide every year, a global mortality rate of 16 per
100,000 people, or one death every 40 seconds. By 2020, this was predicted to increase to one
death every 20 seconds. The role that social media plays is a topic of growing debate and
interest. The recent increase in publicized cases of suicide that involve social media has drawn
further attention to this topic. Researchers are also asking whether the Internet is actually
helping or hindering suicide prevention. In this context, social networks like Facebook and
Twitter are associated with bullying, harassment and even suicide on a daily basis. It is hence
critical to detect at-risk individuals as early as possible and in turn prevent suicides.
In this paper, we try to address the challenge of analyzing various social media posts and
detecting suicide-related behaviour. We address two major challenges:
• Building a solid vocabulary to retrieve messages from Twitter and Facebook and
dealing with various suicide-related topics (e.g. depression, harassment).
• Mining the messages which deviate from the norm in order to propose a
classification model and hence identify people exhibiting suicidal behaviour.
These individuals can then be referred to health professionals for help. In 2019, the Chicago
rapper Cupcakke raised alarm after she tweeted that she was suicidal. There was a similar
incident in December 2018, when a Saturday Night Live cast member posted a worrying message
on Instagram. Twitter has its own team that assesses self-harm reporting forms sent by people
who are worried about someone’s mental health, while Facebook uses artificial intelligence to
scan posts, comments and videos for suicide risk, with flagged content then reviewed by its
team. However, it is still not entirely clear how effective these measures are or can be.

LITERATURE REVIEW
A large amount of information is available on the topic of suicide. Biddle conducted a
search of 12 suicide-associated keywords to simulate the results of a search conducted by
someone seeking knowledge on suicide methods. Recupero also conducted a study that examined
suicide-related sites on the internet and found that 31% of the sites were suicide-neutral,
29% were anti-suicide and 11% were pro-suicide. Hence, finding such information on the web
does not take much effort.
The underlying question is whether there is an association between the rate of internet and
social media usage and population suicide rates. A study by John F. Gunn analyzed the
postings of a young woman in the 24 hours prior to her death by suicide. Gunn noted that she
made close to 145 tweets in which she outlined her history of sexual abuse, and a trend was
found indicating an increase in positive emotions over those 24 hours. Shah conducted a
cross-national study which examined the relationship between population suicide rates and
the prevalence of internet users, using data from the WHO’s and UNDP’s websites; it showed a
positive correlation with general population suicide rates. A regression analysis further
showed that the prevalence of internet use was independently related to population suicide
rates (p=0.001).
A recent study by Dunlop examined possible contagion effects on suicidal behaviour via the
internet and social media. Of the 719 individuals aged 14 to 24 years, 79% reported that they
had been exposed to suicide-related content through family, friends and newspapers, and
around 59% had found such content through internet sources.
Video-sharing sites have gained a large presence on the current Internet. Lewis
examined the accessibility of some of the most popular YouTube videos associated with non-
suicidal self-injury and reported that, of the videos examined, 42 were neutral, 26 were
against self-injury, 23 provided a mixed message and 7 were pro-self-injury.

RESEARCH METHODOLOGY
To address this issue, the first requirement is a corpus of tweets or Facebook posts by
individuals expressing suicidal ideation. Anonymized data can be collected from Twitter and
Facebook, preferably content tagged with the word ‘suicide’ over a specified period. This can
be done with the help of the Twitter API and the Facebook API, which allow the collection of
tweets containing some of the most common phrases consistent with suicidal ideation (phrases
like “end my own life”, “better off dead”, “want to die” and so on). Once this data is
collected, human annotators or psychologists can be asked to indicate whether the tweets or
posts contain any suicidal ideation by answering the question “Is this person suicidal?”. A
minimal collection sketch is shown below.
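As an illustrative sketch only (not the authors’ actual pipeline), the following Python snippet shows how such tweets might be collected with the Tweepy library; the credential placeholders, the search call and the exact query string are assumptions based on the phrases listed above.

    import tweepy

    # Hypothetical credentials -- replace with real Twitter API keys.
    auth = tweepy.OAuthHandler("API_KEY", "API_SECRET")
    auth.set_access_token("ACCESS_TOKEN", "ACCESS_SECRET")
    api = tweepy.API(auth, wait_on_rate_limit=True)

    # Phrases consistent with suicidal ideation, taken from the text above.
    QUERY = '"end my own life" OR "better off dead" OR "want to die"'

    def collect_tweets(max_items=500):
        """Collect matching tweets, keeping only the id and full text."""
        collected = []
        cursor = tweepy.Cursor(api.search_tweets, q=QUERY,
                               lang="en", tweet_mode="extended")
        for status in cursor.items(max_items):
            collected.append({"id": status.id, "text": status.full_text})
        return collected

    posts = [t["text"] for t in collect_tweets()]
    print(f"Collected {len(posts)} candidate posts for annotation")

The collected posts would then be handed to the annotators described above; the Facebook Graph API could be queried in a similar manner.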
Pre-processing should be the next step. As Twitter and Facebook contain many non-English
tweets and posts, these should be removed first. The second step is to identify and eliminate
user mentions of the form @username, URLs, and retweets (RT). The final step is to remove
stopwords, as they do not contribute to the classification. A cleaning sketch follows this
paragraph.
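A minimal cleaning sketch in Python, assuming the langdetect package for language filtering and NLTK’s English stopword list; both library choices are illustrative assumptions rather than part of the original method.

    import re
    from langdetect import detect, LangDetectException  # assumed language-ID library
    from nltk.corpus import stopwords  # requires a one-time nltk.download("stopwords")

    STOPWORDS = set(stopwords.words("english"))

    def is_english(text):
        """Keep only posts detected as English; drop anything undecidable."""
        try:
            return detect(text) == "en"
        except LangDetectException:
            return False

    def clean_post(text):
        """Strip retweet markers, @mentions, URLs and stopwords."""
        text = re.sub(r"\bRT\b", " ", text)        # retweet marker
        text = re.sub(r"@\w+", " ", text)          # user mentions
        text = re.sub(r"https?://\S+", " ", text)  # URLs
        tokens = [t for t in re.findall(r"[a-z']+", text.lower())
                  if t not in STOPWORDS]
        return " ".join(tokens)

    # 'posts' is the list produced by the collection sketch above.
    cleaned_posts = [clean_post(p) for p in posts if is_english(p)]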
Feature extraction is the next step. Since these tweets and posts lack a regular pattern, we
need to analyze a set of features. LIWC can be used to categorize and label the phrases: it
analyzes the text word by word and counts pronouns, emotional words and other words such as
death-related terms. For most of its categories, the output is expressed as a percentage of
the total text. Part-of-speech (POS) tag counts are used as another feature. TF-IDF is also
used as a feature to reflect the prominence of a particular word. A feature-extraction sketch
is given below.
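A possible way to compute the TF-IDF and POS-count features with scikit-learn and NLTK is sketched below; LIWC is a proprietary tool and is therefore not shown, and the parameter choices are illustrative assumptions.

    from collections import Counter
    import nltk
    from sklearn.feature_extraction.text import TfidfVectorizer

    # One-time setup: nltk.download("punkt"); nltk.download("averaged_perceptron_tagger")

    def pos_counts(text):
        """Count coarse part-of-speech classes (first letter of the Penn tag)."""
        tags = [tag[0] for _, tag in nltk.pos_tag(nltk.word_tokenize(text))]
        return Counter(tags)

    # TF-IDF over the cleaned corpus from the pre-processing step.
    vectorizer = TfidfVectorizer(max_features=5000, ngram_range=(1, 2))
    X_tfidf = vectorizer.fit_transform(cleaned_posts)

In practice the TF-IDF matrix, the POS counts and the LIWC percentages would be concatenated into a single feature vector per post.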
Since this is a classification problem, every tweet or post is assigned a label of 0,
indicating no suicidal ideation, or 1, indicating suicidal ideation. The classifiers must be
able to detect this in the various texts. The features presented above are used to train
classification models that identify tweets or posts exhibiting suicidal behaviour. Logistic
regression is employed for classification, as sketched below.
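A minimal training sketch using scikit-learn’s LogisticRegression, assuming the TF-IDF matrix from the previous step and the 0/1 annotator labels; the split ratio and the reported metrics are illustrative choices.

    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import classification_report

    # X_tfidf: feature matrix from the previous step; labels: 0/1 annotator judgements.
    X_train, X_test, y_train, y_test = train_test_split(
        X_tfidf, labels, test_size=0.2, stratify=labels, random_state=42)

    clf = LogisticRegression(max_iter=1000, class_weight="balanced")
    clf.fit(X_train, y_train)

    print(classification_report(y_test, clf.predict(X_test),
                                target_names=["no ideation", "suicidal ideation"]))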
