
Bayes Net Applied to Terrorism Risk

CSC384 Mini-Project Report

Amiska Perera

Ruojing Song

Cissy Yao
Abstract
This report documents our attempt to build a predictive model to aid the screening
of potential Syrian refugees entering Canada, and the results that model generated. Using a
Bayes net which examines a potential refugee's sex, age, education, criminal record,
income, and risk of terrorism, our model indicates the probability that any particular refugee
is a terrorist, given their demographic information. Our model is extremely simplified due to
the limited data available, the constraints on space and runtime, and limitations of the Bayes
net itself. Although our model requires further work to become a fully functional
system, it acts as a proof-of-concept prototype for a system that formalizes the process that
the Canadian government already uses for selecting eligible refugees.

Our results affirm the usefulness of the Canadian government's screening process. In
prioritizing low-risk groups such as women and children, the government decreases the risk
that a terrorist enters the country as a refugee. We also found that those without a criminal
record and those with a secondary education and above have a lower overall risk of
terrorism.

The results of our Bayes net also suggest that the number of refugees is not entirely limited
by security concerns. One important finding is that for there to be less than a 13% chance that there is
at least one terrorist among a group of 25,000 Syrian refugees (the number of refugees that Canada
has pledged to accept), the risk of an individual being a terrorist needs to be under 0.001%.
This acceptable risk is actually two orders of magnitude higher than the risk of terrorism that
exists in the Syrian refugee population. Thus, the risk of accepting 25,000 refugees is quite
low, even if low-risk demographics have not been prioritized in the screening process. By
giving priority to low-risk demographics, the risk of a terrorist entering the country is further
lowered.

Overall, our model and its results indicate that the overall risk of terrorism from Syrian
refugees is quite low, although it is worthwhile for Canadians to balance compassion with
security in screening refugees.
Table of contents

Abstract
Table of contents
Introduction
Why was a Bayes net selected?
Applying Bayes Nets to the Problem
Outline of Bayes net
Simplifications
Calculating the conditional probability tables
Analysis and Discussion
Conclusion
Introduction

Canada is known for being an open and multicultural nation, as can be seen in
cosmopolitan cities such as Toronto and Vancouver. In 2014, Canada accepted over
165,000 immigrants and over 23,000 refugees. However, the Canadian government has
made headlines in recent weeks with its announcement that it will accept 25,000 Syrian
refugees. In light of the attacks that occurred in Paris, the media has voiced much opposition
to the plan, citing the security risks of accepting Syrian refugees.

The nation does not blindly accept all people into the country. Refugees, in particular, must
go through many layers of screenings and security checks before they are allowed into
Canada. This is vital in keeping our current citizens safe from dangers such as possible
diseases and acts of terror from within the country.

This isn't to say that Canada should not accept any refugees, but that it should be robust in
screening these individuals to ensure such a tragedy does not happen in Canada as well.

Screening refugees is a relatively new process: it was formalised more than 100 years after
Confederation, with the Immigration Act of 1976. This piece of legislation came into force in 1978,
and was the first to recognize refugees as a distinct class of immigrants. The
Immigration Act refused people based on whether they would be a burden on social welfare
or health services instead of categories such as sexual orientation or disability. This Act was
replaced in 2002 with the Immigration and Refugee Protection Act (IRPA). This provided a
more sophisticated, high-level framework to deal with immigrants and refugees. Again, in
2012, Bill C-31 made further changes to IRPA that added security measures for
people coming into the country. One of the major changes was the inclusion of biometric
identification: a person's fingerprints and photograph are now collected as part of
the screening process.

Refugees wanting to come to Canada have to go through two screening processes before
they are allowed into the country. The first is done by The United Nations Refugee Agency
(UNHCR). Following this, the UNHCR triages these refugees and Canada selects
approximately the top 1% of these refugees. These people go through another screening
process performed by the Government of Canada. Below is a high-level overview of the
process that occurs within Canada:

1. Refugee identification before referral to Immigration, Refugees and Citizenship Canada (IRCC)
2. Immigration and security interview by experienced visa officers
3. Identity and document verification; biometric and biographic collection
4. Health screening
5. Identity confirmation prior to departure
6. Identity verification upon arrival
At first glance, this seems like a robust process overall; however, the current process cannot
fully eliminate the possibility of bringing in a fake refugee. It is clear that an assessment of
the government's process for screening refugees should be done to ensure the continued
safety of citizens of Canada and other countries alike.

Problem

Canada is not new to taking in refugees: the country has taken in over 260,000 refugees since
2005 with a minimum of 22,000 in each year.
The problem that we decided to tackle is the refugee screening process. We want to create
a predictive model to aid the examiners in determining if the person is safe to enter a country
as a refugee. Our model will indicate the probability that any particular refugee is a terrorist,
given their demographic information.

Fully solving this problem will require a great deal of time and effort, more than this project
would allow, so we have simplified the problem to look at Syrian refugees entering Canada
who have already been approved by the UNHCR.

We chose this problem because of how significant it is today. The arrival of refugees from
Syria is a large and highly publicized issue, with many people for and against it. It is relevant
because it could impact the safety of the citizens of Canada. Since Canada is set on taking in
these refugees, it would be ideal if all of these people meant no harm. However, being
idealistic will not prevent any disasters from happening; the best thing that can be done is to
thoroughly screen each individual and verify that they are safe.

Why was a Bayes net selected?

Bayesian Networks (BNs) were used to model the problem because they represent a set of
random variables and their conditional dependencies. From these variables and
dependencies, inferences can be drawn and conclusions can be made. This applies to the
problem in question because the various personal characteristics of any refugee can be
treated as random variables, while the conditional probability of the refugee being a terrorist
given these characteristics is the query assessed by the project. In this way, the problem of
finding an individual's terrorism risk given their demographic information works well through
a BN model.

However, BNs work best when the chain of causation between variables is known: this results
in a smaller, less connected graph and a lower runtime. Our situation does not satisfy this
requirement, because the chain of causation is unknown and because of the simplifications we made to
the problem. As a result, our network may contain more connections than necessary and
thus be less efficient than the optimal graph for the problem, resulting in a
higher-than-necessary runtime. Nevertheless, the problem we used satisfies the constraints
of a Bayesian Network, and although our network may be less than optimal, it is an
adequate fit for the problem.
Applying Bayes Nets to the Problem

Outline of Bayes net


The Bayes net used accounted for the following factors: sex (S), religion (R), age (A), education (E),
criminal record (C), income (I), and their effects on the risk of terrorism (T). These factors
were selected because they are similar to the factors that the government of Canada
screens for when selecting refugees, and they are factors whose effects on terrorism have
been well-recorded.

Sex, S = male, female
Religion, R = Islam, Christianity, Druzism
Age, A = child (0-17), adult (18-54), senior (55-)
Education, E = none (below secondary school degree), secondary (secondary school degree), postsecondary (postsecondary or above degree)
Income, I = above (above the poverty line), below (below the poverty line)
Criminal Record, C = True (has criminal record), False (does not have criminal record)
Terrorist, T = True (is a terrorist), False (is not a terrorist)
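
To give a sense of the size of the problem, the variable domains listed above can be written down directly and the number of complete assignments counted. This is only a minimal sketch: the dictionary layout and the names `DOMAINS` and `joint_size` are illustrative assumptions, not the data structures used in the project.

```python
from math import prod

# Variable domains as listed in the report. The dict layout and the
# identifier names are illustrative assumptions.
DOMAINS = {
    "Sex": ["male", "female"],
    "Religion": ["Islam", "Christianity", "Druzism"],
    "Age": ["child", "adult", "senior"],
    "Education": ["none", "secondary", "postsecondary"],
    "Income": ["above", "below"],
    "CriminalRecord": [True, False],
    "Terrorist": [True, False],
}

# Number of distinct full assignments to all seven variables, i.e. the
# number of rows an unfactored joint probability table would need.
joint_size = prod(len(values) for values in DOMAINS.values())
print(joint_size)
```

A Bayes net avoids storing this full joint table by factoring it into one smaller conditional table per node.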
Simplifications
Some simplifications were made to the probability network, such as the low connectedness
in our Bayes net compared to the correlations seen in the real world. Most of the
relationships between the factors we examined are complex and unclear. For example, the
presence of a causal relationship between income and education level is still being debated,
although it is known that they do affect each other in some way. One could even argue that
every node in the graph influences every other node in some way, which would result
not only in a number of edges that grows as n^2 (and a correspondingly higher runtime), but
also in cycles within the Bayes net. Thus, for our graph, we made the
choice to simplify these relationships in order to reduce the runtime and avoid cycles in the
graph.
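
The requirement that the graph contain no cycles can be checked mechanically. The sketch below uses Python's standard `graphlib` module (Python 3.9+) to test a parent map for cycles. Only the parents of Criminal Record (Sex and Age) are stated in this report. The other edges, and the names `is_acyclic`, `simplified`, and `with_feedback`, are hypothetical and serve only to illustrate the check.

```python
from graphlib import CycleError, TopologicalSorter


def is_acyclic(parents):
    """Return True if the {node: set_of_parents} map is a directed acyclic graph."""
    try:
        # static_order() raises CycleError when no topological order exists.
        list(TopologicalSorter(parents).static_order())
        return True
    except CycleError:
        return False


# Hypothetical structure: only CriminalRecord's parents come from the report.
simplified = {
    "CriminalRecord": {"Sex", "Age"},  # stated in the report
    "Education": {"Income"},           # assumed for illustration
    "Terrorist": {"Sex", "Age", "Religion", "Education",
                  "Income", "CriminalRecord"},  # assumed for illustration
}

# Letting Income and Education influence each other introduces a cycle.
with_feedback = dict(simplified, Income={"Education"})

print(is_acyclic(simplified), is_acyclic(with_feedback))
```

Keeping only one direction of a mutual influence, as in `simplified`, is the kind of simplification that keeps the network a valid Bayes net.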

Another simplification was the choice to limit the number of nodes in the network. This is
mostly due to the inordinate size of a Bayes net which would include every relevant factor,
which again would create a very large runtime.

Additionally, we were limited by the lack of data on many of these factors and their effects on
terrorism. For instance, although the Canadian government has suggested that refugees
travelling in families are less likely to become terrorists, we did not find any data on the
percent of refugees who travelled in families, nor any data on the effects of it on the risk of
terrorism. As a result, the factor was not included in the final Bayes net.

The lack of available data on our population also resulted in a few other simplifications.
Although we chose to focus our project on UN-approved Syrian refugees, the limited data on
this developing problem required us to approximate their statistics using similar populations
wherever necessary. For example, the influence of income on terrorism was approximated
using the group Hamas instead of ISIL, the main terrorist group in Syria. Another example
would be the use of the Arab Republic of Egypt's statistics for the influence of income on
education level. These simplifications required us to assume that Syria's pre-war population
demographics were similar enough to these groups for these approximations to apply.

The effect of these simplifications is that the results from our Bayes net may be artificially
inflated or decreased. However, their effect should be limited because we know that the query
of our net will always ask for an individual's risk of terrorism. The base risk of terrorism is
extremely low, orders of magnitude different from any errors that should result from our
approximations. Thus, these simplifications should not affect the results significantly.
Calculating the conditional probability tables

The conditional probability table (CPT) for each node was calculated depending on whether
the node has parents and the data available. The CPT always records the probability of the
event given its parents, unless it is a parentless node. For instance, the parentless node Age
had its CPT filled in as follows:

Age range      Probability
<18 years      0.51
18-49 years    0.46
>=50 years     0.03

All of these values were found directly from the United Nations Refugee Agency's Global
Trends 2014 report. However, calculating the probability values for nodes that have
parents, or those whose probability values are not directly known, is slightly more
complicated.
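
A parentless CPT such as the Age table is simply a probability distribution, so it can be sanity-checked before it is used. The following is a minimal sketch. The names `AGE_PRIOR` and `is_valid_distribution` are illustrative assumptions, and the values are the ones quoted in the table.

```python
import math

# Prior for the parentless Age node, using the values quoted in the table.
AGE_PRIOR = {"<18": 0.51, "18-49": 0.46, ">=50": 0.03}


def is_valid_distribution(dist, tol=1e-9):
    """Check that every entry lies in [0, 1] and that the entries sum to 1."""
    in_range = all(0.0 <= p <= 1.0 for p in dist.values())
    return in_range and math.isclose(sum(dist.values()), 1.0, abs_tol=tol)


print(is_valid_distribution(AGE_PRIOR))
```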

For the nodes with parents, the probabilities recorded in the CPT are conditional on the node's
parents. For example, the CPT for the node Criminal Record recorded the probability of an
individual having a criminal record given its parent nodes, Sex and Age. Thus the
probabilities recorded in the CPT were P(Criminal Record | Sex) and P(Criminal Record | Age),
respectively.

However, the probability of Criminal Record | Sex is not collected by agencies that study
criminal behaviour, although a related probability value, Sex | Criminal Record, is recorded.
Additionally, the unconditional probabilities of Criminal Record and Sex in the population are
tabulated demographic statistics. Thus, the probability required for the CPT can be
calculated using Bayes' Rule:

P(A|B) = P(B|A) P(A) / P(B)

Bayes' Rule allows us to calculate P(Criminal Record | Sex) given three other parameters:
P(Sex | Criminal Record), P(Criminal Record), and P(Sex), as follows:

P(Criminal Record | Sex) = P(Sex | Criminal Record) P(Criminal Record) / P(Sex)

For each combination of Criminal Record (Yes or No) and Sex (Male or Female), the
probability was calculated. For example, the probability of Criminal Record = Yes given Sex =
Male was calculated as follows:

P(Criminal Record=Yes | Sex=Male) = P(Sex=Male | Criminal Record=Yes) P(Criminal Record=Yes) / P(Sex=Male)

All of these values were tabulated: P(Sex=Male | Criminal Record=Yes) from The Crimes
Women Commit: The Punishments They Receive, P(Criminal Record=Yes) from the United
Nations International Statistics on Crime and Justice, and P(Sex=Male) from The World
Factbook.

P(Criminal Record=Yes | Sex=Male) = (0.969) (0.01842) / (0.5) = 0.03569796

Thus, P(Criminal Record=Yes | Sex=Male) was found to be 0.03569796. This process was
repeated for all combinations of P(Criminal Record | Sex), as well as for any CPT whose
probabilities were not directly tabulated in the literature.
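
The calculation above can be reproduced in a few lines. This is a sketch rather than the project's actual code: the function name `bayes_posterior` and the constant names are illustrative, and the female entries assume that Sex is binary, so that P(Sex=Female | Criminal Record=Yes) = 1 - 0.969 and P(Sex=Female) = 1 - 0.5. That assumption is not stated in the report.

```python
def bayes_posterior(likelihood, prior, evidence):
    """Return P(A|B) = P(B|A) * P(A) / P(B)."""
    if evidence <= 0:
        raise ValueError("P(B) must be positive")
    return likelihood * prior / evidence


# Values quoted in the report.
P_MALE_GIVEN_RECORD = 0.969  # P(Sex=Male | Criminal Record=Yes)
P_RECORD = 0.01842           # P(Criminal Record=Yes)
P_MALE = 0.5                 # P(Sex=Male)

p_record_given_male = bayes_posterior(P_MALE_GIVEN_RECORD, P_RECORD, P_MALE)

# Assumption (not from the report): Sex is binary, so the female terms
# are the complements of the male terms.
p_record_given_female = bayes_posterior(
    1 - P_MALE_GIVEN_RECORD, P_RECORD, 1 - P_MALE
)

# One CPT row per value of the parent; each row sums to 1.
cpt_record_given_sex = {
    "male": {True: p_record_given_male, False: 1 - p_record_given_male},
    "female": {True: p_record_given_female, False: 1 - p_record_given_female},
}

print(round(p_record_given_male, 8))
```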
Analysis and Discussion
By programming the above simplified Bayes net in Python, we were able to obtain the
probability of a person being a terrorist given any combination of factors. The following are some of the
queries that we tested:

1) Given a person: Gender = male, Age = adult, Religion = Islam, Education = none,
Criminal Record = True, Income = above
Pr(Terrorist = True) = 3.39e-6

2) Given a person: Gender = female, Age = child, Religion = Christianity, Education =
secondary, Criminal Record = False, Income = below
Pr(Terrorist = True) = 6.2e-14

3) Given a person: Gender = female
Pr(Terrorist = True) = 8.02e-9
Given a person: Gender = male
Pr(Terrorist = True) = 9.02e-8

4) Given a person: Age = elder
Pr(Terrorist = True) = 1.64e-7
Given a person: Age = adult
Pr(Terrorist = True) = 8.57e-8
Given a person: Age = teen
Pr(Terrorist = True) = 1.12e-8

5) Given a person: Religion = Islam
Pr(Terrorist = True) = 5.12e-8
Given a person: Religion = Christianity
Pr(Terrorist = True) = 7.43e-12
Given a person: Religion = Druzism
Pr(Terrorist = True) = 2.48e-11

6) Given a person: Education = none
Pr(Terrorist = True) = 6.93e-8
Given a person: Education = secondary
Pr(Terrorist = True) = 2.14e-8
Given a person: Education = postsecondary
Pr(Terrorist = True) = 4.66e-8
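
Each query above has the form Pr(Terrorist = True | evidence). As a rough illustration of how such a query can be answered, the sketch below runs inference by enumeration on a three-node chain (Sex -> CriminalRecord -> Terrorist). It is not the project's program or network: the chain structure, the names `joint` and `query`, and in particular the Terrorist CPT values are placeholders chosen only so the example runs, and its output should not be compared with the results listed above.

```python
from itertools import product

NODES = ["Sex", "CriminalRecord", "Terrorist"]  # listed parents-first
DOMAINS = {
    "Sex": ["male", "female"],
    "CriminalRecord": [True, False],
    "Terrorist": [True, False],
}
PARENTS = {"Sex": [], "CriminalRecord": ["Sex"], "Terrorist": ["CriminalRecord"]}

# CPT[node][tuple_of_parent_values][value] = probability.
# The CriminalRecord rows are rounded illustrations; the Terrorist rows
# are PLACEHOLDERS, not values from the report.
CPT = {
    "Sex": {(): {"male": 0.5, "female": 0.5}},
    "CriminalRecord": {
        ("male",): {True: 0.0357, False: 0.9643},
        ("female",): {True: 0.0011, False: 0.9989},
    },
    "Terrorist": {
        (True,): {True: 1e-6, False: 1 - 1e-6},
        (False,): {True: 1e-8, False: 1 - 1e-8},
    },
}


def joint(assignment):
    """Probability of one full assignment: the product of each node's CPT entry."""
    p = 1.0
    for node in NODES:
        key = tuple(assignment[parent] for parent in PARENTS[node])
        p *= CPT[node][key][assignment[node]]
    return p


def query(target, target_value, evidence):
    """Return P(target = target_value | evidence) by summing out hidden nodes."""
    hidden = [n for n in NODES if n != target and n not in evidence]
    totals = {value: 0.0 for value in DOMAINS[target]}
    for value in DOMAINS[target]:
        for combo in product(*(DOMAINS[h] for h in hidden)):
            assignment = dict(evidence)
            assignment.update(zip(hidden, combo))
            assignment[target] = value
            totals[value] += joint(assignment)
    # Normalise by the probability of the evidence.
    return totals[target_value] / sum(totals.values())


print(query("Terrorist", True, {"Sex": "male"}))
```

Enumeration visits every combination of the hidden variables, so its cost grows quickly with the number of nodes, which is one reason the network was kept small.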

While we were not able to find any concrete data on the true percentage of Syrian refugees
who were terrorists, it is obvious that this number must be very low given the low absolute
number of terrorists compared to the sheer number of refugees. Our findings support this
view, as the probability of an individual being a terrorist is extremely low, far less than one
percent for any individual, even given the combination of traits which indicate the highest risk
of terrorism.

To ensure that there is less than a 13% chance that at least one terrorist out of 25,000 refugees
enters the country, the Canadian government should ensure that the overall probability of
terrorism for the refugees is less than 0.001%. This requirement is already met with
unscreened refugees. For instance, the probability for the group of refugees most likely to
become terrorists, men with low education and criminal records, is 3.39e-4%. This would
translate to less than one terrorist out of 25,000 refugees, which suggests a low risk
to Canadians.
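
The relationship between the per-person risk and the group-level risk can be sketched as follows, assuming that each refugee is an independent trial with the same probability p, so that P(at least one) = 1 - (1 - p)^n. The report does not state how its threshold was derived, so this independence model and the function names are assumptions, and the figures it produces need not match the rounded values quoted above exactly.

```python
from math import expm1, log1p


def prob_at_least_one(n, p):
    """P(at least one terrorist among n independent people, each with risk p)."""
    # Equivalent to 1 - (1 - p)**n, written for accuracy when p is tiny.
    return -expm1(n * log1p(-p))


def max_individual_risk(n, target):
    """Largest per-person risk p for which prob_at_least_one(n, p) <= target."""
    # Solves 1 - (1 - p)**n = target for p.
    return -expm1(log1p(-target) / n)


N_REFUGEES = 25_000
TARGET = 0.13          # group-level risk used in the report
WORST_CASE = 3.39e-6   # highest per-person probability among the tested queries

threshold = max_individual_risk(N_REFUGEES, TARGET)
group_risk = prob_at_least_one(N_REFUGEES, WORST_CASE)
expected_count = N_REFUGEES * WORST_CASE

print(f"threshold per person: {threshold:.3e}")
print(f"group risk at worst case: {group_risk:.3f}")
print(f"expected number of terrorists: {expected_count:.3f}")
```

Under this model, even a group made up entirely of the highest-risk profile tested stays below the 13% group-level target, with an expected count below one.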

However, the risk of terrorism can be further lowered, as the Canadian government is
already doing, by selecting for demographics that are statistically less likely to become
terrorists, such as women, those without a criminal record, children, and those with a
secondary education and above. In selecting for these characteristics, the risk of terrorism
can be decreased to as low as 6.2e-12%. This indicates that, by selecting for groups with a
lower risk of terrorism, the Canadian government could actually increase the refugee quota
from 25,000 individuals while maintaining the same amount of overall risk.

Our results also suggest that other countries could accept more refugees. For example, the
United States has pledged to accept 10,000 Syrian refugees, far fewer than the Canadian
quota. Due to this lower quota, ensuring that there is less than a 13% chance of one terrorist
entering the country requires only that the overall probability of terrorism stay below 0.01%. As
the United States is also selecting for low-risk demographics, as Canada is, the risk of terrorism should be
quite low, and the reactionary comments from many American politicians that accepting
refugees will create a high risk of terrorism are highly misguided.
Conclusion
Our Bayes net examined a potential refugee's sex, age, education, criminal record, income,
and risk of terrorism to determine the probability that any particular refugee is a terrorist
given their demographics. Although the Bayes net was very simplified and requires further
work to become a useable system, it is a useful prototype for assessing the Canadian
government policies and system for screening eligible refugees.

We found that the Canadian government's screening process was useful due to its prioritization
of low-risk groups such as women and children. Additionally, our results suggest that those
without a criminal record and with a secondary education and above pose a lower risk of
terrorism. By selecting for these groups, the government decreases the risk to Canadians
that a terrorist enters the country as a refugee.

Our findings also suggest that the risk of accepting 25,000 refugees is quite low, even if
low-risk demographics have not been prioritized in the screening process, due to the overall
low risk of terrorism from refugees. Furthermore, since the Canadian government screens
refugees and gives priority to low-risk demographics, the risk of terrorism is even lower.

Therefore, our Bayes net model and its results suggest that accepting Syrian refugees into
the country does not significantly increase the overall risk of terrorism. Additionally, it is vital
for Canadians to keep compassion in mind, since Syrian refugees experience
life-threatening and traumatizing situations both in their home country and in their attempts
to flee. While it is important for the Canadian government to maintain the safety of the
country, it is also critical that we balance this with acting in a humane way by offering asylum
to those who need it most.
