
GSBA 524: HW 1 DUE September 8

Learning objectives and outcomes:


1. Conduct Exploratory Business Data Analysis using Excel
2. The assignment offers comprehensive coverage of Excel.
3. The assignment weaves the technical content into realistic business scenarios and focuses on using Excel as a
decision-making tool.
4. The assignment cases help you progress from a basic understanding to mastery of each application, empowering
you to perform an analysis of each case with confidence in Excel.
5. With more information from data analysis, you will be equipped to make superior decisions and outperform the
competition.
6. Your mastery of essential skills of creating and communicating data analysis for improved decision making will
enhance your career and make numbers fun.

Instructions:
1. Each team consists of 4-6 students.
2. Each team submits one hard copy of the answer sheet on the due date by the beginning of the class.
3. Round all your answers as appropriate. For example, $3.215069710235448 is NOT appropriate! For
intermediate calculations, this rule is not enforced.
4. A zip file should consist of 6 Excel files labeled Case1.xlsx, Case2.xlsx, …, Case6.xlsx.
5. Each Excel file should contain as many sheets as there are case questions. For example, Case1.xlsx should
consist of 8 sheets labeled Q1, Q2, …, Q8 that contain the solutions and answers to the corresponding questions.
6. Each team must submit one zip file containing the Excel solutions on Blackboard by the beginning of the lecture on
the due date.
7. Solutions should be neat and organized!
8. Violations of the above rules will result in point deduction.

Before you begin working on HW, read the above instructions carefully!

Case 1
Background and Objective: After graduating, you join an international education startup. The startup
coordinates with pre-college students that are thinking about attending university in the US. International students
are often unfamiliar with the US college-application process, particularly since in many countries university
admissions are determined by performance on a single, government-sponsored exam. Students struggle to compile
a strong application packet, have little idea which colleges and universities they should target, and do not fully
utilize the financial aid and scholarship opportunities available to them. Your startup guides students through the
application process, and, moreover, provides on-going support for students after they are admitted to help them
acclimatize to American culture and lifestyle and succeed academically. Of course, your firm is a social enterprise
startup, not a charity. At the end of the day, students pay for your services. After initial success, the company is
considering expanding its reach into a new market. Your job is to provide a recommendation.

Data: As part of this strategic decision, you’ve compiled a dataset of a few relevant market indicators for the
countries under consideration. Specifically, using publicly available data from Gapminder, you have gathered the
following metrics:

• The total male population as of 2015
• The total female population as of 2015
• The 2011 GDP per capita
• The % of total income earned by the top 10% highest earners in 2005
• The % of total income earned by the top 20% highest earners in 2005
• The amount of money (per capita) spent on higher-level (tertiary) education, expressed as a percentage of
GDP per capita
• The number of people between the ages of 20-39 in 2015

Data is stored in files labeled MarketEntry.txt (labels of the columns (variables) are in file
MarketEntryColumnLabels.txt) and Countries-Continents.txt.

Questions:

1. What are the 3 most populous countries in the dataset?

2. Which three countries have the highest GDP (measured in total $)? (Remember, GDP is different from GDP
per capita.) Hint: You may have to estimate this from the data you have.

3. Which three countries spend the most on higher education per capita?

4. For each continent, determine how many countries are in that continent in the data set. Hint: You may want to
use information in “Countries-Continents.csv”.

5. On average, how much does each continent spend on education ($ per capita)?

6. If we limit ourselves to countries with at least 6 million people between the ages of 20-39, how much does each
continent spend (on average) on education per capita?

7. There is, of course, tons of other publicly available data on the internet that you might use to help inform your
decision. Name 3 other metrics you would like to have on each country that you believe are likely publicly
available. Briefly explain why you would like each metric.

8. Which country would you recommend the startup enter next? Briefly explain your answer and support your
opinion with the data.

Case 2
Background and Objective: The advances in information technology have made it increasingly feasible to
collect and store tremendous amounts of data. For example, one regional supermarket chain collects two gigabytes
of data each day on customer purchases through the use of its store discount card. The question for the supermarket
chain is how knowledge can be extracted from these data and used to operate their business more efficiently and
effectively. Modern business analysts use Excel tools for organizing and analyzing data that enable them to perform
in a matter of seconds calculations that would have taken days to complete a generation ago.
You are hired as a consultant to perform the supermarket’s data analytics. Your analysis will help the supermarket
increase sales and revenue and target its marketing more strategically.

Data: The SupermarketTransactions.txt data file (column labels are in SupermarketTransactionsColumnLabels.txt)
contains over 14,000 transactions made by supermarket customers over a period of approximately two
years. One transaction corresponds to only one product category. Column B contains the date of purchase, column
C is the unique identifier for each customer, columns D-H contain information about the customer, columns I-K
contain the location of the store, columns L-N contain information about the product purchased, and the last two
columns indicate the number of items purchased and the amount paid. Sales tax rates are in
SupermarketTransactionsTax.xlsx.

• Note that all monetary values are in dollars.

• Amount paid includes tax: supermarket revenue = amount paid – tax. For example:

Example 1 (sales tax is 9%):
Amount paid = $37.40
Tax = $16.49 × 0.09 = $1.48
Supermarket revenue = $37.40 – $1.48 = $35.92

Example 2 (sales tax is 9%):
Amount paid = $17.50
Tax = $(4.99 + 3.00) × 0.09 = $0.72
Supermarket revenue = $17.50 – $0.72 = $16.78
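The revenue rule above can be checked with a short Python sketch. This is only an illustration, not part of the required Excel solution; the function and parameter names are made up here, and reading $16.49 and $(4.99 + 3.00) as the taxable part of each purchase is an assumption based on the two examples.

```python
def supermarket_revenue(amount_paid, taxable_amount, tax_rate):
    """Return (tax, revenue) in dollars, each rounded to cents.

    Assumption: tax is charged only on `taxable_amount`, and
    revenue = amount paid - tax, as stated in the case.
    """
    tax = round(taxable_amount * tax_rate, 2)
    revenue = round(amount_paid - tax, 2)
    return tax, revenue


# Example 1 from the case: $16.49 taxed at 9%.
print(supermarket_revenue(37.40, 16.49, 0.09))

# Example 2 from the case: $(4.99 + 3.00) taxed at 9%.
print(supermarket_revenue(17.50, 4.99 + 3.00, 0.09))
```

Both calls reproduce the tax and revenue figures given in the two examples.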

Questions:
1. The sales tax rate in each state and for each product category is provided in Taxes.xlsx. What is the total
government tax revenue generated for each country from the supermarket chain?

2. What percentage of purchases (transactions) are made by male customers?

3. What percentage of supermarket patrons are male?

4. Which product category generates the most income for the supermarket and what is the corresponding amount
of revenue for supermarket?

5. Analyze the supermarket revenue by product category and state. Which combination of product category and
state generates the most supermarket revenue? What is the corresponding amount of the supermarket revenue?

6. Which state sold the greatest number of items (units)? What is the most popular product category in that state?

7. Which day of the week on average generated the most sales (amount paid in dollars)? What is the corresponding
value?

8. Provide the year and the month which correspond to the greatest monthly sales revenue (amount paid in
dollars).

Case 3
Background and Objective: Trojan Crafts is a manufacturer of toys with 30 factories located across major
US cities. With over 15 years of toy production experience, Trojan Crafts has managed to set the standard for
delivering the highest quality products. Each factory manufactures and ships the toys directly to 20 international
markets. The company wants you to conduct production performance analysis.

Data: You are given data on daily shipments from each factory to each country during March 2012 (see file
TrojanCrafts.txt). Each row in the data set represents a daily shipment from a factory to a particular international market. The
per-unit manufacturing cost for each factory and the per-unit transportation cost from each factory to each market
are given in TrojanCraftsCost.xlsx.

Questions:
1. Which factory has the highest total manufacturing cost during March 2012? What is the corresponding total
manufacturing cost for that factory?

2. Which market receives the highest daily average shipment in March 2012? What is the corresponding daily
average number of units shipped to this market?

3. Which market generates the highest total transportation cost during March 2012? What is the corresponding
total cost?

4. Identify the top FIVE pairs of factories and markets that generate the highest total transportation costs in March
2012. What are the corresponding costs?

Additional information to answer questions 5. and 6.:

Due to high volatility in gas prices and the cost of raw materials, manufacturing and transportation costs are highly
variable. The Trojan Crafts analytics department has made projections for the change in manufacturing and
transportation costs for April 2012 and expressed the changes in percent (see file TrojanCraftsChangeInCost.xlsx).
For example, the predicted manufacturing cost per item for the New York factory for April 2012 is
$9.35 × (1 - 0.0074) = $9.28. Shipment volumes are predicted to increase by 10% in April.

5. Which factory is predicted to have the largest change in total manufacturing cost during April 2012? What is
the corresponding amount of change in dollars?

6. Which market is predicted to generate the highest total transportation cost during April 2012? What is the
corresponding total cost?
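The April projection described in the additional information for questions 5 and 6 can be sketched in Python as follows. This is a hedged illustration only: the function names are hypothetical, the change is assumed to be supplied as a signed fraction (for example -0.0074 for the New York factory), and total cost is assumed to equal units shipped times per-unit cost.

```python
def projected_unit_cost(current_cost, change):
    """Apply a signed fractional change, e.g. -0.0074 for a 0.74% decrease."""
    return current_cost * (1 + change)


def projected_total_cost(march_units, current_cost, change, volume_growth=0.10):
    """Projected April total cost.

    Assumptions: total cost = units x per-unit cost, and April volume is
    March volume increased by `volume_growth` (10% in the case).
    """
    april_units = march_units * (1 + volume_growth)
    return april_units * projected_unit_cost(current_cost, change)


# New York example from the case: $9.35 x (1 - 0.0074)
print(round(projected_unit_cost(9.35, -0.0074), 2))

# Hypothetical March volume of 1,000 units, for illustration only.
print(round(projected_total_cost(1000, 9.35, -0.0074), 2))
```

The first call reproduces the $9.28 figure quoted in the case; the 1,000-unit volume in the second call is invented for the example.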

Case 4
Background and Objective: Whether you know it or not, you are a "risk" in the eyes of insurance
companies. If you are like most people — you're not an Olympic athlete, and you don't have serious health
problems — then you are probably what is called a "standard risk". Standard risk individuals qualify for an
insurance company's standard rates. Underwriting is the process by which a life insurance company decides which
people to accept for insurance and on what terms. The main idea is that the risk premium covers the average
risk. If the premium is not adequate to the average risk, the company will probably make a loss. The
amount of extra risk represents the underwriter’s assessment of how much worse the applicant is in mortality
terms than a standard risk. An extra premium is an additional premium that the life insurance company charges on
top of its standard premium when an applicant is subject to an extra risk. Mortality cost is the biggest factor in
your premium rates and is calculated by evaluating the likelihood of dying during the policy term. Your life insurance
company will consider the following: health condition, age, gender, etc.

One of the simplest types of insurance is term life insurance, which provides coverage at a
fixed rate of payments for a limited period of time. After that period expires, coverage at the previous rate of
premiums is no longer guaranteed and the client must either forgo coverage or potentially obtain further coverage
with different payments or conditions.

Perform insurance data analysis to determine the amount of premiums paid, what part of that amount should be kept
in reserve to cover future liabilities, potential profit/loss, etc.

Data: On September 7, 2014, MetLife LADT’s office successfully issued 78 one-year term life insurance
contracts (InsuranceData.txt). The company’s actuaries and underwriters calculated annual premiums in the following
way:

Risk premium=Insurance amount × Death probability (depends on gender and age, see Mortality.xls) ×
(1+Extra risk factor).

Based on the value of the extra risk factor, risk is classified into 4 categories:

Extra risk =
    No risk,        if Extra risk factor = 0
    Low risk,       if 0 < Extra risk factor <= 0.2
    Moderate risk,  if 0.2 < Extra risk factor <= 0.5
    High risk,      if Extra risk factor > 0.5

Insurance premium = Insurance amount × Death probability (depends on gender and age) × (1+Extra risk
factor) + Policy fee,

where the value of the policy fee (covers salaries for underwriters, office rent, utilities, administrative expenses,
etc.) is based on extra risk level:

Policy fee =
    10,  if No risk
    15,  if Low risk
    20,  if Moderate risk
    25,  if High risk

Death probabilities, based on age and gender, can be found in MortalityTable.xls. Each policy has an insurance
amount of $20,000.
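The premium formulas above can be expressed as a small Python sketch. It is an illustration rather than the required Excel workbook; the function names are hypothetical, the extra risk factor is assumed to be non-negative, and the death probability used in the example (0.002) is made up rather than taken from the mortality table.

```python
INSURANCE_AMOUNT = 20_000  # from the case: every policy insures $20,000

# Policy fee by extra risk category, as defined in the case.
POLICY_FEE = {"No risk": 10, "Low risk": 15, "Moderate risk": 20, "High risk": 25}


def risk_category(extra_risk_factor):
    """Classify a non-negative extra risk factor into the four categories."""
    if extra_risk_factor == 0:
        return "No risk"
    if extra_risk_factor <= 0.2:
        return "Low risk"
    if extra_risk_factor <= 0.5:
        return "Moderate risk"
    return "High risk"


def risk_premium(death_probability, extra_risk_factor, amount=INSURANCE_AMOUNT):
    """Risk premium = insurance amount x death probability x (1 + extra risk factor)."""
    return amount * death_probability * (1 + extra_risk_factor)


def insurance_premium(death_probability, extra_risk_factor, amount=INSURANCE_AMOUNT):
    """Insurance premium = risk premium + policy fee for the risk category."""
    fee = POLICY_FEE[risk_category(extra_risk_factor)]
    return risk_premium(death_probability, extra_risk_factor, amount) + fee


# Hypothetical policy: death probability 0.002, extra risk factor 0.3.
print(risk_category(0.3))
print(round(risk_premium(0.002, 0.3), 2))
print(round(insurance_premium(0.002, 0.3), 2))
```

In Excel, the same lookup of a death probability by age and gender would typically be done against the mortality table before applying these formulas.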

Questions:
1. What is the average risk premium based on the policies issued on September 7, 2012?

2. What is the average insurance premium based on policies issued on September 7, 2012?

3. Find the average insurance premium for different categories of “Extra risk” characteristic for males and females
separately. Report the largest average insurance premium, and its corresponding “Extra risk” category and
gender.

4. Risk premiums help the company create technical provisions (reserves) to cover insurance events associated
with death risk. For the 78 life insurance policies, would the reserve funds be sufficient to cover one death? If not,
what is the solution to ensure that the company has sufficient reserves to cover 1 death? Briefly explain.

5. Calculate the average insurance premium by age, gender and “Extra risk” characteristic. Report the triplet (age,
gender and “Extra risk” characteristic) that corresponds to the largest average premium and provide the
corresponding value.

6. Assume that all 78 policies will be renewed for one more year and that no one’s health condition has changed
and that the insurance amount for each policy remained unchanged. The insurance company has decided to
increase the policy fee by 10% for the next year. What is the total amount of risk premiums the company will
collect during the second year?

Case 5
Background and Objective: A transportation network company (TNC) is a company that uses an online-enabled
platform to connect passengers with drivers using their personal, non-commercial vehicles. Examples
include ridesharing companies such as Uber, Lyft, and Sidecar. A TNC develops a computing platform
that creates an online marketplace in which a car owner registered with the company may offer their own labor
and car to customers who request a ride in the marketplace. The price of a ride is dynamic and depends on the
distance traveled as well as the daily gas prices.
A couple of USC Marshall MBA students decided to start a new TNC in Los Angeles and they called it Marshall
Rides. After almost a year of operation and collecting data about the Marshall Rides (March 15, 2014 through April
12, 2014), they wanted to run some analysis to see how their business is doing and what insights they can get from
the collected data to improve their business model. Unfortunately, they haven’t taken the GSBA 524 class yet, so
they are seeking to hire one of you! Please help them answer the following questions.

Data: The dataset, stored in MarshallRides.txt, has 1,568 observations and 6 variables. Each observation
represents a ride by a customer on a specific date. The 6 variables are listed in the following table with their
descriptions:

Variable            Description
Date                Date of the ride
Customer ID         A number to identify each customer
Age                 Age of the customer
Ride History        Number of times the customer has used Marshall Rides, including the current ride
Destination         Ride destination in Los Angeles
Distance Travelled  Distance travelled by the car in the current ride

Questions:
1. What is the age of the youngest customer(s)? What is the average age of all customers? And what is the age of
the oldest customer(s)?

2. What is the total number of different customers that have used Marshall Rides in this period?

3. Which date (day/month/year) corresponds to the largest total distance traveled? What is the corresponding
total value of distance traveled for that day? How many rides were made on that day?

4. The company wants to determine the busiest day of the week. Which day of the week corresponds to the largest
average distance traveled? What is the corresponding value of the average distance traveled for that weekday?
What is the average number of rides made on that weekday?

5. Marshall Rides is going to send a gift card to the top 5% of its most frequent customers. What are the IDs of
the top 5% of the most frequent customers between March 15 and March 31, 2014? Hint: to determine the
number of customers in the top 5%, round to the nearest integer.

6. The company classifies its customers with respect to age:

Age           Age Group
0 - 25        Young
25 - 40       Mature
40 and above  Old

Which age group has the highest travel history (in terms of number of rides)? What is the total number of rides
for that group?

7. Marshall Rides charges a base fare of $1.75 for each mile travelled. Given that gas prices change every day,
the ride fare is volatile. The company charges an additional amount per mile based on the destination; the rates
can be found in PriceIncrease.xlsx.

The company decided to give a 10% discount on rides from March 15 through April 12, 2014 to
customers who have used its services 10 times or more (excluding the current ride). Marshall Rides uses the
following formulas to charge its customers:

Ride Fare before Discount = (1.75 × Distance Travelled) + (Distance Travelled × PriceIncrease)

Ride Fare after Discount = Ride Fare before Discount × (1-Discount Rate)

What is the total revenue the company collects from March 15, 2014 through April 12, 2014?

8. How much revenue does the company lose because of the discount during March 15 through April 12, 2014?
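The fare formulas in question 7 can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the number of prior rides is taken to be Ride History minus one (because Ride History includes the current ride), and the per-mile price increase in the example (0.25) is invented rather than read from PriceIncrease.xlsx.

```python
BASE_RATE = 1.75      # dollars per mile, from the case
DISCOUNT_RATE = 0.10  # 10% discount, from the case


def fare_before_discount(distance, price_increase):
    """(1.75 x distance) + (distance x destination price increase)."""
    return BASE_RATE * distance + distance * price_increase


def fare_after_discount(distance, price_increase, ride_history):
    """Apply the discount when the customer has 10 or more prior rides.

    Assumption: prior rides = ride_history - 1, since Ride History
    counts the current ride.
    """
    fare = fare_before_discount(distance, price_increase)
    prior_rides = ride_history - 1
    if prior_rides >= 10:
        fare *= 1 - DISCOUNT_RATE
    return fare


# Hypothetical 8-mile ride with a $0.25 per-mile destination increase.
print(round(fare_before_discount(8, 0.25), 2))
print(round(fare_after_discount(8, 0.25, ride_history=11), 2))  # discounted
print(round(fare_after_discount(8, 0.25, ride_history=10), 2))  # not discounted
```

Summing the discounted fares over all rides gives total revenue, and summing the difference between the two fares gives the revenue lost to the discount.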

Case 6
Background and Objective: Marshall School of Business offers a wide variety of graduate business and
related majors:
Big Data Analytics        BDA
Business Statistics       BS
Accounting                AC
Marketing                 MK
Management                MG
Business Administration   BA
Information systems       IS
Finance                   FN
Operations management     OM

You are the grader for BUAD 516, a core class in data analysis and modeling at USC MSOB. At the end of a
semester your task is to assign final grades and perform grade analysis based on the following guidelines. The
syllabus of the course indicates the following: the course grades are based on four quizzes (the lowest of five
should be dropped), two individual projects, a team project, a midterm and a final exam.

The final grade will be based on the following weightings:

Assignment      Weight in %
Quizzes         12
Projects        20
Team Project    15
Midterm Exam    20
Final Exam      33

The letter grade will be assigned as follows:

Course total weighted   Course letter   Numerical value of
score range             grade           course grade
94 to 100               A               4
92 to 94                A-              3.7
90 to 92                B+              3.3
84 to 90                B               3
82 to 84                B-              2.7
80 to 82                C+              2.3
74 to 80                C               2
72 to 74                C-              1.7
70 to 72                D+              1.3
64 to 70                D               1
62 to 64                D-              0.7
below 62                F               0

Data: Gradebook data was downloaded from Blackboard and is stored in Gradebook.txt. The above tables are in
GradebookAdditionalInfo.xlsx. Note that all assignment and exam scores in the gradebook are based on a 0 to
100% scale.
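The grading scheme above can be sketched in Python as follows. This is an illustration under assumptions, not the course's official procedure: the function names are hypothetical, scores within a category are assumed to be averaged with equal weight, the lowest quiz is dropped before averaging, and each range is assumed to include its lower bound (so a score of exactly 92 counts as A-). The sample scores are invented.

```python
# Category weights from the syllabus table (they sum to 100%).
WEIGHTS = {"quizzes": 0.12, "projects": 0.20, "team_project": 0.15,
           "midterm": 0.20, "final": 0.33}

# (lower bound, letter grade, grade points); anything below 62 is an F.
GRADE_SCALE = [(94, "A", 4.0), (92, "A-", 3.7), (90, "B+", 3.3), (84, "B", 3.0),
               (82, "B-", 2.7), (80, "C+", 2.3), (74, "C", 2.0), (72, "C-", 1.7),
               (70, "D+", 1.3), (64, "D", 1.0), (62, "D-", 0.7)]


def weighted_score(quizzes, projects, team_project, midterm, final):
    """Course total on a 0-100 scale, dropping the single lowest quiz."""
    kept = sorted(quizzes)[1:]  # drop the lowest quiz score
    quiz_avg = sum(kept) / len(kept)
    project_avg = sum(projects) / len(projects)
    return (WEIGHTS["quizzes"] * quiz_avg
            + WEIGHTS["projects"] * project_avg
            + WEIGHTS["team_project"] * team_project
            + WEIGHTS["midterm"] * midterm
            + WEIGHTS["final"] * final)


def letter_grade(score):
    """Map a weighted score to (letter, grade points); lower bounds inclusive."""
    for lower, letter, points in GRADE_SCALE:
        if score >= lower:
            return letter, points
    return "F", 0.0


# Hypothetical student, for illustration only.
score = weighted_score([80, 90, 100, 70, 60], [90, 80], 88, 75, 82)
print(round(score, 2), letter_grade(score))
```

In Excel, the letter-grade step corresponds to an approximate-match lookup against the score-range table.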

Questions:
1. Which assignment has the highest average? What is the corresponding average?

2. On a 0 to 100 percent scale calculate the average final grade in the course. (Hint: do not forget that the final
grade is the weighted average!)

Additional information:

Students are usually concerned with GPA as it relates to their performance over a period of time in several courses;
however, professors must be concerned with the course GPA because sometimes a particular course is curved. On
a 0 to 4 scale calculate the average GPA for the course. Marshall School of Business standard specifies that the
curve on a 4.0 scale be 3.0 give or take 0.1.

3. On a 0 to 4 scale calculate the average final grade in the course.

4. Does the course need to be curved? Why? Please note that this question does not ask you to curve the grades!

5. Suppose that, instead of dropping the lowest quiz score, the professor decides to drop the two lowest quiz scores.
On a 0 to 4 scale, calculate the new average final grade in the course. To answer the questions that follow, drop
only the lowest quiz.

Additional information:

The following defines qualitative performance in the course: "Poor Performance!" for those students who have
grades F, D-, D or D+. "Below Average!" for those students who have grades C-, C, or C+. "Not Bad!" for those
students who have grades B-, B, or B+. "Excellent!" for those students who have grades A, or A-.

6. What is the distribution in percentage of the above 4 qualitative performance descriptors?

7. Which major corresponds to the lowest percentage of Below Average performance in the course? What is the
corresponding percentage value?

8. Students who receive 0 points on quizzes, projects, or exams are more likely to receive a “Poor Performance”.
Among students receiving a “Poor Performance”, what is the average number of quizzes, projects, or exams
with a 0 point value?

9. This semester, Mirzagaliyeva Shakhizada (ID 20006187) took the following courses: a 2-unit course, BUAD 525,
from Prof. Plotts (grade D+); a 4-unit course, DSO 524, from Prof. Ku (grade A-); a 6-unit course,
GSBA 545, from Prof. Porter (grade C+); and a 3-unit course, GSBA 524, from Prof. Thurston, for which
you need to determine the grade. Calculate the 4.0 scale GPA for Shakhizada.

10. Due to a very low average, Marshall has decided to inflate (“curve”) everybody’s total weighted score by the
percentage specified in the table below:

       A     A-    B+    B     B-    C+    C     C-    D+    D     D-    F
BDA    0.0   1.5   2.1   4.3   4.0   12.0  8.0   10.0  4.0   11.0  11.0  12.0
BS     0.0   1.8   2.3   2.8   4.0   4.0   9.0   8.0   4.0   11.0  3.0   14.0
AC     0.0   1.4   2.0   3.9   7.0   6.0   3.0   8.5   8.0   11.0  2.0   14.0
MK     0.0   1.3   2.7   4.1   7.0   4.0   6.0   8.0   5.0   2.0   4.0   14.0
MG     0.0   1.2   2.6   2.9   8.0   9.0   4.0   5.0   7.0   6.0   3.0   10.0
BA     0.0   1.6   2.5   3.6   6.0   9.0   9.0   9.0   8.0   10.0  8.0   13.0
IS     0.0   1.7   2.5   2.8   4.0   7.0   8.0   3.0   6.0   7.0   9.0   7.0
FN     0.0   1.9   2.4   4.7   7.0   8.0   9.0   5.0   3.0   4.0   9.0   12.0
OM     0.0   0.9   1.9   2.5   3.0   3.0   5.0   9.0   8.0   8.0   9.0   11.0

On a 0 to 4 scale calculate and report the new “inflated” average final grade in the course.
