Sunteți pe pagina 1din 8

Data Modelling and Databases (DMDB) ETH Zurich

Spring Semester 2017 Systems Group


Lecturer(s): Gustavo Alonso, Ce Zhang Date: March 20/March 27, 2017
Assistant(s): Claude Barthels, Eleftherios Sidirourgos, Last update: August 16, 2017
Eliza Wszola, Ingo Müller, Kaan Kara, Renato Marro-
quín, Zsolt István

Exercise 4: SQL
The exercises marked with * will be discussed in the exercise session. You can solve the other
exercises as practice, ask questions about them in the session, and hand them in for feedback.
All exercises may be relevant for the exam.

Ask Renato (marenato@inf.ethz.ch) for feedback on this week’s exercise sheet or give it
to the TA of your session (preferably stapled and with your e-mail address).

In this exercise you will write SQL queries and execute them on the databases that you have
to set up following the instructions found in the course website. For reference, the database
schema is provided where primary keys are underlined and foreign keys are in italic.

1 Employee Database*
The following queries refer to the employee database. For reference, the schema of this database
is shown in Figure 1.

departments(dept_no:string, dept_name:string)
dept_emp(emp_no:int, dept_no:string, from_date:date, to_date:date)
dept_manager(emp_no:int, dept_no:string, from_date:date, to_date:date)
employees(emp_no:int, birth_date:date, first_name:string, last_name:string,
gender:enum, hire_date:date)
salaries(emp_no:int, salary:int, from_date:date, to_date:date)
title(emp_no:int, title:string, from_date:date, to_date:date)

Figure 1: Employee database schema


Data Modelling and Databases (DMDB), Spring Semester 2017, Exercise 4: SQL, March
20/March 27, 2017 Legi-Nr:

1.1 Average salaries per department


The following three queries try to compute the average salary of all current employees per
department and to display department names and salary averages in descending order of the
average salary. Mark the correct solution:
SELECT d.dept_name, AVG(s.salary) AS avgs
FROM employees e
JOIN salaries s ON e.emp_no = s.emp_no
JOIN dept_emp de ON de.emp_no = s.emp_no
JOIN departments d ON d.dept_no = de.dept_no
WHERE s.to_date > NOW()
AND de.to_date > NOW()
GROUP BY d.dept_name
ORDER BY avgs DESC
SELECT d.dept_name, AVG(s.salary) AS avgs
FROM employees e
JOIN salaries s ON e.emp_no = s.emp_no
JOIN dept_emp de ON de.emp_no = s.emp_no
JOIN departments d ON d.dept_no = de.dept_no
GROUP BY d.dept_name
ORDER BY avgs DESC
SELECT e.dept_no, AVG(s.salary) AS avgs
FROM employees e
JOIN salaries s ON e.emp_no = s.emp_no
JOIN dept_emp de ON de.emp_no = s.emp_no
WHERE s.to_date > NOW()
AND de.to_date > NOW()
ORDER BY avgs DESC

1.2 Salaries
Consider the following query, which statement describes the best what the query is meant to do.

SELECT e.first_name, e.last_name, s.salary, avgst.avgs


FROM employees e
JOIN salaries s ON s.emp_no = e.emp_no
JOIN dept_manager dm ON dm.emp_no = e.emp_no
JOIN (

SELECT de.dept_no, AVG(s.salary) AS avgs


FROM employees e
JOIN salaries s ON e.emp_no = s.emp_no
JOIN dept_emp de ON de.emp_no = s.emp_no
WHERE s.to_date > NOW()
AND de.to_date > NOW()
GROUP BY de.dept_no

) avgst ON avgst.dept_no = dm.dept_no


WHERE dm.to_date > NOW()
Data Modelling and Databases (DMDB), Spring Semester 2017, Exercise 4: SQL, March
20/March 27, 2017 Legi-Nr:

AND s.to_date > NOW()


AND s.salary > avgst.avgs
It returns names of managers (first_name and last_name) who earn more than the
average salary of all employees that they have ever managed. It also returns depart-
ment’s name (dept_name), manager’s salary and department’s average salary.
It returns names of employees (first_name and last_name) who earn the most in
their department. It also returns department’s name (dept_name), employees’ salary
and department’s average salary.
It returns names of managers (first_name and last_name) who earn more than the
average salary of the current employees that they manage. It also returns department’s
name (dept_name), manager’s salary and department’s average salary.
It returns names of employees (first_name and last_name) who earn no more than
the average salary of the current employees of their same department. It also returns
department’s name (dept_name), and department’s average salary.

1.3 Employee Names (Substrings)


Which query or queries return the correct count of employees in the database whose first name
starts with the letter B?
SELECT count(first_name)
FROM employees
WHERE first_name LIKE ’B%’
SELECT count(first_name)
FROM employees
WHERE substring(first_name,0,1) in (’B’)
SELECT count(first_name)
FROM employees
WHERE first_name LIKE ’B’ OR first_name LIKE ’b’
SELECT count(first_name)
FROM employees
WHERE substring(first_name,1,1)=’B’

1.4 First employees query


Write a query that returns a list of the 10 first employees (first name (first_name) and last name
(last_name)) which the company hired (hire_date). The list should be sorted in alphabetical
order of the employees’ last names.
Hint: use the LIMIT keyword to constrain the number of output records.
Data Modelling and Databases (DMDB), Spring Semester 2017, Exercise 4: SQL, March
20/March 27, 2017 Legi-Nr:

1.5 1988
Write a query that determines for each department the average salary of anyone starting or
receiving a salary change in 1988 from that department. It should display the department
number, its name and its average.

1.6 Current managers


Return the first (first_name) and last name (last_name) of all current managers, their title,
their salary, and the department name (dept_name) for which they are responsible.
Hint: Use the NOW() function to obtain the current date and the from_date and to_date
fields in the salary and titles relations to find the corresponding records that are valid at the
current date.

2 ZVV Database
The following queries refer to the ZVV database. For reference, the schema of this database is
shown in Figure 2.

stops(stop_id:int, stop_name:string, stop_lat:float, stop_lon:float)


stop_times(trip_id:int , departure_time:string, arrival_time:string,
stop_id:int , stop_sequence:int)
trips(trip_id:int, tram_number:int, trip_headsign:string)

Figure 2: ZVV database schema

2.1 Trips and tram lines


Given the following query which is the statement that describes best the operation it performs.

SELECT COUNT(*) AS count_trips, tram_number


FROM trips
Data Modelling and Databases (DMDB), Spring Semester 2017, Exercise 4: SQL, March
20/March 27, 2017 Legi-Nr:

GROUP BY tram_number
ORDER BY count_trips DESC;

Finds how many tram lines (tram_number) are in a trip and orders its result in de-
scending order.
Finds the number of trips each tram line (tram_number) makes and orders its result
by number of trips in descending order.

Finds and counts the number of tram lines (tram_number) and orders them in descend-
ing order.
Finds the trips each tram line (tram_number) makes and orders its result by number
of trips in ascending order.

2.2 Busiest tram stations


The following queries try to compute the top 10 stations according to the number of times that
any tram stops at that station.
SELECT s.stop_id, s.stop_name
FROM stop_times st, stops s
WHERE s.stop_id = st.stop_id
GROUP BY st.stop_id, s.stop_name
ORDER BY count_stations desc;
SELECT count_stations, s.stop_name
FROM stop_times st, stops s
WHERE s.stop_id = st.stop_id
GROUP BY st.stop_id
ORDER BY count_stations desc
LIMIT 10;
SELECT COUNT(*) as count_stations, s.stop_name
FROM stop_times st, stops s
WHERE s.stop_id = st.stop_id
GROUP BY st.stop_id, s.stop_name
ORDER BY count_stations desc
LIMIT 10;
SELECT COUNT(*) as count_stations, s.stop_name
FROM stop_times st, stops s
WHERE s.stop_id = st.stop_id
GROUP BY st.stop_id
ORDER BY count_stations desc;

3 TPC-H Database
The following queries refer to the TPC-H database. For reference, the schema of this database
is shown in Figure 3.
Data Modelling and Databases (DMDB), Spring Semester 2017, Exercise 4: SQL, March
20/March 27, 2017 Legi-Nr:

Customer (customerid:integer, customername:string, nationid:integer )


Nation (nationid:integer, nationname:string, regionid:integer )
Orders (orderid:integer, customerid:integer , orderdate:date, orderpriority:string)
Orderline (orderid:integer , olid:integer, partid:integer , supplierid:integer ,
olquantity:float, oldiscount:float, olreturnflag:char, olshipdate:date,
olcommitdate:date, olreceiptdate:date)
Part (partid:integer, partname:string, partbrand:string,
parttype:string, partretailprice:float)
Region (regionid:integer, regionname:string)
Supplier (supplierid:integer, suppliername:string, nationid:integer )
Supplypart (partid:integer , supplierid:integer , supplypartcost:float)

Figure 3: TPC-H Schema

3.1 Customers from Japan


Return how many customers come from ’JAPAN’ (nationname).

3.2 Nations from multiple regions *


Given the following query that tries to list all nations in the regions of ’AMERICA’, ’AFRICA’
or ’ASIA’.

SELECT nationname
FROM nation, region
WHERE __________________
________________________
________________________;

Which of the following predicates makes the query perform the task correctly:
regionname = ’AMERICA’ OR
regionname = ’AFRICA’ OR
regionname = ’ASIA’
regionname = ’AMERICA’ AND
regionname = ’AFRICA’ AND
regionname = ’ASIA’
nation.regionid = region.regionid AND
(regionname = ’AMERICA’ OR
regionname = ’AFRICA’ OR regionname = ’ASIA’)
nation.regionid = region.regionid AND
(regionname = ’AMERICA’ AND
regionname = ’AFRICA’ AND regionname = ’ASIA’)
Data Modelling and Databases (DMDB), Spring Semester 2017, Exercise 4: SQL, March
20/March 27, 2017 Legi-Nr:

3.3 Exclusive supplier *


How many parts are supplied by the supplier with suppliername ’Supplier#000000001’ but
not supplied by the supplier with suppliername ’Supplier#000000002’?

3.4 First order *


Consider the following query that retrieves the date (orderdate) of the first order made by
customer with name (customername) ’Customer#000000001’

SELECT orderdate
FROM orders, customer
WHERE orders.customerid = customer.customerid
AND customername = ’Customer#000000001’
ORDER BY orderdate ASC
LIMIT 1;

Choose the correct statements:


A. No join is actually needed. We could use only the data from the orders table.
B. The join between the two relations is done on the WHERE clause.
C. The extra predicate checking for ’Customer#00000001’ on the WHERE clause is re-
dundant as we are going to retrieve it anyway.

D. The ORDER BY statement is actually not needed to get the data needed.
E. The ORDER BY and LIMIT statements help us getting the actual first order.

3.5 Customers with local shipments


Find the number of customers that have made at least one order where at least one part is
supplied by the same nation as the customer.
Data Modelling and Databases (DMDB), Spring Semester 2017, Exercise 4: SQL, March
20/March 27, 2017 Legi-Nr:

3.6 Revenue and profit margin


Find the total revenue and profit margin (in percent) per year for all orders and orderlines.
Revenue per orderline is defined as olquantity *partretailprice*(1 - oldiscount). Profit
margin is calculated as revenue−costs
revenue where cost per orderline is olquantity * supplypartcost.
Sort the final result in ascending order of the year.
Hint: use EXTRACT(YEAR FROM orderdate) to extract the year.

S-ar putea să vă placă și