
Project details: Python with MySQL / Data Science tools (NumPy/Pandas/Matplotlib)

Computer Society Of India

1. Matrix Operations and Linear Equations using Numerical Python (numpy). The program should
provide on-screen menus for the following matrix operations.

Write Python programs for the following menu-based activities.


Menus:
- Build a matrix. (The program should input the number of rows/columns, followed by the
number of elements of the matrix. Check that the no. of elements == rows x columns. Proper
Python exception handling should be used to validate keyboard input.)
Save the matrix to a disk file under a name of your choice.
- Display the matrix on screen (after loading the matrix file from disk).
- Find the determinant, inverse and transpose of the matrix, and whether it is symmetric or
skew-symmetric, after loading the matrix file from disk.
- Input the coefficient and constant matrices for 3 linear equations. The program should display the
3 equations as finally built. Save these 2 matrices to disk files under names of your
choice.
- Solve the linear equations. The program should display the values of x, y, z as the solution matrix.
- Exit
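
The matrix menu items above could be backed by a few small helpers. What follows is only a minimal sketch: the function names, the `.npy` file format and the sample numbers are assumptions made for illustration, not requirements of the assignment.

```python
import os
import tempfile

import numpy as np


def build_matrix(rows, cols, tokens):
    """Return a rows x cols matrix built from text tokens typed by the user.

    Raises ValueError if a token is not a number or the count is not rows * cols.
    """
    values = [float(tok) for tok in tokens]
    if len(values) != rows * cols:
        raise ValueError(f"expected {rows * cols} elements, got {len(values)}")
    return np.array(values).reshape(rows, cols)


def describe(matrix):
    """Compute the properties listed in the menu for a square matrix."""
    det = np.linalg.det(matrix)
    return {
        "determinant": det,
        "transpose": matrix.T,
        # A singular matrix has no inverse, so report None in that case.
        "inverse": None if np.isclose(det, 0.0) else np.linalg.inv(matrix),
        "symmetric": np.allclose(matrix, matrix.T),
        "skew_symmetric": np.allclose(matrix, -matrix.T),
    }


def solve_system(coefficients, constants):
    """Solve coefficients @ [x, y, z] = constants for the unknowns."""
    return np.linalg.solve(coefficients, constants)


# Round trip through a disk file (hypothetical name), then inspect the matrix.
path = os.path.join(tempfile.mkdtemp(), "matrix.npy")
np.save(path, build_matrix(2, 2, "4 7 2 6".split()))
print(describe(np.load(path)))

# Sample system: 2x + y - z = 8, -3x - y + 2z = -11, -2x + y + 2z = -3
a = build_matrix(3, 3, "2 1 -1 -3 -1 2 -2 1 2".split())
b = np.array([8.0, -11.0, -3.0])
solution = solve_system(a, b)
print(solution)
```

In a menu program, the `ValueError` raised by `build_matrix` would typically be caught around the keyboard input so that the user can re-enter the values.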

2. Data Analysis using the Pandas library on CSV files, with subsequent visualization using
Matplotlib. The program should perform the following:

a. Load and prepare the dataset from the CSV file “empmst.csv”.

- Load the data from the employee master CSV file (empmst.csv) into a DataFrame (DF). Sample
data for 'empmst.csv' has already been shared via mail attachment.
- Make the 'Empno' column the index column.
- Use ‘parse_dates’ while importing the DOB column.
b. Access/print all the columns using the “.” operator.
c. Add a column ‘CONV’ to the DF as 10% of Salary.
d. Add a column 'Total' as the sum of Salary + HRA + CONV.
e. Find the sum, mean and standard deviation of Total, Salary, HRA and CONV,
gender-wise (M, F).
f. Create a new DF called “statewise_total” which gives the “group by sum()” of
Salary, HRA, CONV and Total, with State as the index column.
g. Plot a bar graph showing “State” on the X-axis and the other numeric columns as bars.
h. Put a proper title and labels on the X and Y axes.
i. Convert the DF with the added columns to an Excel file using the “xlwt” module.
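
Steps (a), (c), (d), (f), (g) and (h) might be sketched as follows. The real 'empmst.csv' is not reproduced here, so the inline sample data, the 'Gender' and 'Name' column names, the helper names and the chart labels are all assumptions made for illustration.

```python
import io

import matplotlib
matplotlib.use("Agg")  # render off-screen; drop this line for an interactive window
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for empmst.csv; the real file's columns may differ.
SAMPLE_CSV = """Empno,Name,Gender,DOB,State,Salary,HRA
101,Asha,F,1990-04-12,WB,50000,10000
102,Ravi,M,1988-11-03,WB,40000,8000
103,Meera,F,1992-07-25,KA,60000,12000
104,Arun,M,1985-01-30,KA,35000,7000
"""


def load_employees(source):
    """Read the employee master with Empno as index and DOB parsed as dates."""
    return pd.read_csv(source, index_col="Empno", parse_dates=["DOB"])


def add_pay_columns(df):
    """Add CONV (10% of Salary) and Total (Salary + HRA + CONV)."""
    out = df.copy()
    out["CONV"] = out["Salary"] * 0.10
    out["Total"] = out["Salary"] + out["HRA"] + out["CONV"]
    return out


def statewise(df):
    """Sum the pay columns per state; State becomes the index."""
    return df.groupby("State")[["Salary", "HRA", "CONV", "Total"]].sum()


def plot_statewise(totals, path):
    """Draw a grouped bar chart with State on the X axis and save it to path."""
    ax = totals.plot(kind="bar")
    ax.set_title("State-wise pay totals")
    ax.set_xlabel("State")
    ax.set_ylabel("Amount")
    ax.figure.tight_layout()
    ax.figure.savefig(path)
    plt.close(ax.figure)


empdf = add_pay_columns(load_employees(io.StringIO(SAMPLE_CSV)))
statewise_total = statewise(empdf)
print(statewise_total)
```

With the real file, `load_employees("empmst.csv")` would replace the `io.StringIO` call. For step (i), note that pandas 2.0 removed its xlwt writer, so on a recent pandas the '.xls' output would have to be written by calling xlwt directly, or the file written as '.xlsx' instead.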

3. As ‘root’ in MySQL, perform the following pre-requisite activities.

Pre-requisites

a. Create a database. The name of the database is your choice.
b. Create a user. The user name and password are your choice. Grant all permissions on
this newly created database to the user.
c. Log out from ‘root’ and log in as this newly created user.

d. Write Python programs for the following using the “mysql.connector” module:


- Python program to create a table called ‘loanmst’ in your default database with
the following columns:
--- Empno int, not null, primary key
--- LoanAmt float (cannot be <= 0)
--- LoanInstNumber int
- Python program to insert a record with the above-mentioned columns.
- Python program to query the loan details of an employee whose Empno is
input from the keyboard.
A proper message should be displayed if the Empno does not exist.
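
The three 'loanmst' programs could share helpers along these lines. This is a sketch under assumptions: the function names and message texts are hypothetical, and the connection parameters are whatever was chosen in the pre-requisites. MySQL enforces CHECK constraints only from version 8.0.16 onward, so the amount is also validated in Python.

```python
CREATE_LOANMST = """
CREATE TABLE IF NOT EXISTS loanmst (
    Empno INT NOT NULL PRIMARY KEY,
    LoanAmt FLOAT CHECK (LoanAmt > 0),
    LoanInstNumber INT
)
"""


def open_connection(user, password, database, host="localhost"):
    """Connect as the non-root user created in the pre-requisites."""
    import mysql.connector  # imported here so the helpers below load without it

    return mysql.connector.connect(
        host=host, user=user, password=password, database=database
    )


def create_table(conn):
    """Create loanmst if it does not exist yet."""
    cur = conn.cursor()
    cur.execute(CREATE_LOANMST)
    cur.close()


def insert_loan(conn, empno, amount, instalments):
    """Insert one loan record; reject non-positive amounts before touching the DB."""
    if amount <= 0:
        raise ValueError("LoanAmt must be greater than 0")
    cur = conn.cursor()
    cur.execute(
        "INSERT INTO loanmst (Empno, LoanAmt, LoanInstNumber) VALUES (%s, %s, %s)",
        (empno, amount, instalments),
    )
    conn.commit()
    cur.close()


def query_loan(conn, empno):
    """Return the (Empno, LoanAmt, LoanInstNumber) row, or None if absent."""
    cur = conn.cursor()
    cur.execute(
        "SELECT Empno, LoanAmt, LoanInstNumber FROM loanmst WHERE Empno = %s",
        (empno,),
    )
    row = cur.fetchone()
    cur.close()
    return row


def report(conn, empno):
    """Build the message shown to the user for a loan query."""
    row = query_loan(conn, empno)
    if row is None:
        return f"Empno {empno} does not exist in loanmst."
    return f"Empno {row[0]}: amount {row[1]}, instalments {row[2]}"
```

Passing values through the `%s` placeholders, rather than formatting them into the SQL string, lets the connector handle quoting of keyboard input.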

Write Python programs for the following menu-based activities.

Menu:
- Addition of employee loan details
- Deletion of loan details based on an Empno which is input from the keyboard
- Query of the loan details of an employee whose Empno is input from the keyboard
- Exit
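
A menu like the one above is often driven by a small dispatch loop. The sketch below is one possible shape, not a prescribed design: `run_menu` and the placeholder handlers are hypothetical names, and the `read`/`write` parameters exist only so the loop can be exercised without a keyboard.

```python
def run_menu(actions, read=input, write=print):
    """Show numbered options and dispatch until the entry mapped to None is chosen.

    actions maps menu text to a zero-argument callable, or to None for Exit.
    """
    labels = list(actions)
    while True:
        for number, label in enumerate(labels, start=1):
            write(f"{number}. {label}")
        try:
            index = int(read("Enter choice: "))
        except ValueError:
            index = 0  # non-numeric input falls through to the range check
        if not 1 <= index <= len(labels):
            write("Invalid choice, try again.")
            continue
        handler = actions[labels[index - 1]]
        if handler is None:  # the Exit entry ends the loop
            return
        handler()


if __name__ == "__main__":
    # Placeholder handlers; a real program would call the loan routines here.
    run_menu({
        "Addition of Employee Loan Details": lambda: print("add selected"),
        "Deletion of Loan details": lambda: print("delete selected"),
        "Query of loan details": lambda: print("query selected"),
        "Exit": None,
    })
```

To keep each sub-menu in its own .py file, a handler could import that module and call its entry function, or launch it with `subprocess.run([sys.executable, "some_submenu.py"])`, where the script name is again only an example.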

4. As ‘root’ in MySQL, perform the following pre-requisite activities.

Pre-requisites

a. Create a database. The name of the database is your choice.
b. Create a user. The user name and password are your choice. Grant all permissions on
this newly created database to the user.
c. Log out from ‘root’ and log in as this newly created user.
d. Write Python programs for the following using the “mysql.connector” module:
- Python program to create a table called ‘itemmst’ in your default database with
the following columns:
--- ItemId int, not null, primary key
--- ItemDesc char(40), cannot be null
--- ItemRate float, must be > 0

Write Python programs for the following menu-based activities.


Menu:
- Addition of item details
- Deletion of item details based on an ItemId which is input from the keyboard
- Query of item details based on an ItemId which is input from the keyboard
- Exit
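
For the 'itemmst' table, the column rules translate into a DDL string, and the delete option can report whether the ItemId existed by looking at the cursor's `rowcount`. This is a rough sketch: the helper names and prompt text are assumptions, and `conn` is expected to be a connection obtained from `mysql.connector.connect`.

```python
CREATE_ITEMMST = """
CREATE TABLE IF NOT EXISTS itemmst (
    ItemId INT NOT NULL PRIMARY KEY,
    ItemDesc CHAR(40) NOT NULL,
    ItemRate FLOAT CHECK (ItemRate > 0)
)
"""


def read_item_id(read=input, write=print):
    """Keep asking until the user types a whole number."""
    while True:
        try:
            return int(read("ItemId: "))
        except ValueError:
            write("ItemId must be a whole number.")


def delete_item(conn, item_id):
    """Delete one item; return True if a row was removed, False if it was absent."""
    cur = conn.cursor()
    cur.execute("DELETE FROM itemmst WHERE ItemId = %s", (item_id,))
    deleted = cur.rowcount > 0  # rows affected by the DELETE
    conn.commit()
    cur.close()
    return deleted
```

A caller might print "Item deleted" or "ItemId not found" depending on the returned flag.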

5. Load the dataset from ‘empmst.csv’ into a Pandas DataFrame called ‘empdf’ and then
perform the following:

a. Load and prepare the dataset from the CSV file “empmst.csv”.

- Load the data from the employee master CSV file (empmst.csv) into a DF.
Sample data for 'empmst.csv' has already been shared via mail attachment.
- Make the 'Empno' column the index column.
- Use ‘parse_dates’ while importing the DOB column.

b. Access/print all the columns using the “.” operator.
c. Add a column called ‘SpAllow’ which is 30% of Salary.
d. Calculate the ‘Total’ column as Salary + HRA + SpAllow.
e. Find the sum, mean and standard deviation of Total, Salary, HRA and SpAllow,
gender-wise (M, F).
f. Create a new DF called “citywise_total” which gives the “group by sum()” of
Salary, HRA, SpAllow and Total.
g. Plot a bar graph showing “State” on the X-axis and the other numeric columns as bars.
h. Put a proper title and labels on the X and Y axes.
i. Convert the new DF to an Excel file using the ‘xlwt’ module. Use ‘pip’ to install the module.
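
Steps (b) to (e) of this task might look roughly like the sketch below. The inline sample rows and the 'Gender' and 'Name' column names are assumptions, since the real 'empmst.csv' is not reproduced here.

```python
import io

import pandas as pd

# Hypothetical stand-in for empmst.csv; the real file's columns may differ.
SAMPLE_CSV = """Empno,Name,Gender,DOB,State,Salary,HRA
101,Asha,F,1990-04-12,WB,50000,10000
102,Ravi,M,1988-11-03,WB,40000,8000
103,Meera,F,1992-07-25,KA,60000,12000
104,Arun,M,1985-01-30,KA,35000,7000
"""

empdf = pd.read_csv(io.StringIO(SAMPLE_CSV), index_col="Empno", parse_dates=["DOB"])

# Attribute ("." operator) access: empdf.Salary is the same Series as empdf["Salary"].
print(empdf.Salary)
print(empdf.HRA)

# New columns must be created with bracket syntax; "." only reads existing ones.
empdf["SpAllow"] = empdf["Salary"] * 0.30
empdf["Total"] = empdf["Salary"] + empdf["HRA"] + empdf["SpAllow"]

# Gender-wise sum, mean and sample standard deviation of the pay columns.
pay_columns = ["Total", "Salary", "HRA", "SpAllow"]
gender_stats = empdf.groupby("Gender")[pay_columns].agg(["sum", "mean", "std"])
print(gender_stats)
```

The "." form only works for column names that are valid Python identifiers and do not clash with an existing DataFrame attribute or method, which is why the bracket form is the safer general choice.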
6. Consider the Excel file for the Titanic disaster (titanic.xls), which was sent separately.
Convert the Excel file into CSV format using Pandas with the ‘xlrd’ module. This file has the
following columns, with their definitions given below.

Variable   Definition                                   Key
pclass     Ticket class                                 1 = 1st, 2 = 2nd, 3 = 3rd
survived   Survival                                     0 = No, 1 = Yes
name       Name of passenger
sex        Sex
Age        Age in years
sibsp      # of siblings / spouses aboard the Titanic
parch      # of parents / children aboard the Titanic
ticket     Ticket number
embarked   Port of Embarkation                          C = Cherbourg, Q = Queenstown, S = Southampton

Find the following:

- How many people survived vs. did not survive the Titanic disaster?
- Number of males that survived vs. number of males who did not survive.
- Number of females that survived vs. number of females who did not survive.
- Normalize the survival rate (output as a decimal fraction, which helps to express it as a
percentage).
(Hint: print(df["survived"][df["sex"] == 'female'].value_counts(normalize=True)))
- Create a column called ‘Child’. Assign 1 to passengers under 18, 0 to those
18 or older. Print the new column.
Hint: df.loc[df["Age"] < 18, "child"] = 1
- Print the normalized survival rates for passengers under 18 and for those 18 or older.
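
The Titanic questions above could be approached as in the sketch below. The six rows are made-up stand-ins for 'titanic.xls', used only so the snippet runs on its own, and the variable names and `convert_xls_to_csv` helper are hypothetical.

```python
import pandas as pd


def convert_xls_to_csv(xls_path, csv_path):
    """Read the workbook (pandas relies on xlrd for .xls) and write it as CSV."""
    pd.read_excel(xls_path).to_csv(csv_path, index=False)


# Made-up rows standing in for titanic.xls; only the columns used below.
df = pd.DataFrame({
    "survived": [1, 0, 1, 0, 1, 0],
    "sex": ["female", "male", "female", "male", "male", "female"],
    "Age": [29.0, 40.0, 2.0, 17.0, 30.0, 50.0],
})

# Survived vs. not survived, overall and split by sex.
overall = df["survived"].value_counts()
by_sex = df.groupby("sex")["survived"].value_counts()

# Normalized rates: fractions that sum to 1 within each group.
female_rate = df["survived"][df["sex"] == "female"].value_counts(normalize=True)
male_rate = df["survived"][df["sex"] == "male"].value_counts(normalize=True)

# Child flag: start at 0, then set 1 where Age is under 18.
df["child"] = 0
df.loc[df["Age"] < 18, "child"] = 1
child_rate = df.groupby("child")["survived"].value_counts(normalize=True)

print(overall, by_sex, female_rate, male_rate, df["child"], child_rate, sep="\n\n")
```

Rows with a missing Age compare as False against 18, so in this sketch they keep the value 0 in the 'child' column; whether that is the desired treatment is a choice left to the project.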

Notes:
- Use Python modules like numpy (including numpy.linalg for linear algebra), pandas, matplotlib, xlwt, xlrd
as required.
- For the menu-based programs, there should be a main program which
invokes each sub-menu program via a separate Python program (.py).
Underlined words should be your menu text.
- Provide a proper project title of your choice.
