Documente Academic
Documente Profesional
Documente Cultură
To download: Toolbar > File > Download > (Desired File Type)
☑ BusinessStage & Steps
☐ Data Understand the data, your questions, and your goals
I. Import Data •Download
Are youthe
simply exploring the data?in your coding environment
☐ Understandin
& Libraries
data and make it available
☐ Separate Data Types (Take an inventory of what data types you have)
II. Exploratory Initial Data Cleaning
☐ Data Analysis • Clean anything that would prevent you from exploring the data
☐ Visualize & Understand
• Understand how your data is distributed (numerical & categorical)
☐ III. Train/Test
Assess Missing Values (Don't fill/impute yet!)
• The goal here is to figure out your strategy for dealing with missing values
☐ Split
Set aside some data for testing.
• pd.concat
• pd.merge
• df.drop_duplicates()
• df.select_dtypes(['object', 'bool'])
• df.select_dtypes(['float', 'int'])
• pd.Series.str.replace()
• pd.Series.astype()
• df.value_counts()
• seaborn.distplot()
• df.isna().any()
• df.drop()
• sklearn.model_selection.train_test_split
• sklearn.model_selection.StratifiedShuffleSplit
• sklearn.impute.SimpleImputer
• sklearn.impute.IterativeImputer
• sum
• mean
• sklearn.preprocessing.StandardScaler
• sklearn.preprocessing.MinMaxScaler
• df.corr().abs()
• sklearn.feature_selection.VarianceThreshold
• sklearn.model_selection.train_test_split
• sklearn.model_selection.KFold
• sklearn.model_selection.GridSearchCV
• sklearn.model_selection.RandomizedSearchCV