
1. Which of the following statements is true?

One can estimate the votes for a presidential candidate in a forthcoming election by? Conduct
a poll of a random sample of polling stations (TPS) in the country
2. Which one of the following is not an example of statistics? Gini Index
3. Which of the following is an example of time series data? Average batting average of a
baseball player, 2 and 3
4. Which of the following is an example of multivariate data? Vital signs recorded for a
newborn baby
5. Which of the following is not an example of big data? The number of football players in
FIFA
6. Which of the following is an example of categorical data? Mode of fashion in a certain
year
7. Which of the following is not an example of ordinal data? Number of trees in a park
8. A mean is meaningful for the following type of data? Ratio data
9. You have two datasets. The first has 1000 customers, the second 1500. Both datasets have
900 records in common. 900 records
10. Consider the dataframe “df”. What does the command df.rename(columns={'a':'b'})
change about dataframe “df”? Nothing, as you must set the parameter “inplace=True”
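
The answer to question 10 can be checked with a minimal sketch. The sample frame and the variable names below are hypothetical; only `df.rename(columns=...)` and `inplace=True` come from the question.

```python
import pandas as pd

# Hypothetical sample frame; only the column name 'a' comes from the question.
df = pd.DataFrame({"a": [1, 2, 3], "c": [4, 5, 6]})

# By default rename returns a new DataFrame and leaves df untouched.
renamed = df.rename(columns={"a": "b"})
cols_before = list(df.columns)
cols_renamed = list(renamed.columns)

# With inplace=True the call modifies df itself and returns None.
result = df.rename(columns={"a": "b"}, inplace=True)
cols_after = list(df.columns)

print(cols_before, cols_renamed, result, cols_after)
```

Assigning the return value back (`df = df.rename(...)`) is an equally valid way to keep the change.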
11. Consider the column of the dataframe df['a']. The column has been standardized. What is
the standard deviation of the values, i.e. the result of applying the following operation
df['a'].std()? 1
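
A small sketch of question 11, assuming "standardized" means a z-score computed with the sample standard deviation that pandas uses by default (`ddof=1`). The data values are made up for illustration.

```python
import pandas as pd

# Made-up values for illustration.
df = pd.DataFrame({"a": [2.0, 4.0, 4.0, 5.0, 7.0, 9.0]})

# z-score: subtract the mean, divide by the sample standard deviation (ddof=1).
df["a"] = (df["a"] - df["a"].mean()) / df["a"].std()

mean_after = df["a"].mean()
std_after = df["a"].std()
print(mean_after, std_after)
```

If the column were instead scaled with the population standard deviation (`ddof=0`), `df['a'].std()` would come out slightly above 1, by a factor of sqrt(n / (n - 1)).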
12. What is the Pearson correlation between variables X and Y, if X = 0.9 * Y? 1
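
Question 12 can be verified numerically. This is a sketch with arbitrary values for Y; the point is that any positive scaling gives a correlation of 1, while a negative scaling would give -1.

```python
import numpy as np

# Arbitrary illustrative values for Y.
y = np.array([1.0, 2.0, 4.0, 7.0, 11.0])
x = 0.9 * y

# np.corrcoef returns the 2x2 correlation matrix; [0, 1] is corr(x, y).
r = np.corrcoef(x, y)[0, 1]

# Flipping the sign of the scale factor flips the sign of the correlation.
r_neg = np.corrcoef(-0.9 * y, y)[0, 1]
print(r, r_neg)
```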
13. Consider the dataframe “df”, with categorical column “categories”. What would be the
output of the following command: df['categories'].value_counts()[:20].index.tolist()? A
Python list showing the top 20 categories that appear the most, without their number of
occurrences
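
The chain in question 13 can be sketched on a tiny hypothetical frame. It uses a top 2 instead of a top 20 so the result is easy to read.

```python
import pandas as pd

# Hypothetical data: 'b' appears 3 times, 'a' twice, 'c' once.
df = pd.DataFrame({"categories": ["b", "a", "b", "c", "b", "a"]})

# value_counts sorts by frequency (descending); the index holds the labels.
counts = df["categories"].value_counts()

# Slice the first entries, then keep only the labels as a plain Python list.
top = counts[:2].index.tolist()
print(top)
```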
14. Based on the data frame sample below, write a Pandas program to create and display a
DataFrame from a specified dictionary data which has the index labels. Example DataFrame:
exam_data={'nama':['anastasia','dirna','katherine','james','emily'…}
df = pd.DataFrame(exam_data, index=labels)
print(df)
15. Based on the sample below, write a Pandas program to append a new row to the DataFrame with
given values for each column. Now delete the new row and return the original data frame.
Example DataFrame: exam_data={'name':['Anastasia','Dirna','katherine','james'…
df.loc['k'] = [1, 'Suresh', 'yes', 15.5]
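
The answer to question 15 shows only the append step. A minimal sketch of both steps follows. The two-row `exam_data`, its column names (`attempts`, `name`, `qualify`, `score`) and the index labels are assumptions, since the original dictionary is truncated; only the new row and its label `'k'` come from the answer.

```python
import pandas as pd

# Hypothetical stand-in for the truncated exam_data; column names are assumed
# so that they line up with the four values of the appended row.
exam_data = {
    "attempts": [1, 3],
    "name": ["Anastasia", "Dirna"],
    "qualify": ["yes", "no"],
    "score": [12.5, 9.0],
}
labels = ["a", "b"]  # assumed index labels
df = pd.DataFrame(exam_data, index=labels)

# Append: assigning to a new label with .loc enlarges the frame by one row.
df.loc["k"] = [1, "Suresh", "yes", 15.5]
rows_with_k = len(df)

# Delete: drop returns a new frame without the row labelled 'k'.
df = df.drop("k")
print(df)
```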
16. You surmise that the two arrays must have the same space allocated? Print the flags of both
arrays with e.flags and f.flags; check the flag “OWNDATA”. If one of them is False, then both
the arrays have the same space allocated
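
A sketch of the check in question 16, assuming `f` was created as a view of `e` (here via `reshape`; the original question does not show how the arrays were built). `np.shares_memory` is added as a more direct test, since `OWNDATA` being False only says an array borrows its buffer from some other object.

```python
import numpy as np

e = np.arange(6)
f = e.reshape(2, 3)  # assumed setup: reshaping a contiguous array gives a view
g = e.copy()         # an independent copy, for contrast

e_owns = bool(e.flags["OWNDATA"])
f_owns = bool(f.flags["OWNDATA"])
g_owns = bool(g.flags["OWNDATA"])

# Direct check of whether two arrays overlap in memory.
ef_shared = bool(np.shares_memory(e, f))
eg_shared = bool(np.shares_memory(e, g))
print(e_owns, f_owns, g_owns, ef_shared, eg_shared)
```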
17. Suppose you want to join the train and test datasets (both are numpy arrays, train_set and
test_set) into a resulting array (resulting_set). resulting_set = np.vstack([train_set, test_set])
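
A minimal sketch of question 17 with made-up shapes. `np.vstack` stacks rows, so both arrays need the same number of columns.

```python
import numpy as np

# Made-up data: 3 training rows and 2 test rows, 2 features each.
train_set = np.ones((3, 2))
test_set = np.zeros((2, 2))

# Stack vertically: rows of test_set are placed below rows of train_set.
resulting_set = np.vstack([train_set, test_set])
print(resulting_set.shape)
```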
18. Which command will be appropriate to fill missing values while reading the file with numpy?
filling_values = ("*", 0, 01/01/2010, 0)
temp = np.genfromtxt(filename, filling_values=filling_values)
19. Which of the following is a preferred measure of central tendency given the data is severely
skewed? Median
20. Median represents a value in the data set where: Half of the observations are above the
median and the other half below it
21. The following is the right statement about Numpy: Numpy is a library for the Python
programming language, adding support for large, multi-dimensional arrays and matrices,
along with a large collection of high-level mathematical functions to operate on the arrays
22. What is the result of the following operation in Python: 3 + 2 * 2? 7
23. In Python, if you executed var = '01234567', what would be the result of print(var[::2])?
0246
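
The slice in question 23 uses the form `[start:stop:step]`. A short sketch, assuming the string is '01234567' (which is what the answer 0246 implies):

```python
var = "01234567"

# Step 2 from the start: characters at indices 0, 2, 4, 6.
evens = var[::2]

# Same step, starting at index 1: characters at indices 1, 3, 5, 7.
odds = var[1::2]

# A negative step walks the string backwards.
backwards = var[::-1]
print(evens, odds, backwards)
```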
24. In Python, what is the result of the following operation: '1' + '2'? '12'
25. Given myWord = 'hello', how would you convert myWord into uppercase?
myWord.upper()
26. After applying the following method, l.append(('a','b')), the following list will only be one
element longer. True
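
Question 26 turns on the difference between `append` and `extend`. A small sketch with hypothetical starting lists:

```python
# Hypothetical starting lists.
l = [1, 2]
m = [1, 2]

# append adds its argument as ONE element, even if that argument is a tuple.
l.append(("a", "b"))

# extend iterates over its argument and adds each item separately.
m.extend(("a", "b"))

print(len(l), l)
print(len(m), m)
```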
27. What is an important difference between lists and tuples? Lists are mutable, tuples are not
28. Dict={"a":1,"b":"2","c":[3,3,3],"D":(4,4,4) ………… What is the result of the following
operation: Dict["D"]? (4,4,4)
29. What is the correct way to sort the list 'myData' using a method? The result should not
return a new list, just change the list 'myData'. myData.sort()
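
For question 29, the contrast between the in-place method and the built-in function can be sketched as follows. The list contents are made up.

```python
# Made-up data.
myData = [3, 1, 2]
other = [3, 1, 2]

# list.sort() reorders the list in place and returns None.
returned = myData.sort()

# sorted() leaves its argument alone and returns a new list.
new_list = sorted(other)

print(returned, myData, other, new_list)
```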
30. What are the keys of the following dictionary: {'a':1,'b':2}? 'a', 'b'
31. Which one of the following statements best describes the Python scikit library? A collection
of algorithms and tools for machine learning
32. Suppose a media content website wants to improve its customer experience by
providing a recommendation system that tells users what content is popular
among their neighbours that they might also like. Collaborative filtering recommender
system
33. In comparison to supervised learning, unsupervised learning has? Fewer tests (evaluation
approaches)
34. Which one of the following statements is the most accurate? Machine learning is the branch
of AI that covers the statistical and learning part of AI
35. What would be the result of list2 from the following code: array([1, 12, 3, 4])
36. What do the following lines of code do? Read the file “exercise.txt”
37. What is the result of applying the following method df.head() ………. Print the first 5 rows of the
dataframe
38. Consider the dataframe “df”, with categorical column “categories”. A Python list showing
the top 20 categories that appear the most, without their number of occurrences
39. Consider the following dataframe: the average price for each body style
40. You want to predict a field named CHURN ………. CHURN as target field; AGE,
GENDER, and HOUSEHOLD SIZE as input fields
41. An insurance company has a dataset in place storing information about claims. One field
in the dataset flags whether the claim was fraudulent or not ……….. A classification model
42. Kevin is the head of the spare-parts inventory warehouse. Association
43. Imagine you are solving a classification problem with highly imbalanced classes. The accuracy
metric is not a good idea for imbalanced class problems, and precision and recall metrics
are good for imbalanced class problems
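
The point of question 43 can be illustrated with a made-up example: 5 positives among 100 samples and a classifier that always predicts the negative class. The data, and the convention of reporting 0.0 when a ratio is undefined, are assumptions for this sketch.

```python
# Made-up labels: 5 positives, 95 negatives; the model predicts all negative.
y_true = [1] * 5 + [0] * 95
y_pred = [0] * 100

pairs = list(zip(y_true, y_pred))
tp = sum(1 for t, p in pairs if t == 1 and p == 1)
tn = sum(1 for t, p in pairs if t == 0 and p == 0)
fp = sum(1 for t, p in pairs if t == 0 and p == 1)
fn = sum(1 for t, p in pairs if t == 1 and p == 0)

accuracy = (tp + tn) / len(y_true)
# Report 0.0 when the denominator is zero (an assumed convention).
precision = tp / (tp + fp) if (tp + fp) else 0.0
recall = tp / (tp + fn) if (tp + fn) else 0.0

print(accuracy, precision, recall)
```

Accuracy looks high even though no positive case is ever found, while recall exposes the failure.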
44. When you have a very high bias model and you have tried many algorithms with their
parameters. Feature engineering
45. Imagine you run a binary classification using random forest. SHAP
46. Imagine you have many features with different value ranges, and you want to make
predictions using lasso linear regression. Data normalization
47. Which of the following sentences is TRUE about Decision Tree? It can easily overfit
as the tree goes deeper
48. Which of the following sentences is TRUE about Random Forest? Each tree has the
same amount of say in determining the final result
49. Decision tree has been regarded for simplicity and popularity in machine learning. Creating
the tree based on information gain
50. In the course of her holiday, she is very pumped to try having a coffee in the café as she never
had that experience before. Better interpretation compared with decision tree
51. Ensemble learning can work well on the condition that the requirement(s) below is/are fulfilled.
Independent models
52. Which of the following correctly describes the relationship between f's complexity and f's
bias and variance terms? As the complexity of f increases, the bias term decreases while
the variance term increases
53. XGBoost is a powerful library that scales very well to many samples and works for variety.
Predicting the likelihood that a given user will click an ad from a very large clickstream log
with millions of users and their web interactions
54. For many cases, an imbalanced dataset is nearly inevitable. Overfitting
55. An online retailer wants to identify groups of customers. Segmentation
56. A mysterious disease causing uncontrollable. 90%
57. Given the type node shown below, what specifications are necessary ….. Specify values
lower 1 and upper 15 and check action nullify

58. Which of the following statements is true about data science?

59. Which of the following statements is true about the difference between data science and data
analysis?
