Data Science Interview Questions and Answers

Basic Python Data Science Interview Questions

Q. What is Hybrid SCD?

A. A hybrid SCD (slowly changing dimension) combines SCD Type 1 and SCD Type 2 in a single table.
In a given table, some columns may be important enough that we need to track their changes, i.e., keep their historical values, whereas for other columns we do not need to keep the old value when the data changes.
For such tables we implement a hybrid SCD, in which some columns are treated as Type 1 (overwritten in place) and others as Type 2 (a new row version is added to preserve history).
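The idea can be sketched in plain Python. This is only an illustration under stated assumptions: the customer dimension, the column names (email as Type 1, city as Type 2), the valid_from / valid_to / is_current bookkeeping columns, and the apply_change helper are all hypothetical, and overwriting Type 1 values on every row version is one possible design choice among several.

```python
from datetime import date

# Hypothetical column classification for a customer dimension.
TYPE1_COLUMNS = {"email"}  # overwritten in place, no history kept
TYPE2_COLUMNS = {"city"}   # history kept by adding a new row version


def apply_change(rows, customer_id, changes, effective):
    """Apply `changes` (column -> new value) to one customer's dimension rows."""
    current = next(
        r for r in rows if r["customer_id"] == customer_id and r["is_current"]
    )

    # Type 1: overwrite the value on every version of this customer's row.
    for col, value in changes.items():
        if col in TYPE1_COLUMNS:
            for r in rows:
                if r["customer_id"] == customer_id:
                    r[col] = value

    # Type 2: if a tracked column changed, close the current version
    # and append a new current version that carries the new value.
    type2_changes = {
        col: value
        for col, value in changes.items()
        if col in TYPE2_COLUMNS and current[col] != value
    }
    if type2_changes:
        new_row = {**current, **type2_changes,
                   "valid_from": effective, "valid_to": None, "is_current": True}
        current["valid_to"] = effective
        current["is_current"] = False
        rows.append(new_row)
    return rows


dim = [{"customer_id": 1, "email": "a@old.example", "city": "Pune",
        "valid_from": date(2020, 1, 1), "valid_to": None, "is_current": True}]

apply_change(dim, 1, {"email": "a@new.example"}, date(2021, 1, 1))  # Type 1 only
apply_change(dim, 1, {"city": "Mumbai"}, date(2022, 1, 1))          # Type 2
```

After these two calls the table holds two rows for the customer: the old city survives as a closed historical row, while the old email address is gone from both rows.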

Q. Why are aliases used in import statements?

A. Aliases are used in import statements for ease of use. Suppose the imported module has a long name, for example import multiprocessing. Every time we want to access a function or class in the multiprocessing module, we need to type the word multiprocessing.
However, if an alias is used, as in import multiprocessing as mp, we can simply write mp in place of multiprocessing.

Q. Are the aliases used for a module fixed/static?
A. No, aliases are not fixed. An alias can be any valid name that suits you. However, the documentation of a module often recommends a conventional alias, such as np for numpy or pd for pandas, and following it makes code easier for others to read.
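A short sketch of both points, using the standard-library multiprocessing module. The alias mproc is an arbitrary name chosen here only to show that any valid identifier works.

```python
import multiprocessing
import multiprocessing as mp     # common short alias
import multiprocessing as mproc  # arbitrary alias, equally valid

# All three names are bound to the same module object.
same_module = multiprocessing is mp is mproc

# The alias only shortens what we type; the call itself is identical.
cores_long = multiprocessing.cpu_count()
cores_short = mp.cpu_count()
```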

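Q. How do you access a specific function or class inside a module?
A. To import only a specific name, use the form from module import name, for example from pandas import DataFrame. If everything in the module needs to be imported, we can simply use from pandas import *, although wildcard imports are generally discouraged because they make it unclear where a name comes from.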
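The import forms can be compared side by side. This sketch uses the standard-library math module instead of pandas so that it runs without installing anything; the alias fact is an arbitrary name.

```python
import math                         # whole module; names are used as math.<name>
from math import sqrt, pi           # only specific names; used without a prefix
from math import factorial as fact  # a specific name, given an alias

root = sqrt(16)      # same function as math.sqrt
area = pi * 2 ** 2   # area of a circle with radius 2
perms = fact(5)      # 5! computed through the aliased name
```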

Q. What is a nonparametric test used for?
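A. Nonparametric tests do not assume that the data follow a specific distribution. They can be used whenever the data do not meet the assumptions of a parametric test, for example using the Mann-Whitney U test in place of a two-sample t-test when the data are not approximately normal.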
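As an illustration, here is a hand-rolled sketch of one nonparametric test, the Mann-Whitney U test, with an exact two-sided p-value obtained by enumerating every way of splitting the pooled data into two groups. The function names and the sample values are made up for the example, and the enumeration is only practical for very small samples; in practice one would normally use a library routine such as scipy.stats.mannwhitneyu.

```python
from itertools import combinations


def u_statistic(x, y):
    """Count pairs where a value in x exceeds a value in y (ties count half)."""
    return sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)


def exact_p_value(x, y):
    """Two-sided exact p-value from all relabellings of the pooled sample."""
    pooled = list(x) + list(y)
    n, m = len(x), len(y)
    center = n * m / 2  # expected U when both groups share one distribution
    observed = abs(u_statistic(x, y) - center)
    extreme = total = 0
    for idx in combinations(range(len(pooled)), n):
        chosen = set(idx)
        group_x = [pooled[i] for i in idx]
        group_y = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        if abs(u_statistic(group_x, group_y) - center) >= observed:
            extreme += 1
    return extreme / total


# Made-up samples; the outlier 30.0 would distort a mean-based test.
group_a = [1.1, 2.3, 2.9, 3.4]
group_b = [4.8, 5.6, 6.1, 30.0]

u = u_statistic(group_a, group_b)
p = exact_p_value(group_a, group_b)
```

Because the test only uses the ordering of the values, replacing 30.0 with 7.0 would leave both u and p unchanged.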

Q. What are the pros and cons of the decision tree algorithm?
A. Pros: Easy to interpret. Tends to ignore irrelevant independent variables, since their information gain is minimal and they are rarely chosen for a split. Some implementations can handle missing data. Fast to train.
Cons: The number of possible trees is very large, so the tree is built greedily, one split at a time. As a result, the algorithm may not find the best possible tree.
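The point about irrelevant variables can be made concrete with a small sketch of entropy-based information gain. The toy labels and feature values below are invented for the illustration, and the helper names are hypothetical rather than taken from any library.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def information_gain(feature, labels):
    """Entropy of the labels minus the weighted entropy after splitting on feature."""
    total = len(labels)
    remainder = 0.0
    for value in set(feature):
        subset = [lab for f, lab in zip(feature, labels) if f == value]
        remainder += len(subset) / total * entropy(subset)
    return entropy(labels) - remainder


labels = ["yes", "yes", "no", "no"]
relevant = ["sunny", "sunny", "rainy", "rainy"]  # separates the labels perfectly
irrelevant = ["a", "b", "a", "b"]                # unrelated to the labels

gain_relevant = information_gain(relevant, labels)
gain_irrelevant = information_gain(irrelevant, labels)
```

Here the relevant feature has the maximum possible gain of one bit and the irrelevant one has a gain of zero, so a tree that splits on the highest-gain feature would never pick the irrelevant column.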

