Want to pass the Databricks Certified Professional Data Scientist exam (DATABRICKS-CERTIFIED-PROFESSIONAL-DATA-SCIENTIST) on your first attempt? Try Pass2lead! It is equally effective for beginners and IT professionals.
You have collected hundreds of parameters about thousands of websites, e.g. daily hits, average time on the website, number of unique visitors, number of returning visitors, etc. Now you have to find the most important parameters that best describe a website. Which of the following techniques would you use?
A. PCA (Principal component analysis)
B. Linear Regression
C. Logistic Regression
D. Clustering
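To make option A concrete, here is a minimal sketch of how PCA summarizes many correlated website metrics in a few components. The data is synthetic and hypothetical (four made-up metrics, three of them driven by one latent "traffic" factor), and the helper name `pca_explained_variance` is an assumption for illustration, not part of any library.

```python
import numpy as np

def pca_explained_variance(X):
    """Return (explained_variance_ratio, components) for X (rows = samples)."""
    # Standardize each column so metrics on different scales are comparable.
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    # SVD of the standardized data; rows of Vt are the principal directions.
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    variances = s ** 2 / (len(Z) - 1)
    return variances / variances.sum(), Vt

rng = np.random.default_rng(0)
n_sites = 1000
# Hypothetical data: one latent "traffic" factor drives three of four metrics.
traffic = rng.normal(size=n_sites)
X = np.column_stack([
    1000 * traffic + rng.normal(scale=50, size=n_sites),  # daily hits
    300 * traffic + rng.normal(scale=20, size=n_sites),   # unique visitors
    120 * traffic + rng.normal(scale=10, size=n_sites),   # returning visitors
    rng.normal(loc=90, scale=15, size=n_sites),           # avg. time on site, independent
])

ratio, components = pca_explained_variance(X)
print("explained variance ratio:", np.round(ratio, 3))
print("loadings of first component:", np.round(components[0], 2))
```

In this sketch the first component absorbs most of the variance because three metrics move together. Note that PCA returns combinations of the original parameters rather than a ranked subset of them, so the loadings still need interpretation.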
You are using k-means clustering to classify heart patients for a hospital. You have chosen Patient Sex, Height, Weight, Age and Income as measures and have used 3 clusters. When you create a pair-wise plot of the clusters, you notice that there is significant overlap between the clusters. What should you do?
A. Identify additional measures to add to the analysis
B. Remove one of the measures
C. Decrease the number of clusters
D. Increase the number of clusters
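The overlap seen in a pair-wise plot can also be quantified. The sketch below, which does not assert which option the exam expects, runs a plain k-means (Lloyd's algorithm) for several cluster counts and reports the mean silhouette coefficient, where values near 1 indicate well-separated clusters and values near 0 indicate overlap. The data is synthetic (two hypothetical 2-D blobs, not patient data), and the function names `kmeans` and `mean_silhouette` are illustrative assumptions.

```python
import numpy as np

def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's algorithm; returns one cluster label per row of X."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign each point to its nearest center.
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each center to the mean of its points (keep it if empty).
        centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
    return labels

def mean_silhouette(X, labels):
    """Mean silhouette coefficient over all points."""
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    ids = np.unique(labels)
    scores = []
    for i in range(len(X)):
        own = labels == labels[i]
        if own.sum() == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # Mean distance to the other members of the same cluster (D[i, i] is 0).
        a = D[i, own].sum() / (own.sum() - 1)
        # Smallest mean distance to the members of any other cluster.
        b = min(D[i, labels == c].mean() for c in ids if c != labels[i])
        scores.append((b - a) / max(a, b))
    return float(np.mean(scores))

rng = np.random.default_rng(42)
# Hypothetical data with two real groups.
X = np.vstack([
    rng.normal(loc=0.0, scale=1.0, size=(150, 2)),
    rng.normal(loc=8.0, scale=1.0, size=(150, 2)),
])

scores = {k: mean_silhouette(X, kmeans(X, k)) for k in (2, 3, 4)}
for k, s in scores.items():
    print(f"k={k}: mean silhouette = {s:.3f}")
```

On data like this, asking for more clusters than the data contains splits a natural group and lowers the score, which is one way overlapping clusters can arise.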
Which statements correctly apply to unsupervised learning?
A. It does not have a target variable
B. Instead of telling the machine "Predict Y for our data X", we're asking "What can you tell me about X?"
C. It tells the machine "Predict Y for our data X"
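The contrast between "Predict Y for our data X" and "What can you tell me about X?" shows up directly in what a method takes as input. A minimal sketch on synthetic, hypothetical data: the least-squares fit needs a target `y`, while the covariance analysis (the starting point for PCA) uses `X` alone.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2))
# Hypothetical target, used only by the supervised step.
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=200)

# Supervised: "Predict Y for our data X". The fit requires the target y.
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

# Unsupervised: "What can you tell me about X?". Only X is used:
# eigen-decompose its covariance to find the directions of greatest variance.
cov = np.cov(X, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)

print("fitted coefficients:", np.round(coef, 2))
print("variance along principal axes:", np.round(eigvals[::-1], 2))
```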