There are 20 patients with acute lymphoblastic leukemia (ALL) and 32 patients with acute myeloid leukemia (AML), both variants of a blood cancer.
The makeup of the groups is as follows:
Each individual has an expression value for each of 10,000 different genes. The expression value for each gene is a continuous value between -1 and 1.
With which type of plot can you encode the most data visually?
You choose to perform agglomerative hierarchical clustering on the 10,000 features. How much RAM do you need to hold the distance matrix, assuming each distance value is a 64-bit double?
A. ~ 800 MB
B. ~ 400 MB
C. ~ 160 KB
D. ~ 4 MB
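The arithmetic behind this question can be sketched in a few lines of Python. This is a rough illustration, not an official solution: the helper name distance_matrix_bytes is hypothetical, MB is taken to mean 10^6 bytes, and the question does not say whether the full square matrix or only the pairwise (upper-triangle) distances are stored, so both are computed.

```python
def distance_matrix_bytes(n_items: int, bytes_per_value: int = 8,
                          condensed: bool = False) -> int:
    """Bytes needed to store pairwise distances between n_items items.

    Hypothetical helper for illustration only.
    condensed=False: full n x n matrix.
    condensed=True:  only the n * (n - 1) / 2 distinct pairs.
    """
    if condensed:
        return n_items * (n_items - 1) // 2 * bytes_per_value
    return n_items * n_items * bytes_per_value


n_features = 10_000  # number of items being clustered, from the question
full = distance_matrix_bytes(n_features)                     # 64-bit double = 8 bytes
condensed = distance_matrix_bytes(n_features, condensed=True)

print(f"full matrix:      {full / 1e6:.0f} MB")       # about 800 MB
print(f"condensed matrix: {condensed / 1e6:.0f} MB")  # about 400 MB
```

A full 10,000 x 10,000 matrix of doubles takes about 800 MB. Because distances are symmetric, storing each pair once roughly halves that.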
You have a large file of N records (one per line) and want to randomly sample 10% of them. You have two
functions that are perfect random number generators (though they are a bit slow):
random_uniform() generates a uniformly distributed number in the interval [0, 1].
random_permutation(M) generates a random permutation of the numbers 0 through M-1.
Below are three different functions that implement the sampling.
Method A
for line in file:
    if random_uniform() < 0.1:
        print(line)
Method B
i = 0
for line in file:
    if i % 10 == 0:
        print(line)
    i += 1
Method C
idxs = random_permutation(N)[:(N // 10)]
i = 0
for line in file:
    if i in idxs:
        print(line)
    i += 1
Which method might introduce unexpected correlations?
A. Method A
B. Method B
C. Method C
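To see how the three methods behave, here is a minimal runnable sketch. It makes several assumptions: Python's standard random module stands in for the question's two generators, a list of strings stands in for the file, and the synthetic records carry a made-up "slot" field that repeats every 10 lines to mimic periodic structure in the data.

```python
import random


def method_a(lines, rng):
    # Keep each line independently with probability 0.1.
    return [line for line in lines if rng.random() < 0.1]


def method_b(lines):
    # Keep every 10th line; no randomness is involved.
    return [line for i, line in enumerate(lines) if i % 10 == 0]


def method_c(lines, rng):
    # Keep exactly N // 10 lines chosen uniformly without replacement.
    n = len(lines)
    idxs = set(rng.sample(range(n), n // 10))  # set gives fast membership tests
    return [line for i, line in enumerate(lines) if i in idxs]


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical data: the "slot" value cycles with period 10.
lines = [f"record={i} slot={i % 10}" for i in range(1000)]

sample_a = method_a(lines, rng)
sample_b = method_b(lines)
sample_c = method_c(lines, rng)

# Collect which slot values survive under Method B.
slots_b = {line.split("slot=")[1] for line in sample_b}
print(len(sample_a), len(sample_b), len(sample_c))
print(slots_b)
```

Method B always selects lines 0, 10, 20, and so on, so any pattern in the file with a period related to 10 carries straight into the sample; in this synthetic data every selected record has the same slot value. Method A yields a sample whose size varies around 10%, while Method C yields exactly N // 10 lines.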
Assuming the trends shown in this chart continue, what would we expect the value of the revenue to be in Q1 of 2013?
A. $125,000
B. $170,000
C. $220,000
D. $250,000