Which of the elements that are labeled with a circle and a number contain an error or are misrepresented?
A. 1, 10
B. 1, 8
C. 10
D. 7, 9, 10
E. 1, 4, 6, 9
Which of the following code blocks returns a DataFrame with an added column to DataFrame transactionsDf that shows the unix epoch timestamps in column transactionDate as strings in the format month/day/year in column transactionDateFormatted?
Excerpt of DataFrame transactionsDf:
A. transactionsDf.withColumn("transactionDateFormatted", from_unixtime("transactionDate", format="dd/MM/yyyy"))
B. transactionsDf.withColumnRenamed("transactionDate", "transactionDateFormatted", from_unixtime("transactionDateFormatted", format="MM/dd/yyyy"))
C. transactionsDf.apply(from_unixtime(format="MM/dd/yyyy")).asColumn("transactionDateFormatted")
D. transactionsDf.withColumn("transactionDateFormatted", from_unixtime("transactionDate", format="MM/dd/yyyy"))
E. transactionsDf.withColumn("transactionDateFormatted", from_unixtime("transactionDate"))
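For reference, a minimal sketch of how from_unixtime renders unix epoch timestamps as formatted strings via withColumn, reusing the column names from the question; the pattern shown is simply how Spark spells a month/day/year format:
from pyspark.sql.functions import from_unixtime
# Add a string column with the epoch timestamps in transactionDate
# rendered as month/day/year.
transactionsDf.withColumn("transactionDateFormatted",
                          from_unixtime("transactionDate", format="MM/dd/yyyy"))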
Which of the following code blocks applies the boolean-returning Python function evaluateTestSuccess to column storeId of DataFrame transactionsDf as a user-defined function?
A. from pyspark.sql import types as T
   evaluateTestSuccessUDF = udf(evaluateTestSuccess, T.BooleanType())
   transactionsDf.withColumn("result", evaluateTestSuccessUDF(col("storeId")))
B. evaluateTestSuccessUDF = udf(evaluateTestSuccess)
   transactionsDf.withColumn("result", evaluateTestSuccessUDF(storeId))
C. from pyspark.sql import types as T
   evaluateTestSuccessUDF = udf(evaluateTestSuccess, T.IntegerType())
   transactionsDf.withColumn("result", evaluateTestSuccess(col("storeId")))
D. evaluateTestSuccessUDF = udf(evaluateTestSuccess)
   transactionsDf.withColumn("result", evaluateTestSuccessUDF(col("storeId")))
E. from pyspark.sql import types as T
   evaluateTestSuccessUDF = udf(evaluateTestSuccess, T.BooleanType())
   transactionsDf.withColumn("result", evaluateTestSuccess(col("storeId")))
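For reference, a minimal sketch of wrapping a Python function as a UDF with an explicit return type and applying it to a column; evaluateTestSuccess is assumed to be a boolean-returning Python function defined elsewhere, as in the question:
from pyspark.sql import types as T
from pyspark.sql.functions import udf, col
# Wrap the Python function as a UDF, declaring its boolean return type,
# then apply it to column storeId and attach the result as a new column.
evaluateTestSuccessUDF = udf(evaluateTestSuccess, T.BooleanType())
transactionsDf.withColumn("result", evaluateTestSuccessUDF(col("storeId")))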
The code block shown below should read all files with the file ending .png in directory path into Spark. Choose the answer that correctly fills the blanks in the code block to accomplish this.
spark.__1__.__2__(__3__).option(__4__, "*.png").__5__(path)
A. 1. read()  2. format  3. "binaryFile"  4. "recursiveFileLookup"  5. load
B. 1. read  2. format  3. "binaryFile"  4. "pathGlobFilter"  5. load
C. 1. read  2. format  3. binaryFile  4. pathGlobFilter  5. load
D. 1. open  2. format  3. "image"  4. "fileType"  5. open
E. 1. open  2. as  3. "binaryFile"  4. "pathGlobFilter"  5. load
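For reference, a minimal sketch of reading binary files with a glob filter; path is assumed to be a string variable naming the directory, as in the question:
# Read only files whose names match *.png, returning one row per file
# with its content as binary data.
spark.read.format("binaryFile").option("pathGlobFilter", "*.png").load(path)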
Which of the following describes properties of a shuffle?
A. Operations involving shuffles are never evaluated lazily.
B. Shuffles involve only single partitions.
C. Shuffles belong to a class known as "full transformations".
D. A shuffle is one of many actions in Spark.
E. In a shuffle, Spark writes data to disk.
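To make the term concrete, a small sketch of a wide transformation that causes a shuffle; storeId is reused from the DataFrames in the surrounding questions:
# groupBy is a wide transformation: rows with the same storeId must end up in
# the same partition, so Spark shuffles data across the cluster, writing
# intermediate shuffle files to local disk. Like other transformations it is
# evaluated lazily; nothing runs until an action such as show() is called.
transactionsDf.groupBy("storeId").count().show()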
The code block displayed below contains an error. The code block should trigger Spark to cache DataFrame transactionsDf in executor memory where available, writing to disk where executor memory is insufficient, in a fault-tolerant way. Find the error.
Code block:
transactionsDf.persist(StorageLevel.MEMORY_AND_DISK)
A. Caching is not supported in Spark, data are always recomputed.
B. Data caching capabilities can be accessed through the spark object, but not through the DataFrame API.
C. The storage level is inappropriate for fault-tolerant storage.
D. The code block uses the wrong operator for caching.
E. The DataFrameWriter needs to be invoked.
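For reference, a minimal sketch of the persist API with an explicit storage level; the import is an assumption not shown in the question's code block:
from pyspark import StorageLevel
# persist is a DataFrame method and is evaluated lazily; MEMORY_AND_DISK keeps
# partitions in executor memory and spills to disk when memory is insufficient.
transactionsDf.persist(StorageLevel.MEMORY_AND_DISK)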
The code block displayed below contains an error. The code block should return a DataFrame in which column predErrorAdded contains the results of Python function add_2_if_geq_3 as applied to numeric and nullable column predError in DataFrame transactionsDf.
Find the error.
Code block:
def add_2_if_geq_3(x):
    if x is None:
        return x
    elif x >= 3:
        return x+2
    return x

add_2_if_geq_3_udf = udf(add_2_if_geq_3)

transactionsDf.withColumnRenamed("predErrorAdded", add_2_if_geq_3_udf(col("predError")))
A. The operator used for adding the column does not add column predErrorAdded to the DataFrame.
B. Instead of col("predError"), the actual DataFrame with the column needs to be passed, like so transactionsDf.predError.
C. The udf() method does not declare a return type.
D. UDFs are only available through the SQL API, but not in the Python API as shown in the code block.
E. The Python function is unable to handle null values, resulting in the code block crashing on execution.
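For reference, a minimal sketch of how a UDF result is typically attached as a new column, reusing the function from the question's code block; udf() without an explicit return type defaults to a string return type:
from pyspark.sql.functions import udf, col
# Wrap the Python function as a UDF and attach its result as a new column.
add_2_if_geq_3_udf = udf(add_2_if_geq_3)
transactionsDf.withColumn("predErrorAdded", add_2_if_geq_3_udf(col("predError")))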
The code block displayed below contains an error. The code block should return all rows of DataFrame transactionsDf, but including only columns storeId and predError. Find the error.
Code block:
spark.collect(transactionsDf.select("storeId", "predError"))
A. Instead of select, DataFrame transactionsDf needs to be filtered using the filter operator.
B. Columns storeId and predError need to be represented as a Python list, so they need to be wrapped in brackets ([]).
C. The take method should be used instead of the collect method.
D. Instead of collect, collectAsRows needs to be called.
E. The collect method is not a method of the SparkSession object.
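For reference, collect is a method of the DataFrame itself, so a minimal sketch of selecting the two columns and pulling the rows to the driver reads:
# select narrows the DataFrame to the two columns; collect() returns the
# resulting rows to the driver as a list of Row objects.
transactionsDf.select("storeId", "predError").collect()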
Which of the following code blocks returns a DataFrame with approximately 1,000 rows from the 10,000-row DataFrame itemsDf, without any duplicates, returning the same rows even if the code block is run twice?
A. itemsDf.sampleBy("row", fractions={0: 0.1}, seed=82371)
B. itemsDf.sample(fraction=0.1, seed=87238)
C. itemsDf.sample(fraction=1000, seed=98263)
D. itemsDf.sample(withReplacement=True, fraction=0.1, seed=23536)
E. itemsDf.sample(fraction=0.1)
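For reference, a minimal sketch of DataFrame.sample; fraction gives the approximate share of rows to keep, sampling is without replacement by default, and a fixed seed makes the result reproducible across runs:
# Keep roughly 10% of the rows, reproducibly, because a seed is supplied.
itemsDf.sample(fraction=0.1, seed=87238)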
In which order should the code blocks shown below be run in order to return the number of records that are not empty in column value in the DataFrame resulting from an inner join of DataFrame transactionsDf and itemsDf on columns productId and itemId, respectively?
1. .filter(~isnull(col('value')))
2. .count()
3. transactionsDf.join(itemsDf, col("transactionsDf.productId")==col("itemsDf.itemId"))
4. transactionsDf.join(itemsDf, transactionsDf.productId==itemsDf.itemId, how='inner')
5. .filter(col('value').isnotnull())
6. .sum(col('value'))
A. 4, 1, 2
B. 3, 1, 6
C. 3, 1, 2
D. 3, 5, 2
E. 4, 6
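For reference, a sketch of what such a chain looks like when composed end to end; isnull is imported from pyspark.sql.functions:
from pyspark.sql.functions import col, isnull
# Inner join on the key columns, keep rows whose value is not null, count them.
(transactionsDf
    .join(itemsDf, transactionsDf.productId == itemsDf.itemId, how='inner')
    .filter(~isnull(col('value')))
    .count())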