After importing a Jupyter notebook and a CSV data file into an IBM Watson Studio project on IBM Public Cloud, you discover that the notebook code can no longer access the CSV file. What is the most likely reason for this problem?
A. CSV files cannot be used as data sources in Watson Studio.
B. The CSV file was converted to a binary blob and must be converted in the notebook code.
C. The CSV file is stored in Cloud Object Storage.
D. The CSV file is stored in a Watson Machine Learning instance and is only accessible via REST API.
Which is a preferred approach for simplifying the data transformation steps in machine learning model management and maintenance?
A. Implement data transformation, feature extraction, feature engineering, and imputation algorithms in a single pipeline.
B. Do not apply any data transformation or feature extraction or feature engineering steps.
C. Leverage only deep learning algorithms.
D. Apply a limited number of data transformation steps from a pre-defined catalog of possible operations independent of the machine learning use case.
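The idea of chaining imputation and transformation steps into one pipeline can be sketched in plain Python. This is a minimal illustration only: the classes MeanImputer, MinMaxScaler, and Pipeline are hypothetical names written for this sketch, loosely modeled on the fit/transform convention, and are not the scikit-learn or PySpark implementations.

```python
class MeanImputer:
    """Replace missing values (None) with the mean learned during fit."""

    def fit(self, column):
        known = [x for x in column if x is not None]
        self.mean_ = sum(known) / len(known)
        return self

    def transform(self, column):
        return [self.mean_ if x is None else x for x in column]


class MinMaxScaler:
    """Rescale values using the min and max learned during fit."""

    def fit(self, column):
        self.min_, self.max_ = min(column), max(column)
        return self

    def transform(self, column):
        span = (self.max_ - self.min_) or 1.0  # avoid division by zero
        return [(x - self.min_) / span for x in column]


class Pipeline:
    """Apply a fixed sequence of steps, each fitted on the previous output."""

    def __init__(self, steps):
        self.steps = steps

    def fit(self, column):
        for step in self.steps:
            column = step.fit(column).transform(column)
        return self

    def transform(self, column):
        for step in self.steps:
            column = step.transform(column)
        return column


# Illustrative data: one numeric feature with a missing value.
train = [2.0, None, 4.0, 6.0]
pipeline = Pipeline([MeanImputer(), MinMaxScaler()])
pipeline.fit(train)

train_out = pipeline.transform(train)
new_out = pipeline.transform([None, 8.0])  # reuses statistics from training
```

Because every step lives in one object, the same fitted statistics are applied to training data and to new data, which is what makes a single pipeline easier to maintain than separately managed transformation scripts.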
Which is a technique that automates the handling of categorical variables?
A. binary encoding
B. decoding
C. autoencoding
D. one-hot encoding
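For reference, one-hot encoding maps each category to a vector with a single 1. The following is a minimal sketch in plain Python; the function name one_hot_encode and the sample data are assumptions made for illustration.

```python
def one_hot_encode(values):
    """Return the category order and one 0/1 row per input value."""
    categories = sorted(set(values))  # fixed, reproducible column order
    index = {cat: i for i, cat in enumerate(categories)}
    encoded = []
    for value in values:
        row = [0] * len(categories)
        row[index[value]] = 1  # exactly one position is set per row
        encoded.append(row)
    return categories, encoded


colors = ["red", "green", "blue", "green"]
categories, encoded = one_hot_encode(colors)
```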
A neural network is trained for a classification task. During training, you monitor the loss function for the train dataset and the validation dataset, along with the accuracy for the validation dataset. The goal is to get an accuracy of 95%.
From the graph, what modification would be appropriate to improve the performance of the model?
A. increase the depth of the neural network
B. insert a dropout layer in the neural network architecture
C. increase the proportion of the train dataset by moving examples from the validation dataset to the train dataset
D. restart the training with a higher learning rate
When communicating technical results to business stakeholders, what are three appropriate topics to include? (Choose three.)
A. methods that failed
B. newest developments in AI methods
C. benefits of cognitive over business analytics
D. realistic impact on the business measures
E. differences between cloud provider portfolios
F. alternative methods to address the business problem
Which fine-tuning technique does not optimize the hyperparameters of a machine learning model?
A. grid search
B. population based training
C. random search
D. hyperband
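Two of the listed techniques, grid search and random search, can be sketched in a few lines. This is an illustration under stated assumptions: validation_loss is a hypothetical stand-in for training and scoring a model, and the parameter names and ranges are invented for the example.

```python
import itertools
import random


def validation_loss(learning_rate, depth):
    # Hypothetical objective standing in for "train a model and score it".
    return (learning_rate - 0.1) ** 2 + (depth - 4) ** 2


def grid_search(grid):
    """Evaluate every combination of the listed candidate values."""
    names = list(grid)
    best = None
    for combo in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, combo))
        loss = validation_loss(**params)
        if best is None or loss < best[0]:
            best = (loss, params)
    return best


def random_search(ranges, n_trials, seed=0):
    """Evaluate n_trials configurations sampled from the given ranges."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_trials):
        params = {
            "learning_rate": rng.uniform(*ranges["learning_rate"]),
            "depth": rng.randint(*ranges["depth"]),
        }
        loss = validation_loss(**params)
        if best is None or loss < best[0]:
            best = (loss, params)
    return best


grid_best = grid_search({"learning_rate": [0.01, 0.1, 1.0], "depth": [2, 4, 8]})
random_best = random_search(
    {"learning_rate": (0.001, 1.0), "depth": (2, 8)}, n_trials=20
)
```

Grid search is exhaustive over a fixed set of candidates, while random search spends a fixed budget of trials on sampled configurations.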
What are three elements that are typically part of a machine learning pipeline in scikit-learn or pyspark? (Choose three.)
A. model building
B. data preprocessing
C. model prediction
D. business understanding
E. use case selection
F. data exploration
What is the name of the design thinking work product that contains a summary description of a particular person or role?
A. persona
B. snapshot
C. My Sticky Note
D. user summary report
A data scientist is exploring transaction data from a chain of stores with several locations. The data includes store number, date of sale, and purchase amount. If the data scientist wants to compare total monthly sales between stores, which two options would be good ways to aggregate the data? (Choose two.)
A. Find the sum of the transaction prices
B. Select the largest transaction amount by month and store
C. Write a GROUP BY query
D. Plot a time series plot of transaction amounts
E. Generate a pivot table
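A GROUP BY query and a pivot table over this kind of data might look like the sketch below. It uses Python's built-in sqlite3 module; the table name sales, the column names, and the sample rows are assumptions made for illustration.

```python
import sqlite3
from collections import defaultdict

# Illustrative rows: (store number, date of sale, purchase amount).
transactions = [
    (1, "2023-01-05", 20.0),
    (1, "2023-01-20", 30.0),
    (2, "2023-01-11", 15.0),
    (1, "2023-02-02", 10.0),
    (2, "2023-02-14", 25.0),
    (2, "2023-02-28", 5.0),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (store INTEGER, sale_date TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", transactions)

# Total sales per store and month; the month is the 'YYYY-MM' date prefix.
monthly_totals = conn.execute(
    """
    SELECT store, substr(sale_date, 1, 7) AS month, SUM(amount) AS total
    FROM sales
    GROUP BY store, month
    ORDER BY store, month
    """
).fetchall()
conn.close()

# Pivot the grouped rows: one entry per month, one column per store.
pivot = defaultdict(dict)
for store, month, total in monthly_totals:
    pivot[month][store] = total
```

Both forms produce one total per store and month, which makes the stores directly comparable within each month.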
Given the following sentence:
The dog jumps over a fence.
What would a vectorized version after common English stopword removal look like?
A. ['dog', 'fence', 'run']
B. ['fence', 'jumps']
C. ['dog', 'fence', 'jumps']
D. ['a', 'dog', 'fence', 'jumps', 'over', 'the']
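Stopword removal on the sentence above can be sketched as follows. The STOPWORDS set is a small illustrative subset chosen for this example, not a complete or standard list, and the sorted output is an assumption about how the vocabulary is ordered.

```python
import re

# Small illustrative subset of English stopwords (an assumption, not a
# standard list); real toolkits ship much longer lists.
STOPWORDS = {"a", "an", "the", "over", "and", "of", "to", "in"}


def tokenize_without_stopwords(sentence):
    """Lowercase, split into alphabetic tokens, drop stopwords, and sort."""
    tokens = re.findall(r"[a-z]+", sentence.lower())
    return sorted(token for token in tokens if token not in STOPWORDS)


tokens = tokenize_without_stopwords("The dog jumps over a fence.")
```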