Assuming a cluster running HDFS and MapReduce version 2 (MRv2) on YARN with all settings at their defaults, what do you need to do when adding a new slave node to the cluster?
A. Nothing, other than ensuring that DNS (or /etc/hosts files on all machines) contains an entry for the new node.
B. Restart the NameNode and ResourceManager daemons and resubmit any running jobs
C. Increase the value of dfs.number.of.nodes in hdfs-site.xml
D. Add a new entry to /etc/nodes on the NameNode host.
E. Restart the NameNode daemon.
For each YARN job, the Hadoop framework generates task log files. Where are Hadoop's task log files stored?
A. In HDFS, in the directory of the user who generates the job
B. On the local disk of the slave node running the task
C. Cached in the YARN container running the task, then copied into HDFS on job completion
D. Cached by the NodeManager managing the job containers, then written to a log directory on the NameNode
You have a Hadoop cluster running HDFS, and a gateway machine external to the cluster from which clients submit jobs. What do you need to do in order to run Impala on the cluster and submit jobs from the command line of the gateway machine?
A. Install the impalad daemon, statestored daemon, and catalogd daemon on each machine in the cluster and on the gateway node
B. Install the impalad daemon on each machine in the cluster, the statestored daemon and catalogd daemon on one machine in the cluster, and the impala shell on your gateway machine
C. Install the impalad daemon and the impala shell on your gateway machine, and the statestored daemon and catalogd daemon on one of the nodes in the cluster
D. Install the impalad daemon, the statestored daemon, the catalogd daemon, and the impala shell on your gateway machine
E. Install the impalad daemon, statestored daemon, and catalogd daemon on each machine in the cluster, and the impala shell on your gateway machine
Which YARN daemon or service negotiates map and reduce Containers from the Scheduler, tracking their status and monitoring for progress?
A. ResourceManager
B. ApplicationMaster
C. NodeManager
D. ApplicationManager
Your Hadoop cluster is configured with HDFS and MapReduce version 2 (MRv2) on YARN. Can you configure a worker node to run a NodeManager daemon but not a DataNode daemon and still have a functional cluster?
A. Yes. The daemon will receive data from the NameNode to run Map tasks
B. Yes. The daemon will get data from another (non-local) DataNode to run Map tasks
C. Yes. The daemon will receive Reduce tasks only
You want a node to only swap Hadoop daemon data from RAM to disk when absolutely necessary. What should you do?
A. Delete the /swapfile file on the node
B. Set vm.swappiness to 0 in /etc/sysctl.conf
C. Set the ram.swap parameter to 0 in core-site.xml
D. Delete the /etc/swap file on the node
E. Delete the /dev/vmswap file on the node
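
As background for the options above: vm.swappiness is a Linux kernel parameter, and lower values make the kernel less eager to move process memory to swap (the exact behavior at very low values depends on the kernel version). The following is a minimal sketch, not a tuning recommendation, of how a sysctl.conf-style file could be checked for that setting. The helper name `read_sysctl_value` and the sample text are hypothetical and used only for illustration.

```python
def read_sysctl_value(conf_text, key):
    """Return the last value assigned to `key` in sysctl.conf-style text, or None."""
    value = None
    for raw_line in conf_text.splitlines():
        line = raw_line.strip()
        # Blank lines and lines starting with '#' or ';' are comments.
        if not line or line[0] in "#;":
            continue
        name, sep, rest = line.partition("=")
        if sep and name.strip() == key:
            # A later assignment overrides an earlier one.
            value = rest.strip()
    return value


# Hypothetical excerpt; a real check would read /etc/sysctl.conf instead.
sample = """
# keep Hadoop daemon memory resident where possible
vm.swappiness = 0
"""

print(read_sysctl_value(sample, "vm.swappiness"))
```

On a live system the value currently in effect can also be read from /proc/sys/vm/swappiness.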
Your Hadoop cluster contains nodes in three racks. You have NOT configured the dfs.hosts property in the NameNode's configuration file. What results?
A. No new nodes can be added to the cluster until you specify them in the dfs.hosts file
B. Presented with a blank dfs.hosts property, the NameNode will permit DataNodes specified in mapred.hosts to join the cluster
C. Any machine running the DataNode daemon can immediately join the cluster
D. The NameNode will update the dfs.hosts property to include machines running the DataNode daemon on the next NameNode reboot or with the command dfsadmin -refreshNodes
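
As background for this question: the dfs.hosts property names an include file, and the HDFS default configuration documents that when the value is empty, all hosts are permitted. The sketch below is a deliberately simplified model of that one rule. It ignores dfs.hosts.exclude and decommissioning, and the function name `may_register` and the host names are hypothetical.

```python
def may_register(host, include_hosts):
    """Simplified model of the dfs.hosts include list.

    An empty (or unset) include list permits every host; otherwise only
    hosts named in the list are permitted. This is an illustration, not
    the NameNode's actual code path.
    """
    if not include_hosts:
        return True
    return host in include_hosts


# Hypothetical host names for illustration only.
print(may_register("worker42.example.com", set()))
print(may_register("worker42.example.com", {"worker01.example.com"}))
```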
Identify two features/issues that YARN is designed to address:
A. Standardize on a single MapReduce API
B. Single point of failure in the NameNode
C. Reduce complexity of the MapReduce APIs
D. Resource pressures on the JobTracker
E. Ability to run frameworks other than MapReduce, such as MPI
F. HDFS latency
A slave node in your cluster has four 2TB hard drives installed (4 x 2TB). The DataNode is configured to store HDFS blocks on the disks. You set the value of the dfs.datanode.du.reserved parameter to 100GB. How does this alter HDFS block storage?
A. A maximum of 100 GB on each hard drive may be used to store HDFS blocks
B. All hard drives may be used to store HDFS blocks as long as at least 100 GB in total is available on the node
C. 100 GB on each hard drive may not be used to store HDFS blocks
D. 25 GB on each hard drive may not be used to store HDFS blocks
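
As background for this question: the HDFS default configuration describes dfs.datanode.du.reserved as reserved space in bytes per volume, left free for non-DFS use. The arithmetic for the node in the question can be sketched as follows, assuming decimal units (1 TB = 10^12 bytes) and ignoring filesystem overhead. The helper name `usable_hdfs_bytes` is hypothetical.

```python
TB = 10**12  # decimal terabyte, an assumption for this sketch
GB = 10**9   # decimal gigabyte


def usable_hdfs_bytes(volume_sizes, reserved_per_volume):
    """Sum the space left for HDFS blocks after reserving space on each volume."""
    return sum(max(size - reserved_per_volume, 0) for size in volume_sizes)


volumes = [2 * TB] * 4            # four 2 TB drives
reserved = 100 * GB               # dfs.datanode.du.reserved, applied per volume
usable = usable_hdfs_bytes(volumes, reserved)

print(f"reserved in total: {len(volumes) * reserved // GB} GB")
print(f"usable for HDFS:   {usable / TB} TB")
```

Under these assumptions, each drive keeps 100 GB free, so 400 GB of the node's 8 TB is held back in total.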
Which process instantiates user code, and executes map and reduce tasks on a cluster running MapReduce V2 (MRv2) on YARN?
A. NodeManager
B. ApplicationMaster
C. ResourceManager
D. TaskTracker
E. JobTracker
F. DataNode
G. NameNode