You are configuring your cluster to run HDFS and MapReduce v2 (MRv2) on YARN. Which two daemons need to be installed on your cluster's master nodes? (Choose two)
A. HMaster
B. ResourceManager
C. TaskManager
D. JobTracker
E. NameNode
F. DataNode
Which three basic configuration parameters must you set to migrate your cluster from MapReduce 1 (MRv1) to MapReduce V2 (MRv2)? (Choose three)
A. Configure the NodeManager to enable MapReduce services on YARN by setting the following property in yarn-site.xml:
B. Configure the NodeManager hostname and enable node services on YARN by setting the following property in yarn-site.xml:
C. Configure a default scheduler to run on YARN by setting the following property in mapred-site.xml:
D. Configure the number of map tasks per job on YARN by setting the following property in mapred:
E. Configure the ResourceManager hostname and enable node services on YARN by setting the following property in yarn-site.xml:
F. Configure MapReduce as a framework running on YARN by setting the following property in mapred-site.xml:
Which YARN process runs as "container 0" of a submitted job and is responsible for resource requests?
A. ApplicationManager
B. JobTracker
C. ApplicationMaster
D. JobHistoryServer
E. ResourceManager
F. NodeManager
Which scheduler would you deploy to ensure that your cluster allows short jobs to finish within a reasonable time without starving long-running jobs?
A. Completely Fair Scheduler (CFS)
B. Capacity Scheduler
C. Fair Scheduler
D. FIFO Scheduler
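The trade-off this question describes can be illustrated with a toy model. The sketch below is not YARN scheduler code: it assumes a cluster that processes one unit of work per time unit, jobs that all arrive at time 0, and hypothetical job names and sizes. It only contrasts strict submission order with equal sharing of capacity among unfinished jobs.

```python
def fifo_finish_times(jobs):
    """Run jobs one after another in submission (dict insertion) order."""
    finish, clock = {}, 0
    for name, work in jobs.items():
        clock += work
        finish[name] = clock
    return finish


def fair_share_finish_times(jobs):
    """Split capacity equally among all unfinished jobs (all start at t=0)."""
    finish, clock, done_work = {}, 0, 0
    ordered = sorted(jobs.items(), key=lambda item: item[1])
    for index, (name, work) in enumerate(ordered):
        active = len(ordered) - index          # jobs still running
        clock += (work - done_work) * active   # each runs at 1/active speed
        done_work = work
        finish[name] = clock
    return finish


# Hypothetical workload: a long job submitted just before a short one.
jobs = {"long_job": 100, "short_job": 10}
fifo = fifo_finish_times(jobs)
fair = fair_share_finish_times(jobs)
print("FIFO:", fifo)
print("Equal sharing:", fair)
```

In this model the short job finishes at time 110 under strict ordering and at time 20 under equal sharing, while the long job finishes at 100 and 110 respectively, so the long job is delayed slightly but never blocked.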
Your cluster is configured with HDFS and MapReduce version 2 (MRv2) on YARN. What is the result when you execute: hadoop jar SampleJar MyClass on a client machine?
A. SampleJar.jar is sent to the ApplicationMaster which allocates a container for SampleJar.jar
B. SampleJar.jar is placed in a temporary directory in HDFS
C. SampleJar.jar is sent directly to the ResourceManager
D. SampleJar.jar is serialized into an XML file which is submitted to the ApplicationMaster
During the execution of a MapReduce v2 (MRv2) job on YARN, where does the Mapper place the intermediate data of each Map Task?
A. The Mapper stores the intermediate data on the node running the Job's ApplicationMaster so that it is available to YARN ShuffleService before the data is presented to the Reducer
B. The Mapper stores the intermediate data in HDFS on the node where the Map tasks ran, in the HDFS /usercache/${user}/appcache/application_${appid} directory for the user who ran the job
C. The Mapper transfers the intermediate data immediately to the reducers as it is generated by the Map Task
D. YARN holds the intermediate data in the NodeManager's memory (a container) until it is transferred to the Reducer
E. The Mapper stores the intermediate data on the underlying filesystem of the local disk in the directories specified by yarn.nodemanager.local-dirs
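The usercache/appcache path and the yarn.nodemanager.local-dirs setting mentioned in the options can be related with a short sketch. It assumes the setting holds a comma-separated list of local directories and that each one contains a usercache/<user>/appcache/application_<appid> subtree. The directory names, user, and application ID below are hypothetical.

```python
import posixpath


def app_cache_dirs(local_dirs, user, app_id):
    """Build the per-application cache path under each configured local dir."""
    paths = []
    for base in local_dirs.split(","):
        base = base.strip()
        if base:
            paths.append(
                posixpath.join(base, "usercache", user, "appcache",
                               "application_" + app_id)
            )
    return paths


# Hypothetical value of yarn.nodemanager.local-dirs on one worker node.
local_dirs = "/data/1/yarn/local, /data/2/yarn/local"
for path in app_cache_dirs(local_dirs, "alice", "1400000000000_0001"):
    print(path)
```

These are paths on the worker node's local filesystem, one per configured directory, which is how a node can spread intermediate output across several disks.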
On a cluster running CDH 5.0 or above, you use the hadoop fs -put command to write a 300 MB file into a previously empty directory using an HDFS block size of 64 MB. Just after this command has finished writing 200 MB of this file, what would another user see when they look in the directory?
A. The directory will appear to be empty until the entire file write is completed on the cluster
B. They will see the file with a ._COPYING_ extension on its name. If they view the file, they will see contents of the file up to the last completed block (as each 64MB block is written, that block becomes available)
C. They will see the file with a ._COPYING_ extension on its name. If they attempt to view the file, they will get a ConcurrentFileAccessException until the entire file write is completed on the cluster
D. They will see the file with its original name. If they attempt to view the file, they will get a ConcurrentFileAccessException until the entire file write is completed on the cluster
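The block arithmetic behind this scenario can be checked with a few lines. This is only a sketch of the numbers in the question (300 MB file, 64 MB blocks, 200 MB written so far); it does not model what HDFS makes visible to other clients, and the function name is made up for illustration.

```python
import math

MB = 1024 * 1024


def block_progress(bytes_written, block_size):
    """Return (fully written blocks, bytes in the block still being written)."""
    completed, partial = divmod(bytes_written, block_size)
    return completed, partial


total_blocks = math.ceil(300 * MB / (64 * MB))
completed, partial = block_progress(200 * MB, 64 * MB)
print("blocks in the finished file:", total_blocks)
print("completed blocks so far:", completed)
print("MB written into the current block:", partial // MB)
```

With these figures the finished file occupies 5 blocks; at the 200 MB mark, 3 blocks (192 MB) are complete and the fourth holds 8 MB.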
Your cluster is running MapReduce version 2 (MRv2) on YARN. Your ResourceManager is configured to use the FairScheduler. Now you want to configure your scheduler such that a new user on the cluster can submit jobs into their own queue at application submission. Which configuration should you set?
A. You can specify new queue name when user submits a job and new queue can be created dynamically if the property yarn.scheduler.fair.allow-undeclared-pools = true
B. yarn.scheduler.fair.user-as-default-queue = false and yarn.scheduler.fair.allow-undeclared-pools = true
C. You can specify new queue name when user submits a job and new queue can be created dynamically if yarn.scheduler.fair.user-as-default-queue = false
D. You can specify new queue name per application in allocations.xml file and have new jobs automatically assigned to the application queue
D. You can specify new queue name per application in allocations.xml file and have new jobs automatically assigned to the application queue
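For readers unfamiliar with how such properties are written down, the sketch below renders the two Fair Scheduler property names from the options as Hadoop-style <property> entries. The values are placeholders copied from the options to show the syntax, not a recommended setting, and the helper name is hypothetical.

```python
import xml.etree.ElementTree as ET


def render_properties(props):
    """Render name/value pairs as a Hadoop-style <configuration> document."""
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


# Placeholder values for illustration only.
fair_scheduler_props = {
    "yarn.scheduler.fair.allow-undeclared-pools": "true",
    "yarn.scheduler.fair.user-as-default-queue": "false",
}
xml_text = render_properties(fair_scheduler_props)
print(xml_text)
```

The output is a single-line XML string; in practice the same entries would sit inside the existing <configuration> element of yarn-site.xml.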
On a cluster running MapReduce v2 (MRv2) on YARN, a MapReduce job is given a directory of 10 plain text files as its input directory. Each file is made up of 3 HDFS blocks. How many Mappers will run?
A. We cannot say; the number of Mappers is determined by the ResourceManager
B. We cannot say; the number of Mappers is determined by the developer
C. 30
D. 3
E. 10
F. We cannot say; the number of mappers is determined by the ApplicationMaster
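The arithmetic in this question can be sketched as follows, assuming uncompressed (splittable) text files, a split size equal to the HDFS block size, and the 1.1 "slop" factor that FileInputFormat applies to the last split of a file. The 128 MB block size is an assumed value; the question does not state one.

```python
def count_input_splits(file_sizes, split_size, slop=1.1):
    """Count input splits the way FileInputFormat-style splitting would."""
    total = 0
    for size in file_sizes:
        remaining = size
        # Carve off full-size splits while clearly more than one remains.
        while remaining / split_size > slop:
            total += 1
            remaining -= split_size
        # Whatever is left (up to 1.1 x split_size) becomes the final split.
        if remaining > 0:
            total += 1
    return total


BLOCK = 128 * 1024 * 1024          # assumed block size
files = [3 * BLOCK] * 10           # 10 files of 3 full blocks each
splits = count_input_splits(files, BLOCK)
print("input splits:", splits)
```

Under these assumptions the count is 30, one map task per split. A file whose third block is very small could yield only 2 splits because of the slop factor, so the block count is an approximation rather than a guarantee.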
Your Hadoop cluster is configured with HDFS and MapReduce version 2 (MRv2) on YARN. Can you configure a worker node to run a NodeManager daemon but not a DataNode daemon and still have a functional cluster?
A. Yes. The daemon will receive data from the NameNode to run Map tasks
B. Yes. The daemon will get data from another (non-local) DataNode to run Map tasks
C. Yes. The daemon will receive Map tasks only
D. Yes. The daemon will receive Reducer tasks only