Apache Hadoop Developer Certification Questions (108 Q&As)

Are you preparing for the Apache Hadoop Developer exam? Are you looking for study materials for the Apache Hadoop Developer test? You can download the Apache Hadoop Developer PDF demo from the PassQuestion website for free. Our Apache Hadoop Developer Certification Questions can help you prepare to pass the Apache Hadoop Developer exam. Let us help you climb the ladder of success and pass your Hadoop 2.0 Certification exam for Pig and Hive Developer now!

Test Online Apache Hadoop Developer Free Questions

1. Which one of the following statements describes a Pig bag, tuple, and map, respectively?


2. You want to run Hadoop jobs on your development workstation for testing before you submit them to your production cluster.

Which mode of operation in Hadoop allows you to most closely simulate a production cluster while using a single machine?


3. Which HDFS command uploads a local file X into an existing HDFS directory Y?


4. In Hadoop 2.0, which TWO of the following processes work together to provide automatic failover of the NameNode? Choose 2 answers


5. To use a Java user-defined function (UDF) with Pig, what must you do?


6. When is the earliest point at which the reduce method of a given Reducer can be called?


7. Which one of the following statements describes the relationship between the ResourceManager and the ApplicationMaster?


8. Which HDFS command copies an HDFS file named foo to the local filesystem as localFoo?


9. You need to perform statistical analysis in your MapReduce job and would like to call methods in the Apache Commons Math library, which is distributed as a 1.3 megabyte Java archive (JAR) file.

Which is the best way to make this library available to your MapReduce job at runtime?


10. In a MapReduce job with 500 map tasks, how many map task attempts will there be?


11. You want to count the number of occurrences for each unique word in the supplied input data. You’ve decided to implement this by having your mapper tokenize each word and emit a literal value 1, and then have your reducer increment a counter for each literal 1 it receives. After successfully implementing this, it occurs to you that you could optimize this by specifying a combiner.

Will you be able to reuse your existing Reducer as your combiner in this case, and why or why not?
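
The scenario in question 11 can be explored with a minimal sketch in plain Java. It does not use the Hadoop API: the class name `CombinerSketch`, the methods `countValues` and `sumValues`, and the sample data are all hypothetical, chosen only to contrast a reducer that adds one per value received with a reducer that adds up the values themselves, once each is also applied as a combiner.

```java
import java.util.Arrays;
import java.util.List;

public class CombinerSketch {
    // Hypothetical reducer logic 1: adds one per value received,
    // ignoring what the value actually is.
    public static int countValues(List<Integer> values) {
        int count = 0;
        for (int ignored : values) {
            count++;
        }
        return count;
    }

    // Hypothetical reducer logic 2: adds up the values themselves.
    public static int sumValues(List<Integer> values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    public static void main(String[] args) {
        // Assumed input: two map tasks emitted ("hadoop", 1) three times and twice.
        List<Integer> mapTask1 = Arrays.asList(1, 1, 1);
        List<Integer> mapTask2 = Arrays.asList(1, 1);

        // Same logic applied first per map task (combiner), then once more (reducer).
        List<Integer> partialCounts =
                Arrays.asList(countValues(mapTask1), countValues(mapTask2));
        System.out.println("count-based total: " + countValues(partialCounts));

        List<Integer> partialSums =
                Arrays.asList(sumValues(mapTask1), sumValues(mapTask2));
        System.out.println("sum-based total: " + sumValues(partialSums));
    }
}
```

Running the sketch shows the two totals diverging once partial results replace the literal 1s, which is the behavior the question asks you to reason about.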


12. What data does a Reducer's reduce method process?


13. Given a directory of files with the following structure: line number, tab character, string:





You want to send each line as one record to your Mapper.

Which InputFormat should you use to complete the line: conf.setInputFormat(____.class); ?


14. Examine the following Hive statements:

Assuming the statements above execute successfully, which one of the following statements is true?


15. When can a reduce class also serve as a combiner without affecting the output of a MapReduce program?


16. What does the following WebHDFS command do?

curl -i -L "http://host:port/webhdfs/v1/foo/bar?op=OPEN"


17. You need to run the same job many times with minor variations. Rather than hardcoding all job configuration options in your driver code, you’ve decided to have your Driver subclass org.apache.hadoop.conf.Configured and implement the org.apache.hadoop.util.Tool interface.

Identify which invocation correctly passes mapred.job.name with a value of Example to Hadoop.


18. Determine which best describes when the reduce method is first called in a MapReduce job?


19. You have a directory named jobdata in HDFS that contains four files: _first.txt, second.txt, .third.txt and #data.txt.

How many files will be processed by the FileInputFormat.setInputPaths() command when it’s given a path object representing this directory?
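
As background for question 19, FileInputFormat by default skips input files whose names begin with an underscore or a period. The following is a minimal plain-Java sketch of that naming rule only; it does not call Hadoop, and the class name `HiddenFileFilterSketch` and its methods are hypothetical names introduced for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class HiddenFileFilterSketch {
    // Names starting with "_" or "." are treated as hidden and skipped.
    public static boolean isVisible(String fileName) {
        return !fileName.startsWith("_") && !fileName.startsWith(".");
    }

    // Keeps only the file names that pass the visibility rule, in order.
    public static List<String> visibleFiles(List<String> fileNames) {
        List<String> visible = new ArrayList<>();
        for (String name : fileNames) {
            if (isVisible(name)) {
                visible.add(name);
            }
        }
        return visible;
    }

    public static void main(String[] args) {
        // The four file names from the jobdata directory in the question.
        List<String> jobdata =
                Arrays.asList("_first.txt", "second.txt", ".third.txt", "#data.txt");
        System.out.println(visibleFiles(jobdata));
    }
}
```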


20. In a large MapReduce job with m mappers and n reducers, how many distinct copy operations will there be in the sort/shuffle phase?
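
One way to reason about question 20 is to enumerate the fetches directly. The sketch below is plain Java, not Hadoop code, and it assumes that every map task produces one partition for every reducer and that each partition is fetched exactly once; the class `ShuffleCopySketch` and the method `countCopies` are hypothetical.

```java
public class ShuffleCopySketch {
    // Counts one copy per (reducer, mapper) pair, assuming each reducer
    // fetches its partition from every mapper exactly once.
    public static long countCopies(int mappers, int reducers) {
        long copies = 0;
        for (int r = 0; r < reducers; r++) {
            for (int m = 0; m < mappers; m++) {
                copies++;
            }
        }
        return copies;
    }

    public static void main(String[] args) {
        // Example sizes chosen only for illustration.
        System.out.println(countCopies(4, 3));
    }
}
```

Failed or re-executed tasks would add fetches beyond this count, so the sketch describes only the failure-free case.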


21. Which Hadoop component is responsible for managing the distributed file system metadata?


22. Review the following data and Pig code.






A = LOAD 'data' USING PigStorage('.') as (gender:chararray, age:int, zip:chararray);


Which one of the following commands would save the results of B to a folder in hdfs named myoutput?


23. MapReduce v2 (MRv2/YARN) splits which major functions of the JobTracker into separate daemons? Select two.


24. Assuming the following Hive query executes successfully:

Which one of the following statements describes the result set?


25. Given the following Pig commands:

Which one of the following statements is true?


26. What does Pig provide to the overall Hadoop solution?


27. What types of algorithms are difficult to express in MapReduce v1 (MRv1)?


28. You need to create a job that does frequency analysis on input data. You will do this by writing a Mapper that uses TextInputFormat and splits each value (a line of text from an input file) into individual characters. For each one of these characters, you will emit the character as a key and an IntWritable as the value.

As this will produce proportionally more intermediate data than input data, which two resources should you expect to be bottlenecks?


29. Which one of the following statements regarding the components of YARN is FALSE?


30. You are developing a combiner that takes as input Text keys, IntWritable values, and emits Text keys, IntWritable values.

Which interface should your class implement?


31. Which one of the following Hive commands uses an HCatalog table named x?


32. Given the following Pig command:

logevents = LOAD 'input/my.log' AS (date:chararray, level:string, code:int, message:string);

Which one of the following statements is true?


33. Consider the following two relations, A and B.


34. Given the following Hive commands:

Which one of the following statements is true?


35. In a MapReduce job, the reducer receives all values associated with the same key.

Which statement best describes the ordering of these values?


36. Which describes how a client reads a file from HDFS?


37. For each input key-value pair, mappers can emit:


38. You write a MapReduce job to process 100 files in HDFS. Your MapReduce algorithm uses TextInputFormat: the mapper applies a regular expression over input values and emits key-value pairs with the key consisting of the matching text, and the value containing the filename and byte offset. Determine the difference between setting the number of reducers to one and setting the number of reducers to zero.


39. In Hadoop 2.0, which one of the following statements is true about a standby NameNode?

The Standby NameNode:


40. In the reducer, the MapReduce API provides you with an iterator over Writable values.

What does calling the next() method return?

