21. Which distributed computing framework is known for its ability to handle real-time stream processing and complex event processing (CEP)?
A. Apache Kafka
B. Apache HBase
C. Apache Spark Streaming
D. Apache Hive
Answer: Option C

22. What is the primary advantage of using Apache Kafka in a big data architecture?
A. Real-time data processing
B. Distributed database storage
C. Batch processing of large datasets
D. Data visualization
Answer: Option A

23. In distributed computing, what is the purpose of a "Reducer" in the MapReduce programming model?
A. To split data into smaller chunks
B. To process and aggregate data from Mapper tasks
C. To store data in the HDFS
D. To visualize data relationships
Answer: Option B

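The Reducer's role of aggregating Mapper output can be illustrated with a small word-count sketch in plain Python. This is not Hadoop's API: the function name `reduce_word_count` and the sample `mapper_output` list are hypothetical, and the grouping step that the framework normally performs is done by hand here.

```python
from collections import defaultdict

def reduce_word_count(key, values):
    """Reducer: collapse all values emitted for one key into a single total."""
    return key, sum(values)

# Hypothetical intermediate (key, value) pairs produced by several Mapper tasks.
mapper_output = [("spark", 1), ("kafka", 1), ("spark", 1), ("hive", 1), ("spark", 1)]

# Group values by key; a real framework does this before invoking the Reducer.
grouped = defaultdict(list)
for key, value in mapper_output:
    grouped[key].append(value)

# Call the Reducer once per key and collect the aggregated results.
results = dict(reduce_word_count(k, v) for k, v in grouped.items())
print(results)
```

Each key reaches the Reducer exactly once with all of its values, which is why the aggregation can be a simple `sum`.
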
24. Which distributed computing framework is designed for processing large-scale graph data, such as social networks or network analysis?
A. Apache Kafka
B. Apache HBase
C. Apache Spark GraphX
D. Apache Hive
Answer: Option C

25. What is the primary goal of shuffling and sorting in the MapReduce programming model?
A. To maximize data storage capacity
B. To optimize job scheduling and resource management
C. To reorganize data for Reducer tasks
D. To increase data variety
Answer: Option C

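As a rough single-process sketch of what "reorganizing data for Reducer tasks" means, the snippet below sorts intermediate pairs by key and groups the values belonging to each key. The helper `shuffle_and_sort` and the sample `pairs` are hypothetical; a real cluster also partitions keys across Reducer nodes and moves the data over the network, which is omitted here.

```python
from itertools import groupby
from operator import itemgetter

def shuffle_and_sort(pairs):
    """Sort intermediate pairs by key, then collect the values for each key."""
    ordered = sorted(pairs, key=itemgetter(0))  # stable sort keeps value order per key
    return [
        (key, [value for _, value in group])
        for key, group in groupby(ordered, key=itemgetter(0))
    ]

# Hypothetical unsorted output from Mapper tasks.
pairs = [("b", 2), ("a", 1), ("b", 3), ("a", 4)]
shuffled = shuffle_and_sort(pairs)
print(shuffled)
```

After this step every key appears once, paired with the full list of its values, which is the input shape a Reducer expects.
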
26. In the context of big data processing, what does the term "ETL" stand for?
A. Extract, Transform, Load
B. Evaluate, Test, Launch
C. Export, Transmit, Learn
D. Encode, Transmit, Log
Answer: Option A

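The three ETL stages can be sketched with the Python standard library alone. Everything specific here is an assumption made for illustration: the inline `RAW_CSV` data, the `payments` table, and the rule that rows with a non-numeric amount are dropped during the transform step.

```python
import csv
import io
import sqlite3

# Hypothetical raw input; a real pipeline would read from files, APIs, or queues.
RAW_CSV = "customer,amount\nalice,10.5\nbob,not_a_number\nalice,4.5\n"

def extract(text):
    """Extract: parse the raw CSV text into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: convert amounts to floats and skip rows that fail to parse."""
    clean = []
    for row in rows:
        try:
            clean.append((row["customer"], float(row["amount"])))
        except ValueError:
            continue
    return clean

def load(rows, conn):
    """Load: write the cleaned rows into a target table."""
    conn.execute("CREATE TABLE IF NOT EXISTS payments (customer TEXT, amount REAL)")
    conn.executemany("INSERT INTO payments VALUES (?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")
load(transform(extract(RAW_CSV)), conn)
totals = conn.execute(
    "SELECT customer, SUM(amount) FROM payments GROUP BY customer"
).fetchall()
print(totals)
```

Keeping the stages as separate functions makes it easy to swap the source or the target without touching the cleaning logic.
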
27. What is the primary role of a "Name Node" in the Hadoop Distributed File System (HDFS)?
A. Storing metadata
B. Managing job scheduling
C. Storing and managing data blocks
D. Managing data visualization
Answer: Option A

28. Which distributed computing framework is known for its support of graph algorithms and is often used for analyzing large-scale graph data?
A. Apache Kafka
B. Apache HBase
C. Apache Spark GraphX
D. Apache Hive
Answer: Option C

29. In big data analytics, what is the primary challenge associated with "data silos"?
A. Limited data volume
B. Limited data variety
C. Limited data velocity
D. Limited data scalability
Answer: Option B

30. What is the primary purpose of a "Mapper" in the MapReduce programming model?
A. To split data into smaller chunks
B. To process and aggregate data from Reducer tasks
C. To store data in the HDFS
D. To visualize data relationships
Answer: Option A

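In the usual description of MapReduce, a Mapper takes one piece of the input (an input split) and breaks each record into intermediate key-value pairs. A minimal word-count sketch of that idea follows; the generator `map_words` and the sample `split` are hypothetical and do not use Hadoop's API.

```python
def map_words(line):
    """Mapper: emit a (word, 1) pair for every word in one input record."""
    for word in line.lower().split():
        yield word, 1

# Hypothetical input split: the lines assigned to a single Mapper task.
split = ["Spark streams data", "Kafka streams events"]

# Run the Mapper over every record in the split and collect the emitted pairs.
intermediate = [pair for line in split for pair in map_words(line)]
print(intermediate)
```

The Mapper performs no aggregation: the word "streams" is emitted twice, and combining those pairs is left to later stages.
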