HPC/Big Data Certification
What qualifies as a Big Data workload? - ✔✔· Consists of semi-structured or unstructured
data not suitable for relational databases
· Data volume is considered too large for other solutions (Petabyte scale)
· To process the data in a reasonable timeframe, a massively parallel solution is required
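A rough back-of-the-envelope sketch shows why Petabyte-scale volume leads to the need for a massively parallel solution. The per-node throughput (1 GB/s) and the node counts are illustrative assumptions, not figures from this guide, and the model ignores coordination overhead.

```python
# Rough estimate of how long one full scan of a dataset takes when the
# work is split evenly across nodes. All figures are illustrative
# assumptions; real clusters lose some efficiency to coordination.

PETABYTE = 10**15  # bytes (decimal petabyte)


def scan_hours(data_bytes: int, nodes: int, bytes_per_sec_per_node: float) -> float:
    """Hours needed to read data_bytes once with perfectly even parallelism."""
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    seconds = data_bytes / (nodes * bytes_per_sec_per_node)
    return seconds / 3600


# Assumed throughput: 1 GB/s per node.
single = scan_hours(PETABYTE, nodes=1, bytes_per_sec_per_node=1e9)
cluster = scan_hours(PETABYTE, nodes=100, bytes_per_sec_per_node=1e9)

print(f"1 node: {single / 24:.1f} days")
print(f"100 nodes: {cluster:.1f} hours")
```

Under these assumptions a single node needs more than eleven days to read one petabyte once, while 100 nodes finish in under three hours.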
What are the most common Big Data workloads? - ✔✔· Batch processing
· In-memory processing
· ML (typically GPU-based)
What are the challenges customers run into for Big Data workloads on premises? - ✔✔·
Tracking growth patterns and scaling infrastructure to meet capacity requirements
· Time associated with procuring, deploying and maintaining infrastructure to meet demand
· The cost associated with processing this data using other methods
· The cost associated with Disaster Recovery when dealing with Petabytes of data
· The cost and complexity associated with hardware refresh
Customers running Big Data workloads on OCI can ___? - ✔✔· Dynamically scale capacity
against demand
· Leverage Object Storage as a cost-effective Data Lake and for Disaster Recovery
· Take advantage of the best price/performance in the cloud
· Use OCI's managed service offerings to easily deploy and run common Big Data frameworks
What is Big Data Appliance (BDA)? - ✔✔Single-tenant, Cloudera-based hardware appliance
deployed on-premises
What does Big Data Appliance (BDA) include? - ✔✔· Cloudera Enterprise Data Hub (EDH)
v5.12
· Big Data Manager
· Big Data SQL
What is Oracle Big Data Service (BDS)? - ✔✔· Multitenant, managed Cloudera EDH Hadoop
deployment
What does Oracle Big Data Service (BDS) include? - ✔✔· Cloudera EDH v5.16.1 or v6.2.0
· Big Data Manager
· Big Data SQL
What is the difference between Big Data Cloud Service (BDCS) and Big Data Service (BDS)? -
✔✔· BDCS - Gen1, BDS - Gen2
· BDCS - Cloudera EDH v5.16.x, BDS - Cloudera EDH v5.16.1 or v6.2.0
· BDCS - deprecated, BDS - license included in consumption
What is Oracle Data Flow (ODF)? - ✔✔· Provides a serverless framework for running
Spark-based workloads
Where do customers put data and application code for Oracle Data Flow (ODF) applications? -
✔✔· Object Storage
Oracle Data Flow (ODF) provides support for what type of applications? - ✔✔· Java
· Python
· SQL
· Scala
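Because both data and application code live in Object Storage, a Data Flow application typically refers to them by URI. The sketch below assumes the `oci://<bucket>@<namespace>/<path>` scheme used by OCI's HDFS connector. The helper name `oci_uri` and the bucket, namespace, and object names are hypothetical and not taken from this guide.

```python
# Minimal helper for building Object Storage URIs of the form
# oci://<bucket>@<namespace>/<path>. The function name and the example
# bucket/namespace values are hypothetical, for illustration only.


def oci_uri(bucket: str, namespace: str, object_path: str) -> str:
    """Return an oci:// URI for an object in an Object Storage bucket."""
    if not bucket or not namespace:
        raise ValueError("bucket and namespace are required")
    # Strip leading slashes so the URI never contains '//' after the namespace.
    return f"oci://{bucket}@{namespace}/{object_path.lstrip('/')}"


app_file = oci_uri("my-apps", "my-namespace", "jobs/etl_job.py")
input_data = oci_uri("my-data-lake", "my-namespace", "/raw/events/")
print(app_file)
print(input_data)
```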
What is Oracle Data Science? - ✔✔· Platform for data scientists to create projects that run
notebook-based modeling on demand
What services does Oracle Data Science use? - ✔✔· Compute
· Block Storage
What shapes are available for Oracle Data Science Notebook Sessions? - ✔✔· VM.Standard.E2.2,
VM.Standard.E2.4, VM.Standard.E2.8
· VM.Standard2.1, VM.Standard2.2, VM.Standard2.4, VM.Standard2.6, VM.Standard2.8,
VM.Standard2.16, VM.Standard2.24
What is Oracle Streaming Service? - ✔✔· Kafka-compatible producer/consumer service that
ingests continuous streams of data
Which Hadoop distributions are supported on OCI? - ✔✔· Cloudera
· Hortonworks
· MapR
Some self-managed Big Data Products are driven by ___? - ✔✔· OCI QuickStart program
· Marketplace
When deploying Hadoop on OCI, what should you consider? - ✔✔· Normalize either OCPU
or Memory against OCI shapes used as workers to meet workload requirements
· Use HDFS replication factor 3 when using DenseIO NVMe storage to mitigate hardware failure
· After normalizing OCPU or Memory, use heterogeneous storage on DenseIO workers
leveraging Block Storage to augment HDFS capacity - allows you to scale HDFS capacity
around workload
· Segregate cluster and storage network traffic on BM hosts when using Block Volumes for
HDFS by leveraging both physical VNICs - create a storage network for primary interface and
deploy Hadoop on secondary interface
· Use private IP networks for Hadoop cluster hosts - enable cluster access either by using edge
node(s) or private connectivity such as VPN or FastConnect
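The sizing steps above (normalize on OCPU or memory, then work out HDFS capacity at replication factor 3, optionally augmented with Block Storage) can be sketched as simple arithmetic. Every figure below is a hypothetical placeholder rather than the specification of a real OCI shape, and the model assumes the same replication factor for NVMe and Block Volume capacity and ignores HDFS overhead.

```python
import math

# Hypothetical sizing sketch. Replace every number with the real values
# for your workload and for the OCI worker shape you plan to use.


def workers_needed(required_ocpus, required_mem_gb,
                   ocpus_per_worker, mem_gb_per_worker,
                   normalize_on="ocpu"):
    """Worker count after normalizing on either OCPU or memory."""
    if normalize_on == "ocpu":
        return math.ceil(required_ocpus / ocpus_per_worker)
    if normalize_on == "memory":
        return math.ceil(required_mem_gb / mem_gb_per_worker)
    raise ValueError("normalize_on must be 'ocpu' or 'memory'")


def usable_hdfs_tb(workers, nvme_tb_per_worker,
                   block_tb_per_worker=0.0, replication=3):
    """Usable HDFS capacity: raw capacity divided by the replication factor."""
    raw_tb = workers * (nvme_tb_per_worker + block_tb_per_worker)
    return raw_tb / replication


# Assumed workload: 400 OCPUs and 5000 GB of memory.
# Assumed worker: 52 OCPUs, 768 GB memory, 51.2 TB local NVMe.
by_ocpu = workers_needed(400, 5000, 52, 768, normalize_on="ocpu")
by_mem = workers_needed(400, 5000, 52, 768, normalize_on="memory")

nvme_only = usable_hdfs_tb(by_ocpu, nvme_tb_per_worker=51.2)
with_block = usable_hdfs_tb(by_ocpu, nvme_tb_per_worker=51.2,
                            block_tb_per_worker=12.0)

print(f"workers (OCPU-normalized): {by_ocpu}")
print(f"workers (memory-normalized): {by_mem}")
print(f"usable HDFS, NVMe only: {nvme_only:.1f} TB")
print(f"usable HDFS, NVMe + Block Storage: {with_block:.1f} TB")
```

Normalizing on one dimension fixes the worker count; adding Block Volumes per worker then raises HDFS capacity without adding more workers, which is the point of the heterogeneous storage recommendation.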