BEAD 2025

This module is part of the ISS MTech Graduate Certificate series. It is common to both the Graduate Certificate in Big Data Analytics and the Graduate Certificate in Engineering Big Data series offered by NUS-ISS.

Participants will learn various aspects of data engineering while building data-intensive applications. They will learn to apply key practices, identify multiple data sources and appraise them against their business value, design the right storage, and implement proper access model(s). Finally, participants will build a scalable data pipeline solution composed of pluggable Hadoop components, selected from the combination of requirements in a vendor- and technology-agnostic manner. Participants will work with the Spark framework and its special-purpose processing libraries.

Key Takeaways

Upon successful completion of the course, participants will be able to:

- Understand the growth of big data and the need for a scalable processing framework.
- Understand the fundamental characteristics, storage, analysis techniques and the relevant distributions.
- Understand the distributed storage essentials, storage needs, and relevant architectural mechanisms for processing large amounts of structured, semi-structured and unstructured data.
- Gain expertise with a fault-tolerant computing framework (e.g. Hadoop or Kubernetes) by setting up pseudo-cluster nodes or cloud-based nodes for processing big data.
- Construct configurable and executable tasks using in-memory processing frameworks (e.g. Spark Core).
- Understand the nuances of writing functional programs and use the core libraries to manipulate a large corpus of unstructured data residing as Resilient Distributed Datasets (RDDs).
- Organize, store and manipulate the collected data using processing libraries, for example special statistical operations and stream processing tools (e.g. Spark Special Libraries).
- Understand the various data processing, querying and persistence APIs (e.g. Spark SQL APIs) available for use in the context of RDDs.
- Perform tasks such as filtering, selection and categorization.
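The functional style mentioned in the takeaways (filtering, selection, categorization) can be sketched in plain Python without a cluster. This is only an illustration under stated assumptions: the `records` data, the `categorize` helper and its thresholds are hypothetical and do not come from the course material.

```python
from collections import defaultdict
from functools import reduce

# Hypothetical sample records; not taken from the course material.
records = [
    {"sensor": "a", "reading": 12.5},
    {"sensor": "b", "reading": 48.0},
    {"sensor": "a", "reading": 71.2},
    {"sensor": "c", "reading": -1.0},  # invalid reading, filtered out below
]


def categorize(reading):
    """Assign a coarse label to a reading (thresholds are illustrative)."""
    if reading < 25:
        return "low"
    if reading < 60:
        return "medium"
    return "high"


# Filtering: keep only records with a non-negative reading.
valid = filter(lambda r: r["reading"] >= 0, records)

# Selection: project each record onto the fields of interest.
selected = map(lambda r: (r["sensor"], r["reading"]), valid)


# Categorization: group sensor names under the label of their reading.
def group(acc, pair):
    sensor, reading = pair
    acc[categorize(reading)].append(sensor)
    return acc


result = dict(reduce(group, selected, defaultdict(list)))
print(result)
```

The same three steps map loosely onto the `filter`, `map` and `groupByKey` transformations of the Spark RDD API, where they would run lazily and in parallel across partitions rather than on a single in-memory list.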

Folders and files

- PySparkMinIO
- PythonProject
- colab
- data
- z_IngestionExamples
- z_ScalaExamples
- z_ScalaSparkExamples
- .gitignore
- LICENSE
- README.md
