diff --git a/README.md b/README.md
index 98ca5b4..acd627e 100644
--- a/README.md
+++ b/README.md
@@ -48,6 +48,7 @@ Open Source Handbook is a resource for people of **all skill and experience leve
 - [Audio visualization](https://github.com/willianjusten/awesome-audio-visualization)
 - [Big data](categories/big-data.md)
 - [Datasets](https://github.com/awesomedata/awesome-public-datasets/blob/master/README.rst)
+ - [Cloud Native & DevOps](categories/cloud-native.md)
 - [Frameworks](https://github.com/topics/framework)
 - [Gaming](https://gist.github.com/roachhd/d579b58148d7e36a6b72)
 - [iOS development](https://github.com/dkhamsing/open-source-ios-apps/blob/master/APPSTORE.md#apple-watch)
@@ -60,6 +61,10 @@ Open Source Handbook is a resource for people of **all skill and experience leve
 - Web development
 - [Front-end tools and resources](https://github.com/MilanAryal/web-development-resources)
 - [GitHub Pages](categories/github-pages.md)
+ - [Node.js](https://github.com/nodejs/node) - JavaScript runtime built on Chrome's V8 engine
+ - [React](https://github.com/facebook/react) - JavaScript library for building user interfaces
+ - [Vue.js](https://github.com/vuejs/vue) - Progressive JavaScript framework
+ - [Angular](https://github.com/angular/angular) - Platform for building mobile and desktop web applications
 [return to top](README.md)
diff --git a/categories/big-data.md b/categories/big-data.md
index c198e1b..9f19c41 100644
--- a/categories/big-data.md
+++ b/categories/big-data.md
@@ -1,14 +1,71 @@

Open Source Handbook

Big Data Projects

- - [Apache Crunch](https://github.com/apache/crunch)
- - [Apache Hadoop](https://github.com/apache/hadoop)
- - [Apache Kafka](https://github.com/apache/kafka)
- - [Apache Samoa](https://github.com/apache/incubator-samoa)
- - [Apache Storm](https://github.com/apache/storm)
- - [Elasticsearch](https://github.com/elastic/elasticsearch)
- - [HPCC Systems](https://github.com/hpcc-systems/HPCC-Platform)
- - [Lumify](https://github.com/lumifyio/lumify)
- - [MongoDB](https://github.com/mongodb)
- - [RapidMiner](https://github.com/rapidminer)
- - [Talend Open Studio for Big Data](https://github.com/Talend)
+## Data Processing & Analytics
+
+### Stream Processing
+- [Apache Kafka](https://github.com/apache/kafka) - Distributed streaming platform for building real-time data pipelines
+- [Apache Storm](https://github.com/apache/storm) - Real-time computation system for processing streams of data
+- [Apache Flink](https://github.com/apache/flink) - Stream processing framework for distributed, high-performing data streaming applications
+- [Apache Pulsar](https://github.com/apache/pulsar) - Cloud-native, distributed messaging and streaming platform
+
+### Batch Processing
+- [Apache Hadoop](https://github.com/apache/hadoop) - Framework for distributed storage and processing of large datasets
+- [Apache Spark](https://github.com/apache/spark) - Unified analytics engine for large-scale data processing
+- [Apache Crunch](https://github.com/apache/crunch) - Java library for writing MapReduce pipelines
+- [Apache Samoa](https://github.com/apache/incubator-samoa) - Distributed streaming machine learning framework
+
+## Data Storage & Databases
+
+### NoSQL Databases
+- [MongoDB](https://github.com/mongodb/mongo) - Document-oriented NoSQL database
+- [Apache Cassandra](https://github.com/apache/cassandra) - Highly scalable distributed NoSQL database
+- [Redis](https://github.com/redis/redis) - In-memory data structure store
+- [ClickHouse](https://github.com/ClickHouse/ClickHouse) - Column-oriented database for analytics
+
+### Search & Analytics
+- [Elasticsearch](https://github.com/elastic/elasticsearch) - Distributed search and analytics engine
+- [Apache Solr](https://github.com/apache/solr) - Enterprise search platform
+- [OpenSearch](https://github.com/opensearch-project/OpenSearch) - Community-driven search and analytics suite
+
+## Data Tools & Platforms
+
+### Workflow Management
+- [Apache Airflow](https://github.com/apache/airflow) - Platform for developing, scheduling, and monitoring workflows
+- [Prefect](https://github.com/PrefectHQ/prefect) - Modern workflow orchestration framework
+- [Dagster](https://github.com/dagster-io/dagster) - Data orchestrator for machine learning, analytics, and ETL
+
+### Data Integration & ETL
+- [Talend Open Studio for Big Data](https://github.com/Talend/tdi-studio-se) - Open source data integration platform
+- [Apache NiFi](https://github.com/apache/nifi) - System for processing and distributing data
+- [Singer](https://github.com/singer-io) - Open source standard for writing scripts that move data
+
+### Analytics & Visualization
+- [Apache Superset](https://github.com/apache/superset) - Modern data exploration and visualization platform
+- [Metabase](https://github.com/metabase/metabase) - Business intelligence tool for everyone in your company
+- [Grafana](https://github.com/grafana/grafana) - Observability and data visualization platform
+
+## Specialized Platforms
+
+### Machine Learning & AI
+- [MLflow](https://github.com/mlflow/mlflow) - Machine learning lifecycle management
+- [Kubeflow](https://github.com/kubeflow/kubeflow) - Machine learning toolkit for Kubernetes
+- [Apache Mahout](https://github.com/apache/mahout) - Distributed linear algebra framework
+
+### Data Lakes & Warehouses
+- [Apache Iceberg](https://github.com/apache/iceberg) - High-performance format for huge analytic tables
+- [Delta Lake](https://github.com/delta-io/delta) - Storage framework that brings ACID transactions to Apache Spark
+- [Apache Hudi](https://github.com/apache/hudi) - Transactional data lake platform
+
+### Legacy & Specialized
+- [HPCC Systems](https://github.com/hpcc-systems/HPCC-Platform) - Massive parallel-processing computing platform
+- [RapidMiner](https://github.com/rapidminer/rapidminer-studio) - Data science platform for teams
+
+## Getting Started Tips
+
+- **For Beginners**: Start with Apache Spark or Elasticsearch - they have great documentation and active communities
+- **For Data Engineers**: Check out Apache Airflow for workflow management or Apache Kafka for streaming
+- **For Analysts**: Try Apache Superset or Metabase for visualization projects
+- **Good First Issues**: Look for repositories with "good first issue" or "beginner-friendly" labels
+
+[return to top](../README.md)
diff --git a/categories/cloud-native.md b/categories/cloud-native.md
new file mode 100644
index 0000000..43d3861
--- /dev/null
+++ b/categories/cloud-native.md
@@ -0,0 +1,81 @@
+

Open Source Handbook

+

Cloud Native & DevOps Projects

+
+## Container Orchestration
+
+### Kubernetes Ecosystem
+- [Kubernetes](https://github.com/kubernetes/kubernetes) - Container orchestration platform
+- [Helm](https://github.com/helm/helm) - Package manager for Kubernetes
+- [Istio](https://github.com/istio/istio) - Service mesh for microservices
+- [Linkerd](https://github.com/linkerd/linkerd2) - Ultralight service mesh for Kubernetes
+
+### Container Runtimes
+- [Docker](https://github.com/moby/moby) - Container platform (Moby project)
+- [Podman](https://github.com/containers/podman) - Daemonless container engine
+- [containerd](https://github.com/containerd/containerd) - Industry-standard container runtime
+
+## CI/CD & Automation
+
+### Continuous Integration
+- [Jenkins](https://github.com/jenkinsci/jenkins) - Automation server for CI/CD
+- [GitLab CI](https://github.com/gitlabhq/gitlabhq) - Complete DevOps platform
+- [Tekton](https://github.com/tektoncd/pipeline) - Cloud-native CI/CD building blocks
+- [Drone](https://github.com/harness/drone) - Container-native CI/CD platform
+
+### Infrastructure as Code
+- [Terraform](https://github.com/hashicorp/terraform) - Infrastructure provisioning tool
+- [Pulumi](https://github.com/pulumi/pulumi) - Modern infrastructure as code
+- [Ansible](https://github.com/ansible/ansible) - IT automation platform
+- [Chef](https://github.com/chef/chef) - Configuration management tool
+
+## Monitoring & Observability
+
+### Metrics & Monitoring
+- [Prometheus](https://github.com/prometheus/prometheus) - Monitoring system and time series database
+- [Grafana](https://github.com/grafana/grafana) - Observability and data visualization platform
+- [Jaeger](https://github.com/jaegertracing/jaeger) - Distributed tracing platform
+- [OpenTelemetry](https://github.com/open-telemetry) - Observability framework
+
+### Logging
+- [Fluentd](https://github.com/fluent/fluentd) - Data collector for unified logging layer
+- [Logstash](https://github.com/elastic/logstash) - Server-side data processing pipeline
+- [Vector](https://github.com/vectordotdev/vector) - High-performance observability data pipeline
+
+## Service Mesh & Networking
+
+- [Envoy](https://github.com/envoyproxy/envoy) - Cloud-native high-performance edge/middle/service proxy
+- [Consul](https://github.com/hashicorp/consul) - Service networking solution
+- [Traefik](https://github.com/traefik/traefik) - Modern HTTP reverse proxy and load balancer
+- [NGINX](https://github.com/nginx/nginx) - HTTP and reverse proxy server
+
+## Security & Policy
+
+- [Open Policy Agent (OPA)](https://github.com/open-policy-agent/opa) - Policy engine for cloud native environments
+- [Falco](https://github.com/falcosecurity/falco) - Runtime security monitoring
+- [Trivy](https://github.com/aquasecurity/trivy) - Vulnerability scanner for containers
+- [Cert-Manager](https://github.com/cert-manager/cert-manager) - X.509 certificate management for Kubernetes
+
+## Storage & Databases
+
+### Cloud-Native Storage
+- [Rook](https://github.com/rook/rook) - Storage orchestrator for Kubernetes
+- [Longhorn](https://github.com/longhorn/longhorn) - Distributed block storage system for Kubernetes
+- [OpenEBS](https://github.com/openebs/openebs) - Container-attached storage
+
+### Cloud-Native Databases
+- [CockroachDB](https://github.com/cockroachdb/cockroach) - Distributed SQL database
+- [TiDB](https://github.com/pingcap/tidb) - Distributed HTAP database
+- [Vitess](https://github.com/vitessio/vitess) - Database clustering system for horizontal scaling of MySQL
+
+## Getting Started Tips
+
+- **For Beginners**: Start with Docker and Kubernetes basics, then explore Helm for package management
+- **For DevOps Engineers**: Check out Prometheus + Grafana for monitoring or Terraform for infrastructure
+- **For Security**: Try Trivy for vulnerability scanning or Falco for runtime security
+- **Good First Issues**: Many CNCF projects have excellent "good first issue" labels and mentorship programs
+
+## CNCF Landscape
+
+Most of these projects are part of the [Cloud Native Computing Foundation (CNCF)](https://landscape.cncf.io/), which provides excellent resources for contributors and maintains a comprehensive landscape of cloud-native technologies.
+
+[return to top](../README.md)
\ No newline at end of file