Apache Cassandra

Features

Feature Description
⭐ Low-Latency, Faster Writes: Cassandra writes go to append-only structures (a commit log and in-memory tables that are later flushed to immutable SSTables), so writes are generally very fast.
- Cassandra provides low latency at the cost of consistency.
- See the PACELC theorem for more information.
Rich data model: Cassandra is a wide-column store.
- Within a partition, Cassandra stores columns sorted by column name, which makes slicing a range of columns very quick.
- Unlike traditional databases, where column names are only metadata, in Cassandra column names can also carry actual data.
Peer-to-Peer Architecture: Cassandra has no single point of failure, since it uses a P2P architecture (leaderless replication).
- Nodes can be added to a Cassandra cluster in any of its data centers.
High Availability, Fault-Tolerance: Apache Cassandra provides high availability and fault tolerance with tunable consistency levels.
- Nodes can be added to or removed from the cluster without much disturbance.
- As the cluster scales out, read and write throughput both increase, with no downtime or pause for applications.
Scales Horizontally & Linearly: Apache Cassandra has a highly scalable architecture.
- A Cassandra cluster can easily be scaled out or in by adding or removing nodes.
- Generally, doubling the size of the cluster roughly doubles the read and write throughput it can sustain; under a fixed load, latency (both median and 99th percentile) tends to drop as well.
Replication Support (Cross-Site, Multi-Data-Center): Cassandra offers robust support for clusters spanning multiple data centers, with asynchronous leaderless replication allowing low-latency operations for all clients.
Integration with Other Systems (Spark, HDFS, etc.): Cassandra offers options for bulk-importing data from other sources (such as HDFS) by building entire SSTables and then streaming them into the cluster.
- Streaming SSTables into the cluster is much simpler, faster, and more efficient than sending millions of individual INSERT statements for the data you want to load.
Supported Consistency Patterns: Eventual consistency model
Cassandra Query Language (CQL): Cassandra ships with cqlsh, an interactive shell for the Cassandra Query Language.
- Using cqlsh, you can execute CQL statements: define a schema, insert data, and run queries.
- Cassandra does not support joins or subqueries, so developers must denormalize or duplicate data for efficient access.
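
The "no joins, so denormalize" point in the feature list above can be illustrated with a minimal in-memory sketch. It does not talk to Cassandra; the names `Post`, `posts_by_author`, `posts_by_topic`, and `save_post` are hypothetical and chosen only for illustration. The idea is to keep one table per query and write each record into every table that needs it, so each read is a single lookup by partition key.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Post:
    post_id: int
    author: str
    topic: str
    body: str


# One in-memory "table" per query pattern (hypothetical names).
# Each maps a partition key to the rows stored under it.
posts_by_author = defaultdict(list)
posts_by_topic = defaultdict(list)


def save_post(post: Post) -> None:
    """Write the same post into every query table; duplication replaces joins."""
    posts_by_author[post.author].append(post)
    posts_by_topic[post.topic].append(post)


save_post(Post(1, "alice", "databases", "Cassandra favours writes"))
save_post(Post(2, "bob", "databases", "Model tables around queries"))
save_post(Post(3, "alice", "music", "Playlists at scale"))

# Each read touches exactly one partition and needs no join.
alice_ids = [p.post_id for p in posts_by_author["alice"]]
database_ids = [p.post_id for p in posts_by_topic["databases"]]
print(alice_ids)     # [1, 3]
print(database_ids)  # [1, 2]
```

The trade-off is extra write work and storage in exchange for cheap reads, which fits a store where writes are fast.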

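The tunable consistency levels mentioned in the feature list can be reasoned about with simple arithmetic. The sketch below is not Cassandra driver code; the function names are hypothetical. It relies on two well-known rules: a quorum is a majority of the replicas, and a read is guaranteed to overlap the latest acknowledged write when the number of replicas written plus the number read exceeds the replication factor.

```python
def quorum(replication_factor: int) -> int:
    """Smallest majority of replicas: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def read_overlaps_write(replication_factor: int, write_acks: int, read_acks: int) -> bool:
    """True when every read set must share at least one replica with every write set."""
    return write_acks + read_acks > replication_factor


rf = 3
q = quorum(rf)
print(q)                                # 2
print(read_overlaps_write(rf, q, q))    # True: quorum writes plus quorum reads
print(read_overlaps_write(rf, 1, 1))    # False: fastest, but only eventually consistent
```

Lower acknowledgement counts favour latency and availability; higher counts favour consistency, which is the trade-off the PACELC entry above refers to.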
⭐ Ideal Use Cases

Use Case
High-Write, Low-Read use cases
Historical records
Processing server logs
Social media posts
PDF documents
Emails
Time Series Data (with JSON as value)
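
As a rough illustration of the last use case (time series data with JSON as the value), the sketch below models a common layout in plain Python rather than in Cassandra itself. The partition key is a (source, day) pair, rows inside a partition are kept sorted by timestamp, and each value is a JSON string. The daily bucket and the names `record` and `read_range` are assumptions made for this example, not something the source prescribes.

```python
import json
from bisect import insort
from collections import defaultdict
from datetime import datetime

# Partition key (source, day) -> rows kept sorted by timestamp.
partitions = defaultdict(list)


def record(source: str, ts: datetime, payload: dict) -> None:
    """Append one event to its daily partition, keeping rows in time order."""
    key = (source, ts.strftime("%Y-%m-%d"))
    insort(partitions[key], (ts, json.dumps(payload, sort_keys=True)))


def read_range(source: str, day: str, start: datetime, end: datetime) -> list:
    """Return decoded payloads with start <= timestamp < end from a single partition."""
    return [json.loads(value) for ts, value in partitions[(source, day)] if start <= ts < end]


record("web-1", datetime(2024, 1, 1, 10, 5), {"status": 200})
record("web-1", datetime(2024, 1, 1, 9, 30), {"status": 500})
record("web-1", datetime(2024, 1, 2, 8, 0), {"status": 200})

rows = read_range("web-1", "2024-01-01", datetime(2024, 1, 1, 9, 0), datetime(2024, 1, 1, 10, 0))
print(rows)  # [{'status': 500}]
```

Bucketing by day keeps any single partition from growing without bound, and a time-range read stays inside one partition.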

⭐ Real-world use cases of Cassandra

Personalization at Spotify using Cassandra

Read more

Social Network Design Problem - User Entities like Posts, Comments etc.

Read more

Flight Booking design problem - Search

Read more

Other use cases

History - Built by Facebook

Sample Apps

References