Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion src/.vuepress/sidebar/V1.3.3/en.ts
Original file line number Diff line number Diff line change
Expand Up @@ -37,7 +37,7 @@ export const enSidebar = {
collapsible: true,
prefix: 'Background-knowledge/',
children: [
{ text: 'Cluster-related Concepts', link: 'Cluster-Concept' },
{ text: 'Common Concepts', link: 'Cluster-Concept_apache' },
{ text: 'Data Type', link: 'Data-Type' },
],
},
Expand Down
2 changes: 1 addition & 1 deletion src/.vuepress/sidebar/V1.3.3/zh.ts
Original file line number Diff line number Diff line change
Expand Up @@ -38,7 +38,7 @@ export const zhSidebar = {
collapsible: true,
prefix: 'Background-knowledge/',
children: [
{ text: '集群相关概念', link: 'Cluster-Concept' },
{ text: '常见概念', link: 'Cluster-Concept_apache' },
{ text: '数据类型', link: 'Data-Type' },
],
},
Expand Down
2 changes: 1 addition & 1 deletion src/.vuepress/sidebar_timecho/V1.3.3/en.ts
Original file line number Diff line number Diff line change
Expand Up @@ -37,7 +37,7 @@ export const enSidebar = {
collapsible: true,
prefix: 'Background-knowledge/',
children: [
{ text: 'Cluster-related Concepts', link: 'Cluster-Concept' },
{ text: 'Common Concepts', link: 'Cluster-Concept_timecho' },
{ text: 'Data Type', link: 'Data-Type' },
],
},
Expand Down
2 changes: 1 addition & 1 deletion src/.vuepress/sidebar_timecho/V1.3.3/zh.ts
Original file line number Diff line number Diff line change
Expand Up @@ -38,7 +38,7 @@ export const zhSidebar = {
collapsible: true,
prefix: 'Background-knowledge/',
children: [
{ text: '集群相关概念', link: 'Cluster-Concept' },
{ text: '常见概念', link: 'Cluster-Concept_timecho' },
{ text: '数据类型', link: 'Data-Type' },
],
},
Expand Down
2 changes: 1 addition & 1 deletion src/.vuepress/sidebar_timecho/V2.0.1/zh-Table.ts
Original file line number Diff line number Diff line change
Expand Up @@ -37,7 +37,7 @@ export const zhSidebar = {
collapsible: true,
prefix: 'Background-knowledge/',
children: [
{ text: '集群相关概念', link: 'Cluster-Concept' },
{ text: '常见概念', link: 'Cluster-Concept_timecho' },
{ text: '数据类型', link: 'Data-Type' },
],
},
Expand Down
2 changes: 1 addition & 1 deletion src/.vuepress/sidebar_timecho/V2.0.1/zh-Tree.ts
Original file line number Diff line number Diff line change
Expand Up @@ -38,7 +38,7 @@ export const zhSidebar = {
collapsible: true,
prefix: 'Background-knowledge/',
children: [
{ text: '集群相关概念', link: 'Cluster-Concept' },
{ text: '常见概念', link: 'Cluster-Concept_timecho' },
{ text: '数据类型', link: 'Data-Type' },
],
},
Expand Down
44 changes: 4 additions & 40 deletions src/UserGuide/Master/Tree/Background-knowledge/Cluster-Concept.md
Original file line number Diff line number Diff line change
@@ -1,3 +1,6 @@
---
redirectTo: Cluster-Concept_apache.html
---
<!--

Licensed to the Apache Software Foundation (ASF) under one
Expand All @@ -17,43 +20,4 @@
specific language governing permissions and limitations
under the License.

-->

# Cluster-related Concepts
The figure below illustrates a typical IoTDB 3C3D1A cluster deployment mode, comprising 3 ConfigNodes, 3 DataNodes, and 1 AINode:
<img style="width:100%; max-width:800px; max-height:600px; margin-left:auto; margin-right:auto; display:block;" src="https://alioss.timecho.com/docs/img/Common-Concepts_02.png">

This deployment involves several key concepts that users commonly encounter when working with IoTDB clusters, including:
- **Nodes** (ConfigNode, DataNode, AINode);
- **Slots** (SchemaSlot, DataSlot);
- **Regions** (SchemaRegion, DataRegion);
- **Replica Groups**.

The following sections will provide a detailed introduction to these concepts.

## Nodes

An IoTDB cluster consists of three types of nodes (processes): **ConfigNode** (the main node), **DataNode**, and **AINode**, as detailed below:
- **ConfigNode:** ConfigNodes store cluster configurations, database metadata, the routing information of time series' schema and data. They also monitor cluster nodes and conduct load balancing. All ConfigNodes maintain full mutual backups, as shown in the figure with ConfigNode-1, ConfigNode-2, and ConfigNode-3. ConfigNodes do not directly handle client read or write requests. Instead, they guide the distribution of time series' schema and data within the cluster using a series of [load balancing algorithms](../Technical-Insider/Cluster-data-partitioning.md).
- **DataNode:** DataNodes are responsible for reading and writing time series' schema and data. Each DataNode can accept client read and write requests and provide corresponding services, as illustrated with DataNode-1, DataNode-2, and DataNode-3 in the above figure. When a DataNode receives client requests, it can process them directly or forward them if it has the relevant routing information cached locally. Otherwise, it queries the ConfigNode for routing details and caches the information to improve the efficiency of subsequent requests.
- **AINode:** AINodes interact with ConfigNodes and DataNodes to extend IoTDB's capabilities for data intelligence analysis on time series data. They support registering pre-trained machine learning models from external sources and performing time series analysis tasks using simple SQL statements on specified data. This process integrates model creation, management, and inference within the database engine. Currently, the system provides built-in algorithms or self-training models for common time series analysis scenarios, such as forecasting and anomaly detection.

## Slots

IoTDB divides time series' schema and data into smaller, more manageable units called **slots**. Slots are logical entities, and in an IoTDB cluster, the **SchemaSlots** and **DataSlots** are defined as follows:
- **SchemaSlot:** A SchemaSlot represents a subset of the time series' schema collection. The total number of SchemaSlots is fixed, with a default value of 1000. IoTDB uses a hashing algorithm to evenly distribute all devices across these SchemaSlots.
- **DataSlot:** A DataSlot represents a subset of the time series' data collection. Based on the SchemaSlots, the data for corresponding devices is further divided into DataSlots by a fixed time interval. The default time interval for a DataSlot is 7 days.

## Region

In IoTDB, time series' schema and data are replicated across DataNodes to ensure high availability in the cluster. However, replicating data at the slot level can increase management complexity and reduce write throughput. To address this, IoTDB introduces the concept of **Region**, which groups SchemaSlots and DataSlots into **SchemaRegions** and **DataRegions** respectively. Replication is then performed at the Region level. The definitions of SchemaRegion and DataRegion are as follows:
- **SchemaRegion**: A SchemaRegion is the basic unit for storing and replicating time series' schema. All SchemaSlots in a database are evenly distributed across the database's SchemaRegions. SchemaRegions with the same RegionID are replicas of each other. For example, in the figure above, SchemaRegion-1 has three replicas located on DataNode-1, DataNode-2, and DataNode-3.
- **DataRegion**: A DataRegion is the basic unit for storing and replicating time series' data. All DataSlots in a database are evenly distributed across the database's DataRegions. DataRegions with the same RegionID are replicas of each other. For instance, in the figure above, DataRegion-2 has two replicas located on DataNode-1 and DataNode-2.

## Replica Groups
Region replicas are critical for the fault tolerance of the cluster. Each Region's replicas are organized into **replica groups**, where the replicas are assigned roles as either **leader** or **follower**, working together to provide read and write services. Recommended replica group configurations under different architectures are as follows:

| Category | Parameter | Single-node Recommended Configuration | Distributed Recommended Configuration |
|:------------:|:-----------------------:|:------------------------------------:|:-------------------------------------:|
| Schema | `schema_replication_factor` | 1 | 3 |
| Data | `data_replication_factor` | 1 | 2 |
-->
Original file line number Diff line number Diff line change
@@ -0,0 +1,105 @@
<!--

​ Licensed to the Apache Software Foundation (ASF) under one
​ or more contributor license agreements. See the NOTICE file
​ distributed with this work for additional information
​ regarding copyright ownership. The ASF licenses this file
​ to you under the Apache License, Version 2.0 (the
​ "License"); you may not use this file except in compliance
​ with the License. You may obtain a copy of the License at
​ http://www.apache.org/licenses/LICENSE-2.0
​ Unless required by applicable law or agreed to in writing,
​ software distributed under the License is distributed on an
​ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
​ KIND, either express or implied. See the License for the
​ specific language governing permissions and limitations
​ under the License.

-->

# Common Concepts

## Sql_dialect Related Concepts

| Concept | Meaning |
| ----------------------- | ------------------------------------------------------------ |
| sql_dialect | IoTDB supports two time-series data models (SQL dialects), both managing devices and measurement points. Tree: Manages data in a hierarchical path manner, where one path corresponds to one measurement point of a device. Table: Manages data in a relational table manner, where one table corresponds to a category of devices. |
| Schema | Schema is the data model information of the database, i.e., tree structure or table structure. It includes definitions such as the names and data types of measurement points. |
| Device | Corresponds to a physical device in an actual scenario, usually containing multiple measurement points. |
| Timeseries | Also known as: physical quantity, time series, timeline, point location, semaphore, indicator, measurement value, etc. It is a time series formed by arranging multiple data points in ascending order of timestamps. Usually, a Timeseries represents a collection point that can periodically collect physical quantities of the environment it is in. |
| Encoding | Encoding is a compression technique that represents data in binary form to improve storage efficiency. IoTDB supports various encoding methods for different types of data. For more detailed information, please refer to:[Encoding-and-Compression](../Technical-Insider/Encoding-and-Compression.md) |
| Compression | After data encoding, IoTDB uses compression technology to further compress binary data to enhance storage efficiency. IoTDB supports multiple compression methods. For more detailed information, please refer to: [Encoding-and-Compression](../Technical-Insider/Encoding-and-Compression.md) |

## Distributed Related Concepts

The following figure shows a common IoTDB 3C3D (3 ConfigNodes, 3 DataNodes) cluster deployment pattern:

<img src="https://alioss.timecho.com/docs/img/Cluster-Concept03.png" alt="" style="width: 60%;"/>

IoTDB's cluster includes the following common concepts:

- Nodes(ConfigNode、DataNode、AINode)
- Region(SchemaRegion、DataRegion)
- Replica Groups

The above concepts will be introduced in the following text.

### Nodes

IoTDB cluster includes three types of nodes (processes): ConfigNode (management node), DataNode (data node), and AINode (analysis node), as shown below:

- ConfigNode: Manages cluster node information, configuration information, user permissions, metadata, partition information, etc., and is responsible for the scheduling of distributed operations and load balancing. All ConfigNodes are fully backed up with each other, as shown in ConfigNode-1, ConfigNode-2, and ConfigNode-3 in the figure above.
- DataNode: Serves client requests and is responsible for data storage and computation, as shown in DataNode-1, DataNode-2, and DataNode-3 in the figure above.
- AINode: Provides machine learning capabilities, supports the registration of trained machine learning models, and allows model inference through SQL calls. It has already built-in self-developed time-series large models and common machine learning algorithms (such as prediction and anomaly detection).

### Data Partitioning

In IoTDB, both metadata and data are divided into small partitions, namely Regions, which are managed by various DataNodes in the cluster.

- SchemaRegion: Metadata partition, managing the metadata of a part of devices and measurement points. SchemaRegions with the same RegionID on different DataNodes are mutual replicas, as shown in SchemaRegion-1 in the figure above, which has three replicas located on DataNode-1, DataNode-2, and DataNode-3.
- DataRegion: Data partition, managing the data of a part of devices for a certain period of time. DataRegions with the same RegionID on different DataNodes are mutual replicas, as shown in DataRegion-2 in the figure above, which has two replicas located on DataNode-1 and DataNode-2.
- For specific partitioning algorithms, please refer to: [Data Partitioning](../Technical-Insider/Cluster-data-partitioning.md)

### Replica Groups

The number of replicas for data and metadata can be configured. The recommended configurations for different deployment modes are as follows, where multi-replication can provide high-availability services.

| Category | Parameter | Stand-Alone Recommended Configuration | Cluster Recommended Configuration |
| :----- | :------------------------ | :----------- | :----------- |
| Schema | schema_replication_factor | 1 | 3 |
| Data | data_replication_factor | 1 | 2 |

## Deployment Related Concepts

IoTDB has two operating modes: Stand-Alone mode and Cluster mode.

### Stand-Alone Mode

An IoTDB Stand-Alone instance includes 1 ConfigNode and 1 DataNode, i.e., 1C1D;


- **Features**:Easy for developers to install and deploy, with low deployment and maintenance costs and convenient operations.
- **Applicable Scenarios**:Scenarios with limited resources or low requirements for high availability, such as edge-side servers.
- **Deployment Method**:[Stand-Alone-Deployment](../Deployment-and-Maintenance/Stand-Alone-Deployment_apache.md)

### Cluster Mode

An IoTDB cluster instance consists of 3 ConfigNodes and no less than 3 DataNodes, usually 3 DataNodes, i.e., 3C3D; when some nodes fail, the remaining nodes can still provide services, ensuring the high availability of the database service, and the database performance can be improved with the addition of nodes.

- **Features**:High availability and scalability, and the system performance can be improved by adding DataNodes.
- **Applicable Scenarios**:Enterprise-level application scenarios requiring high availability and reliability.
- **Deployment Method**:[Cluster-Deployment](../Deployment-and-Maintenance/Cluster-Deployment_apache.md)

### Summary of Features

| Dimension | Stand-Alone Mode | Cluster Mode |
| ------------ | ---------------------------- | ------------------------ |
| Applicable Scenarios | Edge-side deployment, scenarios with low requirements for high availability | High-availability business, disaster recovery scenarios, etc. |
| Number of Machines Required | 1 | ≥3 |
| Security and Reliability | Cannot tolerate single-point failures | High, can tolerate single-point failures |
| Scalability | Can expand DataNodes to improve performance | Can expand DataNodes to improve performance |
| Performance | Can be expanded with the number of DataNodes | Can be expanded with the number of DataNodes |

- The deployment steps for single-machine mode and cluster mode are similar (adding ConfigNodes and DataNodes one by one), with only the number of replicas and the minimum number of nodes that can provide services being different.
Loading
Loading