Commit 1e3a1ba

Common concept document for online preparation knowledge (#520)
1 parent 42848c8 commit 1e3a1ba

26 files changed

+1502
-291
lines changed

src/.vuepress/sidebar/V1.3.3/en.ts

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -37,7 +37,7 @@ export const enSidebar = {
3737
collapsible: true,
3838
prefix: 'Background-knowledge/',
3939
children: [
40-
{ text: 'Cluster-related Concepts', link: 'Cluster-Concept' },
40+
{ text: 'Common Concepts', link: 'Cluster-Concept_apache' },
4141
{ text: 'Data Type', link: 'Data-Type' },
4242
],
4343
},

src/.vuepress/sidebar/V1.3.3/zh.ts

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -38,7 +38,7 @@ export const zhSidebar = {
3838
collapsible: true,
3939
prefix: 'Background-knowledge/',
4040
children: [
41-
{ text: '集群相关概念', link: 'Cluster-Concept' },
41+
{ text: '常见概念', link: 'Cluster-Concept_apache' },
4242
{ text: '数据类型', link: 'Data-Type' },
4343
],
4444
},

src/.vuepress/sidebar_timecho/V1.3.3/en.ts

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -37,7 +37,7 @@ export const enSidebar = {
3737
collapsible: true,
3838
prefix: 'Background-knowledge/',
3939
children: [
40-
{ text: 'Cluster-related Concepts', link: 'Cluster-Concept' },
40+
{ text: 'Common Concepts', link: 'Cluster-Concept_timecho' },
4141
{ text: 'Data Type', link: 'Data-Type' },
4242
],
4343
},

src/.vuepress/sidebar_timecho/V1.3.3/zh.ts

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -38,7 +38,7 @@ export const zhSidebar = {
3838
collapsible: true,
3939
prefix: 'Background-knowledge/',
4040
children: [
41-
{ text: '集群相关概念', link: 'Cluster-Concept' },
41+
{ text: '常见概念', link: 'Cluster-Concept_timecho' },
4242
{ text: '数据类型', link: 'Data-Type' },
4343
],
4444
},

src/.vuepress/sidebar_timecho/V2.0.1/zh-Table.ts

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -37,7 +37,7 @@ export const zhSidebar = {
3737
collapsible: true,
3838
prefix: 'Background-knowledge/',
3939
children: [
40-
{ text: '集群相关概念', link: 'Cluster-Concept' },
40+
{ text: '常见概念', link: 'Cluster-Concept_timecho' },
4141
{ text: '数据类型', link: 'Data-Type' },
4242
],
4343
},

src/.vuepress/sidebar_timecho/V2.0.1/zh-Tree.ts

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -38,7 +38,7 @@ export const zhSidebar = {
3838
collapsible: true,
3939
prefix: 'Background-knowledge/',
4040
children: [
41-
{ text: '集群相关概念', link: 'Cluster-Concept' },
41+
{ text: '常见概念', link: 'Cluster-Concept_timecho' },
4242
{ text: '数据类型', link: 'Data-Type' },
4343
],
4444
},
Lines changed: 4 additions & 40 deletions
Original file line numberDiff line numberDiff line change
@@ -1,3 +1,6 @@
1+
---
2+
redirectTo: Cluster-Concept_apache.html
3+
---
14
<!--
25
36
Licensed to the Apache Software Foundation (ASF) under one
@@ -17,43 +20,4 @@
1720
specific language governing permissions and limitations
1821
under the License.
1922
20-
-->
21-
22-
# Cluster-related Concepts
23-
The figure below illustrates a typical IoTDB 3C3D1A cluster deployment mode, comprising 3 ConfigNodes, 3 DataNodes, and 1 AINode:
24-
<img style="width:100%; max-width:800px; max-height:600px; margin-left:auto; margin-right:auto; display:block;" src="https://alioss.timecho.com/docs/img/Common-Concepts_02.png">
25-
26-
This deployment involves several key concepts that users commonly encounter when working with IoTDB clusters, including:
27-
- **Nodes** (ConfigNode, DataNode, AINode);
28-
- **Slots** (SchemaSlot, DataSlot);
29-
- **Regions** (SchemaRegion, DataRegion);
30-
- **Replica Groups**.
31-
32-
The following sections will provide a detailed introduction to these concepts.
33-
34-
## Nodes
35-
36-
An IoTDB cluster consists of three types of nodes (processes): **ConfigNode** (the main node), **DataNode**, and **AINode**, as detailed below:
37-
- **ConfigNode:** ConfigNodes store cluster configurations, database metadata, the routing information of time series' schema and data. They also monitor cluster nodes and conduct load balancing. All ConfigNodes maintain full mutual backups, as shown in the figure with ConfigNode-1, ConfigNode-2, and ConfigNode-3. ConfigNodes do not directly handle client read or write requests. Instead, they guide the distribution of time series' schema and data within the cluster using a series of [load balancing algorithms](../Technical-Insider/Cluster-data-partitioning.md).
38-
- **DataNode:** DataNodes are responsible for reading and writing time series' schema and data. Each DataNode can accept client read and write requests and provide corresponding services, as illustrated with DataNode-1, DataNode-2, and DataNode-3 in the above figure. When a DataNode receives client requests, it can process them directly or forward them if it has the relevant routing information cached locally. Otherwise, it queries the ConfigNode for routing details and caches the information to improve the efficiency of subsequent requests.
39-
- **AINode:** AINodes interact with ConfigNodes and DataNodes to extend IoTDB's capabilities for data intelligence analysis on time series data. They support registering pre-trained machine learning models from external sources and performing time series analysis tasks using simple SQL statements on specified data. This process integrates model creation, management, and inference within the database engine. Currently, the system provides built-in algorithms or self-training models for common time series analysis scenarios, such as forecasting and anomaly detection.
40-
41-
## Slots
42-
43-
IoTDB divides time series' schema and data into smaller, more manageable units called **slots**. Slots are logical entities, and in an IoTDB cluster, the **SchemaSlots** and **DataSlots** are defined as follows:
44-
- **SchemaSlot:** A SchemaSlot represents a subset of the time series' schema collection. The total number of SchemaSlots is fixed, with a default value of 1000. IoTDB uses a hashing algorithm to evenly distribute all devices across these SchemaSlots.
45-
- **DataSlot:** A DataSlot represents a subset of the time series' data collection. Based on the SchemaSlots, the data for corresponding devices is further divided into DataSlots by a fixed time interval. The default time interval for a DataSlot is 7 days.
46-
47-
## Region
48-
49-
In IoTDB, time series' schema and data are replicated across DataNodes to ensure high availability in the cluster. However, replicating data at the slot level can increase management complexity and reduce write throughput. To address this, IoTDB introduces the concept of **Region**, which groups SchemaSlots and DataSlots into **SchemaRegions** and **DataRegions** respectively. Replication is then performed at the Region level. The definitions of SchemaRegion and DataRegion are as follows:
50-
- **SchemaRegion**: A SchemaRegion is the basic unit for storing and replicating time series' schema. All SchemaSlots in a database are evenly distributed across the database's SchemaRegions. SchemaRegions with the same RegionID are replicas of each other. For example, in the figure above, SchemaRegion-1 has three replicas located on DataNode-1, DataNode-2, and DataNode-3.
51-
- **DataRegion**: A DataRegion is the basic unit for storing and replicating time series' data. All DataSlots in a database are evenly distributed across the database's DataRegions. DataRegions with the same RegionID are replicas of each other. For instance, in the figure above, DataRegion-2 has two replicas located on DataNode-1 and DataNode-2.
52-
53-
## Replica Groups
54-
Region replicas are critical for the fault tolerance of the cluster. Each Region's replicas are organized into **replica groups**, where the replicas are assigned roles as either **leader** or **follower**, working together to provide read and write services. Recommended replica group configurations under different architectures are as follows:
55-
56-
| Category | Parameter | Single-node Recommended Configuration | Distributed Recommended Configuration |
57-
|:------------:|:-----------------------:|:------------------------------------:|:-------------------------------------:|
58-
| Schema | `schema_replication_factor` | 1 | 3 |
59-
| Data | `data_replication_factor` | 1 | 2 |
23+
-->
Lines changed: 105 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,105 @@
1+
<!--
2+
3+
​ Licensed to the Apache Software Foundation (ASF) under one
4+
​ or more contributor license agreements. See the NOTICE file
5+
​ distributed with this work for additional information
6+
​ regarding copyright ownership. The ASF licenses this file
7+
​ to you under the Apache License, Version 2.0 (the
8+
​ "License"); you may not use this file except in compliance
9+
​ with the License. You may obtain a copy of the License at
10+
11+
​ http://www.apache.org/licenses/LICENSE-2.0
12+
13+
​ Unless required by applicable law or agreed to in writing,
14+
​ software distributed under the License is distributed on an
15+
​ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
16+
​ KIND, either express or implied. See the License for the
17+
​ specific language governing permissions and limitations
18+
​ under the License.
19+
20+
-->
21+
22+
# Common Concepts
23+
24+
## SQL Dialect Related Concepts
25+
26+
| Concept | Meaning |
27+
| ----------------------- | ------------------------------------------------------------ |
28+
| sql_dialect | IoTDB supports two time-series data models (SQL dialects), both of which manage devices and measurement points. Tree: organizes data as hierarchical paths, where one path corresponds to one measurement point of a device. Table: organizes data as relational tables, where one table corresponds to a category of devices. |
29+
| Schema | The schema is the data model information of the database, i.e., its tree structure or table structure. It includes definitions such as the names and data types of measurement points. |
30+
| Device | Corresponds to a physical device in an actual scenario, usually containing multiple measurement points. |
31+
| Timeseries | Also known as: physical quantity, time series, timeline, point location, semaphore, indicator, measurement value, etc. It is a sequence of data points arranged in ascending order of timestamps. Usually, a Timeseries represents a collection point that periodically samples a physical quantity of its environment. |
32+
| Encoding | Encoding is a compression technique that represents data in binary form to improve storage efficiency. IoTDB supports various encoding methods for different data types. For details, see [Encoding-and-Compression](../Technical-Insider/Encoding-and-Compression.md) |
33+
| Compression | After encoding, IoTDB applies compression to the binary data to further improve storage efficiency. IoTDB supports multiple compression methods. For details, see [Encoding-and-Compression](../Technical-Insider/Encoding-and-Compression.md) |
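
To make the difference between the two models in the `sql_dialect` row more concrete, here is a minimal TypeScript sketch of the two shapes. It is an illustration only, not IoTDB's API: the interfaces, the `toTableRows` helper, and the assumed path layout `root.<table>.<device>.<measurement>` are all hypothetical.

```typescript
// Tree model: one hierarchical path identifies one measurement point of a device.
interface TreePoint {
  path: string;      // hypothetical example: "root.turbine.t7.temperature"
  timestamp: number; // epoch milliseconds
  value: number;
}

// Table model: one table per category of devices; each row carries the
// device identity plus its measurement points as columns.
interface TableRow {
  table: string;
  device: string;
  timestamp: number;
  measurements: Record<string, number>;
}

// Hypothetical helper: regroup tree-style points into table-style rows.
// Assumes every path has exactly the shape root.<table>.<device>.<measurement>.
function toTableRows(points: TreePoint[]): TableRow[] {
  const rowsByKey: Record<string, TableRow> = {};
  for (const p of points) {
    const parts = p.path.split(".");
    if (parts.length !== 4 || parts[0] !== "root") {
      throw new Error("unexpected path shape: " + p.path);
    }
    const table = parts[1];
    const device = parts[2];
    const measurement = parts[3];
    // Points of the same device at the same timestamp share one row.
    const key = table + "/" + device + "/" + p.timestamp;
    if (!rowsByKey[key]) {
      rowsByKey[key] = { table, device, timestamp: p.timestamp, measurements: {} };
    }
    rowsByKey[key].measurements[measurement] = p.value;
  }
  return Object.keys(rowsByKey).map((k) => rowsByKey[k]);
}
```

In this sketch, two tree paths that differ only in their last segment collapse into a single table row with two measurement columns.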
34+
35+
## Distributed Related Concepts
36+
37+
The following figure shows a common IoTDB 3C3D (3 ConfigNodes, 3 DataNodes) cluster deployment pattern:
38+
39+
<img src="https://alioss.timecho.com/docs/img/Cluster-Concept03.png" alt="" style="width: 60%;"/>
40+
41+
An IoTDB cluster involves the following common concepts:
42+
43+
- Nodes (ConfigNode, DataNode, AINode)
44+
- Regions (SchemaRegion, DataRegion)
45+
- Replica Groups
46+
47+
These concepts are introduced in the sections below.
48+
49+
### Nodes
50+
51+
An IoTDB cluster includes three types of nodes (processes): ConfigNode (management node), DataNode (data node), and AINode (analysis node), as described below:
52+
53+
- ConfigNode: Manages cluster node information, configuration information, user permissions, metadata, partition information, etc., and is responsible for scheduling distributed operations and for load balancing. All ConfigNodes are full backups of one another, as shown by ConfigNode-1, ConfigNode-2, and ConfigNode-3 in the figure above.
54+
- DataNode: Serves client requests and is responsible for data storage and computation, as shown by DataNode-1, DataNode-2, and DataNode-3 in the figure above.
55+
- AINode: Provides machine learning capabilities, supports registering pre-trained machine learning models, and allows model inference through SQL calls. It ships with built-in, self-developed large time-series models and common machine learning algorithms (for tasks such as prediction and anomaly detection).
56+
57+
### Data Partitioning
58+
59+
In IoTDB, both metadata and data are divided into small partitions called Regions, which are managed by the DataNodes in the cluster.
60+
61+
- SchemaRegion: A metadata partition that manages the metadata of a subset of devices and measurement points. SchemaRegions with the same RegionID on different DataNodes are replicas of each other. For example, SchemaRegion-1 in the figure above has three replicas, located on DataNode-1, DataNode-2, and DataNode-3.
62+
- DataRegion: A data partition that manages the data of a subset of devices for a certain period of time. DataRegions with the same RegionID on different DataNodes are replicas of each other. For example, DataRegion-2 in the figure above has two replicas, located on DataNode-1 and DataNode-2.
63+
- For specific partitioning algorithms, please refer to: [Data Partitioning](../Technical-Insider/Cluster-data-partitioning.md)
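
As a rough illustration of how "a subset of devices" and "a certain period of time" can be turned into partition keys, the sketch below follows the slot description from the earlier revision of this page (the version removed in this commit): devices are hashed into a fixed number of SchemaSlots (default 1000), and data is further split by a fixed time interval (default 7 days). The hash function and all names here are placeholders, not IoTDB's actual implementation.

```typescript
// Defaults quoted from the earlier revision of this page.
const SCHEMA_SLOT_COUNT = 1000;
const DATA_SLOT_INTERVAL_MS = 7 * 24 * 60 * 60 * 1000; // 7 days

// Placeholder string hash (multiply-by-31 rolling hash, kept as unsigned 32-bit).
// IoTDB's real hashing algorithm may differ.
function hashDevice(device: string): number {
  let h = 0;
  for (let i = 0; i < device.length; i++) {
    h = (h * 31 + device.charCodeAt(i)) >>> 0;
  }
  return h;
}

// A device always maps to the same schema slot in [0, SCHEMA_SLOT_COUNT).
function schemaSlotOf(device: string): number {
  return hashDevice(device) % SCHEMA_SLOT_COUNT;
}

// A data slot is identified by the device's schema slot plus a time bucket.
function dataSlotOf(device: string, timestampMs: number): { schemaSlot: number; timeIndex: number } {
  return {
    schemaSlot: schemaSlotOf(device),
    timeIndex: Math.floor(timestampMs / DATA_SLOT_INTERVAL_MS),
  };
}
```

Under these assumptions, two writes for the same device fall into the same data slot only if their timestamps lie in the same 7-day bucket. How slots are then grouped into Regions is covered by the linked Data Partitioning page.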
64+
65+
### Replica Groups
66+
67+
The number of replicas for data and metadata is configurable. The recommended settings for each deployment mode are listed below; using multiple replicas provides high availability.
68+
69+
| Category | Parameter | Stand-Alone Recommended Configuration | Cluster Recommended Configuration |
70+
| :----- | :------------------------ | :----------- | :----------- |
71+
| Schema | schema_replication_factor | 1 | 3 |
72+
| Data | data_replication_factor | 1 | 2 |
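
The recommendations in the table can be captured as a small lookup. The sketch below is hypothetical tooling, not part of IoTDB: only the two parameter names and their values come from the table, while the `key=value` rendering assumes a properties-style configuration file, which this page does not name.

```typescript
type DeploymentMode = "stand-alone" | "cluster";

interface ReplicationFactors {
  schema_replication_factor: number;
  data_replication_factor: number;
}

// Values copied from the recommendation table.
const RECOMMENDED_REPLICATION: Record<DeploymentMode, ReplicationFactors> = {
  "stand-alone": { schema_replication_factor: 1, data_replication_factor: 1 },
  "cluster": { schema_replication_factor: 3, data_replication_factor: 2 },
};

// Render the recommended factors as key=value lines (assumed file format).
function renderReplicationProperties(mode: DeploymentMode): string {
  const f = RECOMMENDED_REPLICATION[mode];
  return [
    "schema_replication_factor=" + f.schema_replication_factor,
    "data_replication_factor=" + f.data_replication_factor,
  ].join("\n");
}
```

For example, `renderReplicationProperties("cluster")` yields the two lines `schema_replication_factor=3` and `data_replication_factor=2`.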
73+
74+
## Deployment Related Concepts
75+
76+
IoTDB has two operating modes: Stand-Alone mode and Cluster mode.
77+
78+
### Stand-Alone Mode
79+
80+
An IoTDB Stand-Alone instance includes 1 ConfigNode and 1 DataNode, i.e., 1C1D.
81+
82+
83+
- **Features**: Easy for developers to install and deploy, with low deployment and maintenance costs and convenient operations.
84+
- **Applicable Scenarios**: Scenarios with limited resources or low requirements for high availability, such as edge-side servers.
85+
- **Deployment Method**: [Stand-Alone-Deployment](../Deployment-and-Maintenance/Stand-Alone-Deployment_apache.md)
86+
87+
### Cluster Mode
88+
89+
An IoTDB cluster instance consists of 3 ConfigNodes and no fewer than 3 DataNodes, usually exactly 3 DataNodes, i.e., 3C3D. When some nodes fail, the remaining nodes can still provide services, which keeps the database service highly available, and database performance can be improved by adding nodes.
90+
91+
- **Features**: High availability and scalability; system performance can be improved by adding DataNodes.
92+
- **Applicable Scenarios**: Enterprise-level application scenarios requiring high availability and reliability.
93+
- **Deployment Method**: [Cluster-Deployment](../Deployment-and-Maintenance/Cluster-Deployment_apache.md)
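
The shorthand used on this page (1C1D, 3C3D, and 3C3D1A in the earlier revision) counts ConfigNodes, DataNodes, and optionally AINodes. As a minimal sketch, the helper below parses that shorthand and labels it with the two modes as described above (1C1D for Stand-Alone; 3 ConfigNodes with at least 3 DataNodes for Cluster). The function and type names are hypothetical and exist only for illustration.

```typescript
interface Topology {
  configNodes: number;
  dataNodes: number;
  aiNodes: number;
}

type ModeLabel = "stand-alone" | "cluster" | "other";

// Parse shorthand such as "1C1D", "3C3D", or "3C3D1A" (the AINode part is optional).
function parseTopology(shorthand: string): Topology {
  const match = /^(\d+)C(\d+)D(?:(\d+)A)?$/.exec(shorthand);
  if (!match) {
    throw new Error("not a topology shorthand: " + shorthand);
  }
  return {
    configNodes: parseInt(match[1], 10),
    dataNodes: parseInt(match[2], 10),
    aiNodes: match[3] ? parseInt(match[3], 10) : 0,
  };
}

// Label a topology using only the definitions given on this page.
function modeOf(t: Topology): ModeLabel {
  if (t.configNodes === 1 && t.dataNodes === 1) {
    return "stand-alone";
  }
  if (t.configNodes === 3 && t.dataNodes >= 3) {
    return "cluster";
  }
  return "other"; // combinations this page does not describe
}
```

The `"other"` label is deliberate: the page defines only the two modes, so the sketch does not guess how other node counts should be classified.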
94+
95+
### Summary of Features
96+
97+
| Dimension | Stand-Alone Mode | Cluster Mode |
98+
| ------------ | ---------------------------- | ------------------------ |
99+
| Applicable Scenarios | Edge-side deployment, scenarios with low requirements for high availability | High-availability business, disaster recovery scenarios, etc. |
100+
| Number of Machines Required | 1 | ≥3 |
101+
| Security and Reliability | Cannot tolerate single-point failures | High, can tolerate single-point failures |
102+
| Scalability | Can expand DataNodes to improve performance | Can expand DataNodes to improve performance |
103+
| Performance | Can be expanded with the number of DataNodes | Can be expanded with the number of DataNodes |
104+
105+
- The deployment steps for Stand-Alone mode and Cluster mode are similar (ConfigNodes and DataNodes are added one by one); the two modes differ only in the number of replicas and in the minimum number of nodes required to provide services.
